Horde3D • View topic

All times are UTC + 1 hour

NOS PACK

fullmetalcoder

Post subject: Re: NOS PACK

Posted: 13.04.2009, 19:46

Joined: 11.04.2009, 08:42
Posts: 14
Location: France

marciano wrote:

I quickly fixed the problem before I saw that you had come up with the same solution

But the particles are also working in the svn version...

Did you already commit that to the public (or community, if I understood correctly) SVN? I have just tried svn up'ing and it tells me my working copy is up to date...

marciano

Post subject: Re: NOS PACK

Posted: 13.04.2009, 20:17

Engine Developer

Joined: 10.09.2006, 15:52
Posts: 1217

It's only in the public svn at the moment...

fullmetalcoder

Post subject: Re: NOS PACK

Posted: 13.04.2009, 20:32

Joined: 11.04.2009, 08:42
Posts: 14
Location: France

My bad, I thought community == public, but that does not appear to be the case. Time to switch repo.

Volker

Post subject: Re: NOS PACK

Posted: 14.04.2009, 07:24

Tool Developer

Joined: 13.11.2007, 11:07
Posts: 1150
Location: Germany

They are both public, but only the community branch offers write access to some members of the community. The SourceForge SVN allows only read access.

fullmetalcoder

Post subject: Re: NOS PACK

Posted: 14.04.2009, 12:21

Joined: 11.04.2009, 08:42
Posts: 14
Location: France

Ok repository updated. Alignment now works but there are still issues.

The next one to tackle is fixing the side effects that show up when using composite data in Vec3f.

Code:

union {
   struct { float x, y, z, w; };
   float c[4]; // should be __m128 c; in the SIMD version, but this does not affect the issues
};

Using this as the data fields for Vec3f leads to significant rendering artifacts in the Terrain sample and the Knight sample (the latter showing just a black screen), but Chicago looks OK. I have not investigated much, but given the visuals in the Terrain sample this looks like a culling/clipping issue.

Edit: it looks like the issue does not come from the composite data fields but from enabling optimizations. At first I thought it came from an interaction between the release build, the compound fields, and alignment, but it turns out release mode alone is enough to break everything.

Better yet: removing alignment makes the Terrain sample behave even worse (it tends to segfault on start with a vanilla release build).
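For readers following along, here is a minimal sketch of the kind of type being discussed: four floats exposed both by name and as an array, with 16-byte alignment requested explicitly. `Vec4Data`, `isAligned16` and `checkedPtr` are hypothetical names for illustration, not Horde3D's actual code. It assumes a C++11 compiler (`alignas`) and relies on the anonymous struct inside the union, which is a GCC/Clang/MSVC extension rather than standard C++. It does not diagnose the crash above; it only shows how size and alignment assumptions can be checked.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the vector data described above.
struct alignas(16) Vec4Data {
    union {
        struct { float x, y, z, w; };  // anonymous struct: compiler extension
        float c[4];                    // would be __m128 in a SIMD build
    };
};

// Compile-time checks: the layout must be exactly four packed floats,
// and the type must carry the alignment that aligned SSE loads need.
static_assert(sizeof(Vec4Data) == 16, "expected four packed floats");
static_assert(alignof(Vec4Data) == 16, "expected 16-byte alignment");

// Returns true if the pointer is a multiple of 16 bytes.
inline bool isAligned16(const void* p) {
    return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}

// Debug-build guard to call before handing a pointer to aligned SSE code.
inline const Vec4Data* checkedPtr(const Vec4Data* p) {
    assert(isAligned16(p));
    return p;
}
```

Note that `alignas` covers stack and static objects, but before C++17 a plain `new` is not required to honor over-alignment, and on 32-bit Linux the default heap alignment is typically 8 bytes. Checking heap-allocated instances with something like `isAligned16` is therefore worthwhile when chasing crashes of this kind.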

System : Linux 32 bit, GCC 4.3.3, CMake 2.6-patch 3
Horde3D from SF.net SVN trunk (rev 137)

Backtrace obtained when running Terrain sample in release mode :

Program received signal SIGSEGV, Segmentation fault.
0xb7ed7fad in Horde3DTerrain::TerrainNode::buildBlockInfo () from /home/prog/horde3d-trunk/build/Horde3D/Source/Horde3DEngine/libHorde3D.so
Current language: auto; currently asm
(gdb) bt
#0 0xb7ed7fad in Horde3DTerrain::TerrainNode::buildBlockInfo () from /home/prog/horde3d-trunk/build/Horde3D/Source/Horde3DEngine/libHorde3D.so
#1 0xb7ed8355 in Horde3DTerrain::TerrainNode::createBlockTree () from /home/prog/horde3d-trunk/build/Horde3D/Source/Horde3DEngine/libHorde3D.so
#2 0xb7ed86a4 in Horde3DTerrain::TerrainNode::TerrainNode () from /home/prog/horde3d-trunk/build/Horde3D/Source/Horde3DEngine/libHorde3D.so
#3 0xb7ed8762 in Horde3DTerrain::TerrainNode::factoryFunc () from /home/prog/horde3d-trunk/build/Horde3D/Source/Horde3DEngine/libHorde3D.so
#4 0xb7eb022b in SceneManager::parseNode () from /home/prog/horde3d-trunk/build/Horde3D/Source/Horde3DEngine/libHorde3D.so
#5 0xb7eb037e in SceneManager::addNodes () from /home/prog/horde3d-trunk/build/Horde3D/Source/Horde3DEngine/libHorde3D.so
#6 0xb7e75900 in addNodes () from /home/prog/horde3d-trunk/build/Horde3D/Source/Horde3DEngine/libHorde3D.so
#7 0x0804c26d in Application::init ()
#8 0x0804cd88 in main ()

marciano

Post subject: Re: NOS PACK

Posted: 18.04.2009, 08:25

Engine Developer

Joined: 10.09.2006, 15:52
Posts: 1217

Hmm, I have no real idea what could cause the heap corruption on your system. We don't seem to have that problem on Windows.

AcidFaucet

Post subject: Re: NOS PACK

Posted: 19.04.2009, 05:35

Joined: 19.11.2007, 19:35
Posts: 218

Why didn't you try something like this for mult?

It's mult and transpose that are "slow" in a matrix.

Code:

Mat4x4 operator*(const Mat4x4 &o)
   {
      Mat4x4 res;
#if ASM_MATH
      __asm
      {
      mov      eax,[o]
      mov      ecx,[this]
      movups   xmm4,[eax]            // Other.M[0][0-3]
      movups   xmm5,[eax+16]         // Other.M[1][0-3]
      movups   xmm6,[eax+32]         // Other.M[2][0-3]
      movups   xmm7,[eax+48]         // Other.M[3][0-3]
      lea      eax,[res]
      // Begin first row of result.
      movss   xmm0,[ecx]            // M[0][0] 
      shufps   xmm0,xmm0,0
      mulps   xmm0,xmm4
      movss   xmm1,[ecx+4]         // M[0][1]
      shufps   xmm1,xmm1,0
      mulps   xmm1,xmm5
      movss   xmm2,[ecx+8]         // M[0][2]
      shufps   xmm2,xmm2,0
      mulps   xmm2,xmm6
      addps   xmm1,xmm0            // First row done with xmm0
      movss   xmm3,[ecx+12]         // M[0][3]
      shufps   xmm3,xmm3,0
      mulps   xmm3,xmm7
      // Begin second row of result.
      movss   xmm0,[ecx+16]         // M[1][0] 
      shufps   xmm0,xmm0,0
      mulps   xmm0,xmm4
      addps   xmm3,xmm2            // First row done with xmm2
      movss   xmm2,[ecx+20]         // M[1][1]
      shufps   xmm2,xmm2,0
      mulps   xmm2,xmm5
      addps   xmm3,xmm1            // First row done with xmm1
      movss   xmm1,[ecx+24]         // M[1][2]
      shufps   xmm1,xmm1,0
      mulps   xmm1,xmm6
      movups   [eax],xmm3            // Store Result.M[0][0-3]
      // Done computing first row.
      addps   xmm2,xmm0            // Second row done with xmm0
      movss   xmm3,[ecx+28]         // M[1][3]
      shufps   xmm3,xmm3,0
      mulps   xmm3,xmm7
      // Begin third row of result.
      movss   xmm0,[ecx+32]         // M[2][0] 
      shufps   xmm0,xmm0,0
      mulps   xmm0,xmm4
      addps   xmm3,xmm1            // Second row done with xmm1
      movss   xmm1,[ecx+36]         // M[2][1]
      shufps   xmm1,xmm1,0
      mulps   xmm1,xmm5
      addps   xmm3,xmm2            // Second row done with xmm2
      movss   xmm2,[ecx+40]         // M[2][2]
      shufps   xmm2,xmm2,0
      mulps   xmm2,xmm6
      movups   [eax+16],xmm3         // Store Result.M[1][0-3]
      // Done computing second row.
      addps   xmm1,xmm0            // Third row done with xmm0
      movss   xmm3,[ecx+44]         // M[2][3]
      shufps   xmm3,xmm3,0
      mulps   xmm3,xmm7
      // Begin fourth row of result.
      movss   xmm0,[ecx+48]         // M[3][0]
      shufps   xmm0,xmm0,0
      mulps   xmm0,xmm4
      addps   xmm3,xmm2            // Third row done with xmm2
      movss   xmm2,[ecx+52]         // M[3][1]
      shufps   xmm2,xmm2,0
      mulps   xmm2,xmm5
      addps   xmm3,xmm1            // Third row done with xmm1
      movss   xmm1,[ecx+56]         // M[3][2]
      shufps   xmm1,xmm1,0
      mulps   xmm1,xmm6
      movups   [eax+32],xmm3         // Store Result.M[2][0-3]
      // Done computing third row.
      addps   xmm2,xmm0
      movss   xmm3,[ecx+60]         // M[3][3]
      shufps   xmm3,xmm3,0
      mulps   xmm3,xmm7
      // stall
      addps   xmm3,xmm1
      // stall
      addps   xmm3,xmm2
      movups   [eax+48],xmm3         // Store Result.M[3][0-3]
      // Done computing fourth row.
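      }
      return res;
#else
      // Scalar fallback, column-major storage.
      // Note: as posted, the asm path above indexes M[row][col] as row-major,
      // so it matches this fallback only with the two operands swapped.
      Mat4x4 r;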
      r.m[0] = m[0] * o.m[0] + m[4] * o.m[1] + m[8] * o.m[2] + m[12] * o.m[3];
      r.m[1] = m[1] * o.m[0] + m[5] * o.m[1] + m[9] * o.m[2] + m[13] * o.m[3];
      r.m[2] = m[2] * o.m[0] + m[6] * o.m[1] + m[10] * o.m[2] + m[14] * o.m[3];
      r.m[3] = m[3] * o.m[0] + m[7] * o.m[1] + m[11] * o.m[2] + m[15] * o.m[3];
      r.m[4] = m[0] * o.m[4] + m[4] * o.m[5] + m[8] * o.m[6] + m[12] * o.m[7];
      r.m[5] = m[1] * o.m[4] + m[5] * o.m[5] + m[9] * o.m[6] + m[13] * o.m[7];
      r.m[6] = m[2] * o.m[4] + m[6] * o.m[5] + m[10] * o.m[6] + m[14] * o.m[7];
      r.m[7] = m[3] * o.m[4] + m[7] * o.m[5] + m[11] * o.m[6] + m[15] * o.m[7];
      r.m[8] = m[0] * o.m[8] + m[4] * o.m[9] + m[8] * o.m[10] + m[12] * o.m[11];
      r.m[9] = m[1] * o.m[8] + m[5] * o.m[9] + m[9] * o.m[10] + m[13] * o.m[11];
      r.m[10] = m[2] * o.m[8] + m[6] * o.m[9] + m[10] * o.m[10] + m[14] * o.m[11];
      r.m[11] = m[3] * o.m[8] + m[7] * o.m[9] + m[11] * o.m[10] + m[15] * o.m[11];
      r.m[12] = m[0] * o.m[12] + m[4] * o.m[13] + m[8] * o.m[14] + m[12] * o.m[15];
      r.m[13] = m[1] * o.m[12] + m[5] * o.m[13] + m[9] * o.m[14] + m[13] * o.m[15];
      r.m[14] = m[2] * o.m[12] + m[6] * o.m[13] + m[10] * o.m[14] + m[14] * o.m[15];
      r.m[15] = m[3] * o.m[12] + m[7] * o.m[13] + m[11] * o.m[14] + m[15] * o.m[15];
      return r;
#endif
   }
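Since transpose is mentioned above as the other "slow" operation, here is a small sketch of a 4x4 transpose using the `_MM_TRANSPOSE4_PS` macro from `<xmmintrin.h>`. `Mat4` and `transposeSSE` are hypothetical names for illustration, not Horde3D's API, and unaligned loads and stores are used so the sketch does not depend on the alignment questions raised earlier in the thread.

```cpp
#include <xmmintrin.h>  // SSE intrinsics and _MM_TRANSPOSE4_PS

// Hypothetical minimal matrix type: 16 contiguous floats.
struct Mat4 {
    float m[16];
};

// Returns the transpose of a: element (i, j) becomes element (j, i).
inline Mat4 transposeSSE(const Mat4& a) {
    // Load the four groups of four floats.
    __m128 r0 = _mm_loadu_ps(a.m + 0);
    __m128 r1 = _mm_loadu_ps(a.m + 4);
    __m128 r2 = _mm_loadu_ps(a.m + 8);
    __m128 r3 = _mm_loadu_ps(a.m + 12);

    // In-register 4x4 transpose; the macro rewrites r0..r3 in place.
    _MM_TRANSPOSE4_PS(r0, r1, r2, r3);

    Mat4 t;
    _mm_storeu_ps(t.m + 0, r0);
    _mm_storeu_ps(t.m + 4, r1);
    _mm_storeu_ps(t.m + 8, r2);
    _mm_storeu_ps(t.m + 12, r3);
    return t;
}
```

Whether this beats what the compiler generates for a plain scalar transpose would have to be measured on the target machine.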

marciano

Post subject: Re: NOS PACK

Posted: 19.04.2009, 08:54

Engine Developer

Joined: 10.09.2006, 15:52
Posts: 1217

Inline assembler is no longer supported by MSVC for x64 builds, so you should always prefer intrinsics, which are also a bit more readable than pure asm.
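As an illustration of the intrinsics route, here is a rough sketch of a 4x4 multiply for column-major storage, matching the index pattern of the scalar fallback posted above. `Mat4`, `mulScalar` and `mulSSE` are hypothetical names, not Horde3D's actual implementation. Unaligned loads and stores (`_mm_loadu_ps`, `_mm_storeu_ps`) are used so that no alignment guarantee is assumed.

```cpp
#include <xmmintrin.h>  // SSE intrinsics (_mm_loadu_ps, _mm_mul_ps, ...)

// Hypothetical minimal matrix type: 16 floats in column-major order.
struct Mat4 {
    float m[16];
};

// Scalar reference: r = a * b for column-major storage.
inline Mat4 mulScalar(const Mat4& a, const Mat4& b) {
    Mat4 r;
    for (int col = 0; col < 4; ++col) {
        for (int row = 0; row < 4; ++row) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k) {
                sum += a.m[4 * k + row] * b.m[4 * col + k];
            }
            r.m[4 * col + row] = sum;
        }
    }
    return r;
}

// SSE version: each result column is a linear combination of the
// columns of a, weighted by the entries of the matching column of b.
inline Mat4 mulSSE(const Mat4& a, const Mat4& b) {
    const __m128 a0 = _mm_loadu_ps(a.m + 0);
    const __m128 a1 = _mm_loadu_ps(a.m + 4);
    const __m128 a2 = _mm_loadu_ps(a.m + 8);
    const __m128 a3 = _mm_loadu_ps(a.m + 12);

    Mat4 r;
    for (int col = 0; col < 4; ++col) {
        const float* bc = b.m + 4 * col;  // column 'col' of b
        __m128 acc = _mm_mul_ps(a0, _mm_set1_ps(bc[0]));
        acc = _mm_add_ps(acc, _mm_mul_ps(a1, _mm_set1_ps(bc[1])));
        acc = _mm_add_ps(acc, _mm_mul_ps(a2, _mm_set1_ps(bc[2])));
        acc = _mm_add_ps(acc, _mm_mul_ps(a3, _mm_set1_ps(bc[3])));
        _mm_storeu_ps(r.m + 4 * col, acc);
    }
    return r;
}
```

The same source should build with MSVC (32-bit and x64), GCC and Clang on x86, and register allocation and instruction scheduling are left to the compiler. If the data is known to be 16-byte aligned, `_mm_load_ps` and `_mm_store_ps` could be substituted.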

Siavash

Post subject: Re: NOS PACK

Posted: 19.04.2009, 15:07

Joined: 21.08.2008, 11:44
Posts: 354

With inline asm you have to write separate, compatible code for each compiler (MSVC, GCC, ...), which bloats the code. With intrinsics the compiler takes care of everything [register allocation, better optimization, ...].

Code:

//asm code for msvc and intel c compiler :
dosomething dest, src

//asm code for gcc :
dosomething src, dest
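To make the operand-order point concrete, here is a small sketch that adds two floats three ways: once with an intrinsic, and once with compiler-specific inline asm for GCC/Clang (AT&T order, source first) and for 32-bit MSVC (Intel order, destination first). The function names are hypothetical. The MSVC branch is an untested assumption about how that compiler's `__asm` block would be written, and the GCC branch assumes the default AT&T dialect (no `-masm=intel`).

```cpp
#include <xmmintrin.h>  // SSE intrinsics

// Portable: one source line for MSVC, GCC and Clang on x86.
inline float addIntrinsic(float a, float b) {
    return _mm_cvtss_f32(_mm_add_ss(_mm_set_ss(a), _mm_set_ss(b)));
}

// Compiler-specific: every toolchain needs its own branch.
inline float addInlineAsm(float a, float b) {
#if defined(__GNUC__) && defined(__x86_64__)
    // GCC/Clang extended asm, AT&T order: "addss src, dest".
    __asm__("addss %1, %0" : "+x"(a) : "x"(b));
    return a;
#elif defined(_MSC_VER) && defined(_M_IX86)
    // MSVC 32-bit inline asm, Intel order: "addss dest, src".
    __asm {
        movss xmm0, dword ptr [a]
        addss xmm0, dword ptr [b]
        movss dword ptr [a], xmm0
    }
    return a;
#else
    // No inline asm available here (e.g. MSVC x64): plain C++ fallback.
    return a + b;
#endif
}
```

The intrinsic version needs no branches at all, which is the maintenance argument made above.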
