Horde3D

Next-Generation Graphics Engine

All times are UTC + 1 hour




 Post subject: Re: NOS PACK
PostPosted: 13.04.2009, 19:46 
Offline

Joined: 11.04.2009, 08:42
Posts: 14
Location: France
marciano wrote:
I have quickly fixed the problem before I saw now that you came up with the same solution :)
But the particles are also working in the svn version...

Did you commit that to the public (or community, if I understood correctly) SVN already? I just ran svn up and it tells me my working copy is up to date...


 Post subject: Re: NOS PACK
PostPosted: 13.04.2009, 20:17 
Offline
Engine Developer

Joined: 10.09.2006, 15:52
Posts: 1217
It's only in the public svn at the moment...


 Post subject: Re: NOS PACK
PostPosted: 13.04.2009, 20:32 
Offline

Joined: 11.04.2009, 08:42
Posts: 14
Location: France
My bad, I thought community == public, but that does not appear to be the case. Time to switch repo.


 Post subject: Re: NOS PACK
PostPosted: 14.04.2009, 07:24 
Offline
Tool Developer

Joined: 13.11.2007, 11:07
Posts: 1150
Location: Germany
They are both public :) , but only the community branch offers write access to some members of the community. The SourceForge SVN allows read access only.


 Post subject: Re: NOS PACK
PostPosted: 14.04.2009, 12:21 
Offline

Joined: 11.04.2009, 08:42
Posts: 14
Location: France
Ok repository updated. Alignment now works but there are still issues.

The next one to tackle is the side effects that appear when using composite data in Vec3f.

Code:
union {
   struct { float x, y, z, w; };
   float c[4]; // should be __m128 c; in the SIMD version, but this does not affect the issues
};


Using this as the data fields of Vec3f leads to severe rendering artifacts in the Terrain sample and the Knight sample (the latter just shows a black screen), but Chicago looks OK. I have not investigated much, but given the visuals in the Terrain sample this looks like a culling/clipping issue.
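
For reference, here is a minimal sketch of a padded, 16-byte aligned vector that avoids the anonymous struct inside a union (a compiler extension in C++). The name Vec3fPadded, the accessor functions and isAligned16 are hypothetical and not part of Horde3D. It assumes a C++11 compiler for alignas; with an older GCC the equivalent would be __attribute__((aligned(16))).

```cpp
#include <cstdint>

// Hypothetical 16-byte vector, NOT Horde3D's actual Vec3f.
struct alignas(16) Vec3fPadded
{
    float c[4];  // x, y, z plus one padding lane (w)

    // Named access without relying on anonymous structs in a union.
    float &x() { return c[0]; }
    float &y() { return c[1]; }
    float &z() { return c[2]; }
};

// Layout checks done at compile time.
static_assert(sizeof(Vec3fPadded) == 16, "four floats, no extra padding");
static_assert(alignof(Vec3fPadded) == 16, "suitable for aligned SSE loads");

// True if p can be used with aligned 16-byte loads and stores.
inline bool isAligned16(const void *p)
{
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

Note that such a type is 16 bytes instead of 12, so any code that treats an array of vectors as tightly packed float triples would read the wrong elements. Whether that plays any role in the artifacts described here is not established.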

Edit: it looks like the issue does not come from the composite data fields but from enabling optimizations... At first I thought it came from an interaction between the release build, the compound fields and alignment, but it turns out release mode alone is able to break everything.

Better yet: it looks like removing alignment makes the Terrain sample behave even worse (it tends to segfault on start with a vanilla release build).

System : Linux 32 bit, GCC 4.3.3, CMake 2.6-patch 3
Horde3D from SF.net SVN trunk (rev 137)

Backtrace obtained when running Terrain sample in release mode :

Program received signal SIGSEGV, Segmentation fault.
0xb7ed7fad in Horde3DTerrain::TerrainNode::buildBlockInfo () from /home/prog/horde3d-trunk/build/Horde3D/Source/Horde3DEngine/libHorde3D.so
Current language: auto; currently asm
(gdb) bt
#0 0xb7ed7fad in Horde3DTerrain::TerrainNode::buildBlockInfo ()
from /home/prog/horde3d-trunk/build/Horde3D/Source/Horde3DEngine/libHorde3D.so
#1 0xb7ed8355 in Horde3DTerrain::TerrainNode::createBlockTree ()
from /home/prog/horde3d-trunk/build/Horde3D/Source/Horde3DEngine/libHorde3D.so
#2 0xb7ed86a4 in Horde3DTerrain::TerrainNode::TerrainNode () from /home/prog/horde3d-trunk/build/Horde3D/Source/Horde3DEngine/libHorde3D.so
#3 0xb7ed8762 in Horde3DTerrain::TerrainNode::factoryFunc () from /home/prog/horde3d-trunk/build/Horde3D/Source/Horde3DEngine/libHorde3D.so
#4 0xb7eb022b in SceneManager::parseNode () from /home/prog/horde3d-trunk/build/Horde3D/Source/Horde3DEngine/libHorde3D.so
#5 0xb7eb037e in SceneManager::addNodes () from /home/prog/horde3d-trunk/build/Horde3D/Source/Horde3DEngine/libHorde3D.so
#6 0xb7e75900 in addNodes () from /home/prog/horde3d-trunk/build/Horde3D/Source/Horde3DEngine/libHorde3D.so
#7 0x0804c26d in Application::init ()
#8 0x0804cd88 in main ()


 Post subject: Re: NOS PACK
PostPosted: 18.04.2009, 08:25 
Offline
Engine Developer

Joined: 10.09.2006, 15:52
Posts: 1217
Hmm, I have no real idea what could cause the heap corruption on your system. We don't seem to have that problem on Windows.


 Post subject: Re: NOS PACK
PostPosted: 19.04.2009, 05:35 
Offline

Joined: 19.11.2007, 19:35
Posts: 218
Why didn't you try something like this for mult?

It's mult and transpose that are "slow" in a matrix.

Code:
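Mat4x4 operator*(const Mat4x4 &o) const
   {
      Mat4x4 res;
#if ASM_MATH
      // MSVC x86 inline asm. The scalar fallback below computes (*this) * o
      // for column-major storage, so the four 16-byte columns are loaded from
      // *this and the broadcast scalars are read from o. In the comments that
      // follow, M[i][j] therefore denotes o.m[4*i+j] and each "row" of the
      // result is one 16-byte group res.m[4*i .. 4*i+3].
      __asm
      {
      mov      eax,[this]
      mov      ecx,[o]
      movups   xmm4,[eax]            // this->m[0-3]
      movups   xmm5,[eax+16]         // this->m[4-7]
      movups   xmm6,[eax+32]         // this->m[8-11]
      movups   xmm7,[eax+48]         // this->m[12-15]
      lea      eax,[res]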
      // Begin first row of result.
      movss   xmm0,[ecx]            // M[0][0]
      shufps   xmm0,xmm0,0
      mulps   xmm0,xmm4
      movss   xmm1,[ecx+4]         // M[0][1]
      shufps   xmm1,xmm1,0
      mulps   xmm1,xmm5
      movss   xmm2,[ecx+8]         // M[0][2]
      shufps   xmm2,xmm2,0
      mulps   xmm2,xmm6
      addps   xmm1,xmm0            // First row done with xmm0
      movss   xmm3,[ecx+12]         // M[0][3]
      shufps   xmm3,xmm3,0
      mulps   xmm3,xmm7
      // Begin second row of result.
      movss   xmm0,[ecx+16]         // M[1][0]
      shufps   xmm0,xmm0,0
      mulps   xmm0,xmm4
      addps   xmm3,xmm2            // First row done with xmm2
      movss   xmm2,[ecx+20]         // M[1][1]
      shufps   xmm2,xmm2,0
      mulps   xmm2,xmm5
      addps   xmm3,xmm1            // First row done with xmm1
      movss   xmm1,[ecx+24]         // M[1][2]
      shufps   xmm1,xmm1,0
      mulps   xmm1,xmm6
      movups   [eax],xmm3            // Store Result.M[0][0-3]
      // Done computing first row.
      addps   xmm2,xmm0            // Second row done with xmm0
      movss   xmm3,[ecx+28]         // M[1][3]
      shufps   xmm3,xmm3,0
      mulps   xmm3,xmm7
      // Begin third row of result.
      movss   xmm0,[ecx+32]         // M[2][0]
      shufps   xmm0,xmm0,0
      mulps   xmm0,xmm4
      addps   xmm3,xmm1            // Second row done with xmm1
      movss   xmm1,[ecx+36]         // M[2][1]
      shufps   xmm1,xmm1,0
      mulps   xmm1,xmm5
      addps   xmm3,xmm2            // Second row done with xmm2
      movss   xmm2,[ecx+40]         // M[2][2]
      shufps   xmm2,xmm2,0
      mulps   xmm2,xmm6
      movups   [eax+16],xmm3         // Store Result.M[1][0-3]
      // Done computing second row.
      addps   xmm1,xmm0            // Third row done with xmm0
      movss   xmm3,[ecx+44]         // M[2][3]
      shufps   xmm3,xmm3,0
      mulps   xmm3,xmm7
      // Begin fourth row of result.
      movss   xmm0,[ecx+48]         // M[3][0]
      shufps   xmm0,xmm0,0
      mulps   xmm0,xmm4
      addps   xmm3,xmm2            // Third row done with xmm2
      movss   xmm2,[ecx+52]         // M[3][1]
      shufps   xmm2,xmm2,0
      mulps   xmm2,xmm5
      addps   xmm3,xmm1            // Third row done with xmm1
      movss   xmm1,[ecx+56]         // M[3][2]
      shufps   xmm1,xmm1,0
      mulps   xmm1,xmm6
      movups   [eax+32],xmm3         // Store Result.M[2][0-3]
      // Done computing third row.
      addps   xmm2,xmm0
      movss   xmm3,[ecx+60]         // M[3][3]
      shufps   xmm3,xmm3,0
      mulps   xmm3,xmm7
      // stall
      addps   xmm3,xmm1
      // stall
      addps   xmm3,xmm2
      movups   [eax+48],xmm3         // Store Result.M[3][0-3]
      // Done computing fourth row.
      }
      return res;
#else
      Mat4x4 r;
      r.m[0] = m[0] * o.m[0] + m[4] * o.m[1] + m[8] * o.m[2] + m[12] * o.m[3];
      r.m[1] = m[1] * o.m[0] + m[5] * o.m[1] + m[9] * o.m[2] + m[13] * o.m[3];
      r.m[2] = m[2] * o.m[0] + m[6] * o.m[1] + m[10] * o.m[2] + m[14] * o.m[3];
      r.m[3] = m[3] * o.m[0] + m[7] * o.m[1] + m[11] * o.m[2] + m[15] * o.m[3];
      r.m[4] = m[0] * o.m[4] + m[4] * o.m[5] + m[8] * o.m[6] + m[12] * o.m[7];
      r.m[5] = m[1] * o.m[4] + m[5] * o.m[5] + m[9] * o.m[6] + m[13] * o.m[7];
      r.m[6] = m[2] * o.m[4] + m[6] * o.m[5] + m[10] * o.m[6] + m[14] * o.m[7];
      r.m[7] = m[3] * o.m[4] + m[7] * o.m[5] + m[11] * o.m[6] + m[15] * o.m[7];
      r.m[8] = m[0] * o.m[8] + m[4] * o.m[9] + m[8] * o.m[10] + m[12] * o.m[11];
      r.m[9] = m[1] * o.m[8] + m[5] * o.m[9] + m[9] * o.m[10] + m[13] * o.m[11];
      r.m[10] = m[2] * o.m[8] + m[6] * o.m[9] + m[10] * o.m[10] + m[14] * o.m[11];
      r.m[11] = m[3] * o.m[8] + m[7] * o.m[9] + m[11] * o.m[10] + m[15] * o.m[11];
      r.m[12] = m[0] * o.m[12] + m[4] * o.m[13] + m[8] * o.m[14] + m[12] * o.m[15];
      r.m[13] = m[1] * o.m[12] + m[5] * o.m[13] + m[9] * o.m[14] + m[13] * o.m[15];
      r.m[14] = m[2] * o.m[12] + m[6] * o.m[13] + m[10] * o.m[14] + m[14] * o.m[15];
      r.m[15] = m[3] * o.m[12] + m[7] * o.m[13] + m[11] * o.m[14] + m[15] * o.m[15];
      return r;
#endif
   }


 Post subject: Re: NOS PACK
PostPosted: 19.04.2009, 08:54 
Offline
Engine Developer

Joined: 10.09.2006, 15:52
Posts: 1217
MSVC no longer supports inline assembler for x64 builds, so you should always prefer intrinsics, which are also a bit more readable than pure asm.
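
To illustrate the intrinsics route, here is a minimal sketch of a column-major 4x4 multiply using SSE intrinsics. It is not code from Horde3D: the Mat4x4 struct and the names mulScalar and mulSSE are assumptions made for this example, and unaligned loads and stores are used so that it does not depend on how the matrix is aligned.

```cpp
#include <xmmintrin.h>  // SSE intrinsics (MSVC, GCC and Clang on x86)

// Hypothetical stand-in for the matrix type: 16 floats, column-major.
struct Mat4x4
{
    float m[16];
};

// Scalar reference: r = a * b, with element (row, col) stored at m[4 * col + row].
inline Mat4x4 mulScalar(const Mat4x4 &a, const Mat4x4 &b)
{
    Mat4x4 r;
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
        {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k)
                sum += a.m[4 * k + row] * b.m[4 * col + k];
            r.m[4 * col + row] = sum;
        }
    return r;
}

// SSE version: each result column is a linear combination of the columns
// of a, weighted by the four entries of the matching column of b.
inline Mat4x4 mulSSE(const Mat4x4 &a, const Mat4x4 &b)
{
    const __m128 a0 = _mm_loadu_ps(a.m + 0);
    const __m128 a1 = _mm_loadu_ps(a.m + 4);
    const __m128 a2 = _mm_loadu_ps(a.m + 8);
    const __m128 a3 = _mm_loadu_ps(a.m + 12);

    Mat4x4 r;
    for (int col = 0; col < 4; ++col)
    {
        const float *bc = b.m + 4 * col;
        __m128 acc = _mm_mul_ps(a0, _mm_set1_ps(bc[0]));
        acc = _mm_add_ps(acc, _mm_mul_ps(a1, _mm_set1_ps(bc[1])));
        acc = _mm_add_ps(acc, _mm_mul_ps(a2, _mm_set1_ps(bc[2])));
        acc = _mm_add_ps(acc, _mm_mul_ps(a3, _mm_set1_ps(bc[3])));
        _mm_storeu_ps(r.m + 4 * col, acc);
    }
    return r;
}
```

The same source builds for 32-bit and 64-bit targets, and the compiler chooses the registers and schedules the instructions. If the matrix type is guaranteed to be 16-byte aligned, _mm_load_ps and _mm_store_ps could replace the unaligned variants.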


 Post subject: Re: NOS PACK
PostPosted: 19.04.2009, 15:07 
Offline

Joined: 21.08.2008, 11:44
Posts: 354
With inline asm you have to write and maintain a separate version for each compiler (MSVC, GCC, ...), which bloats the code. With intrinsics the compiler takes care of everything: register allocation, better code optimization and so on. Even the operand order differs between compilers:
Code:
// asm code for MSVC and the Intel C compiler (Intel syntax, destination first):
dosomething dest, src

// asm code for GCC (AT&T syntax by default, source first):
dosomething src, dest
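
As a concrete case of the two dialects, here is a small sketch that adds two floats with the SSE addss instruction. The function name addViaAsm is hypothetical. The GCC branch uses extended asm with the default AT&T operand order, the MSVC branch only applies to 32-bit builds and has not been tested here, and every other configuration falls back to plain C++.

```cpp
// Adds two floats, once per inline asm dialect, to show the duplication.
inline float addViaAsm(float a, float b)
{
#if defined(__GNUC__) && defined(__SSE__)
    // GCC/Clang extended asm, AT&T syntax: source first, destination last.
    // %0 is a (read and written), %1 is b; both live in SSE registers.
    __asm__("addss %1, %0" : "+x"(a) : "x"(b));
    return a;
#elif defined(_MSC_VER) && defined(_M_IX86)
    // MSVC inline asm, Intel syntax: destination first. 32-bit builds only.
    __asm
    {
        movss xmm0, a
        addss xmm0, b
        movss a, xmm0
    }
    return a;
#else
    // No inline asm available (for example MSVC x64): plain C++.
    return a + b;
#endif
}
```

Three code paths are needed for a single addition, whereas the intrinsic _mm_add_ss would cover all of these compilers with one line.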

