Horde3D

Next-Generation Graphics Engine

All times are UTC + 1 hour




 Post subject: Optimizing h3dRender
PostPosted: 02.11.2012, 12:01 

Joined: 26.08.2008, 18:48
Posts: 120
In the past few weeks I tried to optimize the h3dRender() call. We have only static (non-animated) objects, so h3d spent most of its time here.

In the end we got a 1.4x - 1.9x speedup on multiple configs (PC with ES2 emulator, iPhone 3GS and iPhone 4) and scenes (500-3000 nodes in scene, 100-300 render calls, 8-100 different materials used).

Notable changes:
- our pipeline has 3 DrawGeometry commands (opaque, alphatest, blend).
instead of culling for each DrawGeometry command, the new version culls the objects against the main frustum only once, then uses this list for collecting/sorting nodes for the render queue.
also separated the collecting of lights and models. each type has a separate array now in the spatial graph.
culling of objects is done only once, at the start of h3dRender, for the main frustum and the forward rendered lights. (shadow culling still uses the "old" interface)

- sorting collected objects uses uint instead of float. (I benchmarked float vs. uint sort: on Intel the uint compare is only 10% faster, but on ARMv7 it is 4x faster)
I'm using the fact that the bit patterns of positive IEEE 754 floats increase monotonically with their value, so I'm simply reinterpreting positive values as uint. (quite ugly but it works)

- horde3d uses string compares for selecting contexts, uniforms and samplers. I replaced the strings with an integer id. (benchmarked int (string ID) vs. string compare: the int compare is 8 times faster than the string compare, but this change didn't give as significant a speedup as I expected)

- modified setMaterialRec to set materials from an array instead of recursively. As we have lighting parameters in a stage material, the trunk version set uniforms and samplers twice: once for the mesh material and then for the stage's material. Now each parameter is set only once and parameters are resolved hierarchically: if a param is found in the mesh's material the search stops, otherwise it continues in the light, stage, ... materials. (this was a significant speedup, it seems setting uniforms is costly)

- RendererBase::applyRenderStates()
added a static const modifier to uint32 oglBlendFuncs[8] and uint32 oglDepthFuncs[8] to avoid filling the arrays on each call.
made the texture/sampler update more fine-grained: instead of always setting every stage, it now sets only the dirty texture/sampler stages (using a per-stage flag).
(this was a significant speedup too, but probably depends on the drivers)
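
The uint sort key from the sorting item in the list above can be sketched roughly as follows. This is only an illustration under assumptions, not the code from the modified engine: `QueueItem`, `floatToSortKey` and `sortFrontToBack` are hypothetical names, and the trick is valid only for non-negative, non-NaN values (note that -0.0f has the sign bit set and would sort last).

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit IEEE 754 float");

// Hypothetical queue entry; Horde3D's real render queue item looks different.
struct QueueItem {
    std::uint32_t sortKey;  // bit pattern of a non-negative float distance
    int nodeId;
};

// Reinterpret a non-negative, non-NaN float as an unsigned integer.
// For such values the IEEE 754 bit patterns are ordered like the floats.
inline std::uint32_t floatToSortKey(float distance) {
    std::uint32_t bits;
    std::memcpy(&bits, &distance, sizeof bits);  // well-defined, unlike a pointer cast
    return bits;
}

// Sort by integer key only; no float compare happens inside the sort.
inline void sortFrontToBack(std::vector<QueueItem>& items) {
    std::sort(items.begin(), items.end(),
              [](const QueueItem& a, const QueueItem& b) { return a.sortKey < b.sortKey; });
}
```

The key is computed once per object when it is collected, so the comparison inside the sort is a plain integer compare.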

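The per-stage dirty flag from the applyRenderStates item above might look roughly like the sketch below. It is an assumption-laden illustration, not the actual RendererBase code: `SamplerCache`, `kMaxStages` and the `bind` callback are hypothetical, and in a real renderer the callback would issue the actual texture binding calls.

```cpp
#include <cstdint>

constexpr unsigned kMaxStages = 16;  // assumed number of texture stages

// Hypothetical cache that remembers which sampler stages were touched.
struct SamplerCache {
    std::uint32_t pendingTex[kMaxStages] = {};
    std::uint32_t dirtyMask = 0;  // bit i set: stage i must be re-bound

    // Record a texture change; unchanged stages stay clean.
    void setTexture(unsigned stage, std::uint32_t tex) {
        if (stage >= kMaxStages || pendingTex[stage] == tex) return;
        pendingTex[stage] = tex;
        dirtyMask |= (1u << stage);
    }

    // Call bind(stage, tex) for dirty stages only; returns how many were bound.
    template <class BindFn>
    unsigned flush(BindFn bind) {
        unsigned count = 0;
        for (unsigned s = 0; s < kMaxStages; ++s) {
            if (dirtyMask & (1u << s)) {
                bind(s, pendingTex[s]);
                ++count;
            }
        }
        dirtyMask = 0;
        return count;
    }
};
```

How much this saves depends on how expensive redundant binds are in the driver, which matches the caveat in the post.
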
Some of the changes could be part of the official version (int sort, material setup and applyRenderStates); the other changes cause some incompatibility: only 32 lights are supported now (because the culling result is stored in a uint32 per object), and material class support is removed (StringID compare changes).
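
To illustrate where the 32 light limit comes from, here is a minimal sketch of a per-object culling result with one bit per light. The layout is an assumption: `CullResult`, `markLight`, `isLitBy` and `collectForLight` are hypothetical names, and the modified engine may pack the main frustum result differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr unsigned kMaxLights = 32;  // one bit per light in a 32-bit mask

// Hypothetical per-object result of the single culling pass per frame.
struct CullResult {
    bool inMainFrustum = false;
    std::uint32_t lightMask = 0;  // bit i set: forward light i reaches the object
};

// Returns false when the light index does not fit into the mask.
inline bool markLight(CullResult& r, unsigned lightIndex) {
    if (lightIndex >= kMaxLights) return false;
    r.lightMask |= (std::uint32_t{1} << lightIndex);
    return true;
}

inline bool isLitBy(const CullResult& r, unsigned lightIndex) {
    return lightIndex < kMaxLights && (r.lightMask & (std::uint32_t{1} << lightIndex)) != 0;
}

// Collect visible objects for one light without culling again.
inline std::vector<std::size_t> collectForLight(const std::vector<CullResult>& results,
                                                unsigned lightIndex) {
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < results.size(); ++i) {
        if (results[i].inMainFrustum && isLitBy(results[i], lightIndex)) out.push_back(i);
    }
    return out;
}
```

A wider mask (for example a 64-bit integer) would raise the limit at the cost of more memory per object.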

If you have any questions please feel free to ask.
The implementation is still somewhat hacky (but tested, and I have found no problems so far). If anyone is interested I will share it.

Now I will work on a D3D11 RendererBase. Hope you will find my experiments useful.


 Post subject: Re: Optimizing h3dRender
PostPosted: 05.11.2012, 21:09 

Joined: 23.07.2009, 21:03
Posts: 51
Location: Germany
Interesting read, as usual :)
Sadly I think most of my app's computational costs originate from different areas than yours, but I'd love to try out your modified setMaterialRec function and report my results, if you are able to share it.

Anyways, thanks for posting your findings.

attila wrote:
- modified setMaterialRec to set materials from an array instead of recursively. As we have lighting parameters in a stage material, the trunk version set uniforms and samplers twice: once for the mesh material and then for the stage's material. Now each parameter is set only once and parameters are resolved hierarchically: if a param is found in the mesh's material the search stops, otherwise it continues in the light, stage, ... materials. (this was a significant speedup, it seems setting uniforms is costly)


 Post subject: Re: Optimizing h3dRender
PostPosted: 06.11.2012, 18:21 

Joined: 26.08.2008, 18:48
Posts: 120
Here is my SetMaterial function. I quickly modified it by removing my other changes; tell me if you have any problems.
It uses the new RendererBase interface for render states (from svn).

Hope it is also faster in your project (or at least not slower).
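
For readers without the attachment, the hierarchical lookup described in the first post can be sketched roughly as below. This is not the attached SetMaterial.cpp: `Material`, `findUniform` and `applyUniforms` are hypothetical, the values are plain floats, and strings are used as keys only for brevity (the modified engine compares integer ids instead).

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a material: just named float uniforms.
struct Material {
    std::unordered_map<std::string, float> uniforms;
};

// Walk the chain in priority order (e.g. mesh, light, stage) and stop at the first hit.
inline const float* findUniform(const std::vector<const Material*>& chain,
                                const std::string& name) {
    for (const Material* m : chain) {
        if (!m) continue;
        auto it = m->uniforms.find(name);
        if (it != m->uniforms.end()) return &it->second;
    }
    return nullptr;
}

// Resolve every uniform the shader needs exactly once; returns the number of set calls.
inline int applyUniforms(const std::vector<const Material*>& chain,
                         const std::vector<std::string>& shaderUniforms,
                         std::unordered_map<std::string, float>& gpuState) {
    int setCalls = 0;
    for (const std::string& name : shaderUniforms) {
        if (const float* value = findUniform(chain, name)) {
            gpuState[name] = *value;  // stands in for one uniform upload
            ++setCalls;
        }
    }
    return setCalls;
}
```

Because the search stops at the first material that defines a parameter, a value in the mesh material overrides the same parameter in the light or stage material, and no uniform is uploaded twice.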


Attachments:
SetMaterial.cpp [5.98 KiB]
Downloaded 1042 times
 Post subject: Re: Optimizing h3dRender
PostPosted: 06.11.2012, 20:06 

Joined: 23.07.2009, 21:03
Posts: 51
Location: Germany
attila wrote:
Here is my SetMaterial function. I quickly modified it by removing my other changes; tell me if you have any problems.
It uses the new RendererBase interface for render states (from svn).

Hope it is also faster in your project (or at least not slower).


Thanks a lot. I copied it into my project and did a quick test.
Seems like the computational time stays the same for me. Kind of expected since my application is mostly drawcall and gpu bound, so the material impact seems negligible in my app.

But at least not slower, as you said :D


 Post subject: Re: Optimizing h3dRender
PostPosted: 08.11.2012, 16:51 

Joined: 26.08.2008, 18:48
Posts: 120
Roland wrote:
Seems like the computational time stays the same for me. Kind of expected since my application is mostly drawcall and gpu bound, so the material impact seems negligible in my app.


I hoped for at least some speedup, but as you said it can only help if your app is CPU limited and you have a great number of material changes (it also depends on your driver: I have tested this on the ES2 emulator and mobile ES2, maybe the desktop OGL drivers are better optimised). Also, horde3d does a great job at batch processing nodes if you are rendering the same model/material multiple times, so this could make the speed change negligible even for a large number of render calls.

