Horde3D

Next-Generation Graphics Engine

All times are UTC + 1 hour




PostPosted: 05.01.2009, 17:03 

Joined: 19.11.2007, 19:35
Posts: 218
drawGeometry() calls drawRenderables(), which will call any registered rendering function to draw renderable nodes. So if you have no emitters and drawParticles() is still being called, there might be an issue with the registered (*NodeRenderFunc)() pointers.

I'd try turning:
Code:
Modules::sceneMan().registerType( SceneNodeTypes::Emitter, "Emitter",
         EmitterNode::parsingFunc, EmitterNode::factoryFunc, Renderer::drawParticles );


into a version with 0x0 in place of Renderer::drawParticles, and then seeing if they go away. That would indicate a problem either in the scenegraph (bogus emitters) or in the registration system.
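A rough sketch of what passing a null render function means for dispatch is shown below. It is a toy registry and not Horde3D's code: TypeRegistry, NodeRegEntry, and the counters are hypothetical names, and the assumption is that the draw loop skips any type whose render function pointer is null.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins for illustration; not the Horde3D API.
using RenderFunc = void (*)();

struct NodeRegEntry {
    std::string typeString;
    RenderFunc  renderFunc;  // may be null: this type is never drawn
};

class TypeRegistry {
public:
    void registerType(int type, const std::string& name, RenderFunc rf) {
        assert(!name.empty());
        _entries[type] = NodeRegEntry{name, rf};
    }

    // Calls every non-null render function once and returns how many ran.
    int drawRenderables() const {
        int called = 0;
        for (const auto& kv : _entries) {
            if (kv.second.renderFunc == nullptr) continue;  // skipped entirely
            kv.second.renderFunc();
            ++called;
        }
        return called;
    }

private:
    std::map<int, NodeRegEntry> _entries;
};

// Counters so the effect of registration is observable.
static int g_modelDraws = 0;
static int g_particleDraws = 0;
void drawModels()    { ++g_modelDraws; }
void drawParticles() { ++g_particleDraws; }
```

Registering the emitter type with a null pointer instead of drawParticles means the particle path never runs in this sketch, which is the experiment suggested above.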


PostPosted: 05.01.2009, 19:31 

Joined: 19.11.2007, 19:35
Posts: 218
The global renderableQueue is good for sorting purposes, but bad for rendering.

In the case of 70 models and 4 particles, it's necessary to iterate through those 70 models when drawing those 4 particle emitters. Yes, the loop just calls continue if the type is bogus, but it's still a bit unnecessary. Perhaps, when a node type is registered, it could also register a queue for its type. Then the render functions only need to access the queue of their own renderable type, which would be cleared/rebuilt by updateQueues. The renderable queue update has the advantage that it could exploit temporal coherence; the rendering loop, however, can't.
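As a minimal sketch of the per-type queue idea (assuming a fixed set of type ids; Node, NodeType, and RenderQueues are made-up names, not the engine's classes), it could look something like this:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node and type ids, for illustration only.
enum NodeType { Model = 0, Emitter = 1, NumTypes = 2 };

struct Node {
    NodeType type;
    bool     visible;
};

class RenderQueues {
public:
    // Clear and rebuild every per-type queue from the visible nodes.
    void updateQueues(const std::vector<Node>& scene) {
        for (auto& q : _queues) q.clear();
        for (const Node& n : scene)
            if (n.visible) _queues[n.type].push_back(&n);
    }

    const std::vector<const Node*>& queue(NodeType t) const {
        assert(t < NumTypes);
        return _queues[t];
    }

private:
    std::vector<const Node*> _queues[NumTypes];
};

// A render function now walks only the nodes of its own type.
std::size_t drawParticles(const RenderQueues& rq) {
    std::size_t drawn = 0;
    for (const Node* n : rq.queue(Emitter)) {
        (void)n;  // real drawing would happen here
        ++drawn;
    }
    return drawn;
}
```

With 70 models and 4 emitters in the scene, drawParticles in this sketch visits 4 entries rather than 74. The pointers stored in the queues are only valid until the scene vector is modified, so the queues have to be rebuilt whenever it changes.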


PostPosted: 06.01.2009, 00:13 
Engine Developer

Joined: 10.09.2006, 15:52
Posts: 1217
swiftcoder wrote:
abhinavgolas wrote:
As I said, it's just one model, and one anim file, and a simple flat hierarchy of all nodes being children of the Root Node. The issue is that if I'm going to try 100k, I'm going to have to leave my machine for about half an hour before the rendering actually begins, which seems very steep.
With only a single model, that startup time sucks - I have noticed this as well with one of my projects, it seems likely that Horde's current resource system is doing way too much heavy lifting.


I think one reason for the long loading time is that the scene/resource system is trying to find free handles by iterating over the whole scene/resource list. But this can very easily be optimized by adding a freelist where free slots are stored.
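The free-list idea can be sketched roughly as follows. This is only an illustration: HandleTable and its members are hypothetical names, and the 1-based handle convention is an assumption rather than a description of the actual scene or resource manager code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical handle table; the engine's own containers may differ.
template <typename T>
class HandleTable {
public:
    // Returns a 1-based handle, reusing a freed slot when one is available.
    // No scan over the whole list is needed in either branch.
    int add(T* obj) {
        assert(obj != nullptr);
        if (!_freeList.empty()) {
            int slot = _freeList.back();
            _freeList.pop_back();
            _slots[slot] = obj;
            return slot + 1;
        }
        _slots.push_back(obj);
        return static_cast<int>(_slots.size());
    }

    // Marks the slot as empty and remembers it for reuse.
    void remove(int handle) {
        assert(handle >= 1 && handle <= static_cast<int>(_slots.size()));
        assert(_slots[handle - 1] != nullptr);
        _slots[handle - 1] = nullptr;
        _freeList.push_back(handle - 1);
    }

    // Returns nullptr for out-of-range or freed handles.
    T* resolve(int handle) const {
        if (handle < 1 || handle > static_cast<int>(_slots.size())) return nullptr;
        return _slots[handle - 1];
    }

private:
    std::vector<T*>  _slots;     // slot i holds the object for handle i + 1
    std::vector<int> _freeList;  // indices of slots that are currently empty
};
```

Adding an entry is then constant time (amortized) instead of a linear search for an empty slot, which matters when nodes are added in bulk.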

@abhinavgolas: Did you profile the loading process? If yes, where is most of the time spent?

@swiftcoder: Is there anything specific you have in mind? I don't think the resource system is overly slow (although there is room for optimization). Texture loading depends on the image type (dds will be even faster); geometry and animations should be quick as plain binary files. We have a few XML files, but they should not slow things down much since our parser is quite efficient (and the files are small). And if we need faster XML, we can write a non-extractive parser that should be a lot faster still by minimizing memory allocations. Shader loading/compilation can be relatively slow of course, but that depends on the GL driver and there is nothing that can be done about it in GL (D3D would have binary shaders).


PostPosted: 06.01.2009, 00:18 
Engine Developer

Joined: 10.09.2006, 15:52
Posts: 1217
AcidFaucet wrote:
The global renderableQueue is good for sorting purposes, but bad for rendering.

In the case of 70 models and 4 particles, it's necessary to iterate through those 70 models when drawing those 4 particle emitters. Yes, the loop just calls continue if the type is bogus, but it's still a bit unnecessary. Perhaps, when a node type is registered, it could also register a queue for its type. Then the render functions only need to access the queue of their own renderable type, which would be cleared/rebuilt by updateQueues. The renderable queue update has the advantage that it could exploit temporal coherence; the rendering loop, however, can't.


Yeah, that can be improved, although different lists are more difficult with the scene node extensions. Also the dynamic shader context and material linking is not the most efficient solution. But sometimes (only sometimes ;) ) that's the price for flexibility. Let's see what can be done there...


PostPosted: 06.01.2009, 01:11 

Joined: 19.11.2007, 19:35
Posts: 218
DOH! :!:
That's why drawParticles comes up: the renderfunc iteration invokes every available render function, so there's an iteration for every single renderable type going across every single node. Didn't catch that until now.

Each species of renderable class could include a static vector< SceneNode* >, and SceneNode could define a "virtual void addToQueue()" and a "virtual void clearQueue()" that the renderable classes override to add themselves to their static queue. That could work for lights too, and avoids any registration funky-doo-wops.


PostPosted: 06.01.2009, 02:45 

Joined: 19.11.2007, 19:35
Posts: 218
Well, I whipped up a quickie.

SceneNode has acquired a virtual void addToQueue(SceneNode *tNode) {}. LightNode, ModelNode, TerrainNode, and EmitterNode each contain a static vector<SceneNode*> renderQueue and implement static void clearQueue(), void addToQueue(SceneNode *tNode), and std::vector<SceneNode*>& getQueue().

Node type registration has been expanded to take a queue clear function and a vector fetch function (needed for sorting, lighting, and stuff).
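For readers without the attachment, the shape of that change might look roughly like the sketch below. It is reconstructed from the description above and is not the contents of the patch: QueueFuncs and this updateQueues are hypothetical, only two node classes are shown, and the real classes carry far more state.

```cpp
#include <cassert>
#include <vector>

struct SceneNode {
    virtual ~SceneNode() {}
    // Default for non-renderable nodes: do nothing.
    virtual void addToQueue(SceneNode* /*tNode*/) {}
};

struct ModelNode : SceneNode {
    static std::vector<SceneNode*> renderQueue;
    static void clearQueue() { renderQueue.clear(); }
    static std::vector<SceneNode*>& getQueue() { return renderQueue; }
    void addToQueue(SceneNode* tNode) override { renderQueue.push_back(tNode); }
};
std::vector<SceneNode*> ModelNode::renderQueue;

struct EmitterNode : SceneNode {
    static std::vector<SceneNode*> renderQueue;
    static void clearQueue() { renderQueue.clear(); }
    static std::vector<SceneNode*>& getQueue() { return renderQueue; }
    void addToQueue(SceneNode* tNode) override { renderQueue.push_back(tNode); }
};
std::vector<SceneNode*> EmitterNode::renderQueue;

// Hypothetical record of what type registration additionally stores:
// a queue clear function and a vector fetch function.
struct QueueFuncs {
    void (*clear)();
    std::vector<SceneNode*>& (*fetch)();
};

// Clear every registered queue, then let each visible node sort itself
// into the queue of its own class through the virtual call.
void updateQueues(const std::vector<SceneNode*>& visible,
                  const std::vector<QueueFuncs>& types) {
    for (const QueueFuncs& t : types) t.clear();
    for (SceneNode* n : visible) {
        assert(n != nullptr);
        n->addToQueue(n);
    }
}
```

The fetch function gives the renderer a uniform way to reach each type's queue for sorting or lighting without knowing the concrete node class.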

I've made all the necessary modifications to the renderer to properly shadow and handle light updates (** To my knowledge **); someone else will probably want to check through it though. There are some map iterations in places such as sorting, bbox.makeJoin(), etc. that may be able to be handled differently. Some of those may be able to generate a concatenated vector instead of iterating through each set.

Attachment:
RenderPatch.zip [47.58 KiB]


PostPosted: 06.01.2009, 10:46 
Tool Developer

Joined: 13.11.2007, 11:07
Posts: 1150
Location: Germany
Thanks for this patch. Without looking at the details of your modification, does it make a difference in performance? Doing a quick test on my notebook, I couldn't tell very well. The knight sample seems to be a bit faster (~5fps). The chicago sample with software skinning enabled may be about ~0.5fps faster. With hardware skinning it doesn't make any difference (most likely because this is limited by my GF7700). No difference for the terrain sample either. But I guess I don't have the right setup for testing the possible improvements of this patch.


PostPosted: 06.01.2009, 13:23 

Joined: 22.11.2007, 17:05
Posts: 707
Location: Boston, MA
Volker wrote:
Thanks for this patch. Without looking at the details of your modification, does it make a difference in performance? Doing a quick test on my notebook, I couldn't tell very well. The knight sample seems to be a bit faster (~5fps). The chicago sample with software skinning enabled may be about ~0.5fps faster. With hardware skinning it doesn't make any difference (most likely because this is limited by my GF7700). No difference for the terrain sample either. But I guess I don't have the right setup for testing the possible improvements of this patch.
Ja, it slightly hampers testing when none of us even have the possibility of being CPU bound :wink:

_________________
Tristam MacDonald - [swiftcoding]


PostPosted: 06.01.2009, 13:44 

Joined: 19.11.2007, 19:35
Posts: 218
I've noticed it in my minimum frame rates: from the 110 region to 121 at the lowest in Chicago, and the Knight never dropping below 600 where before it would be in the 500 range. Since it's basically an optimization for diversity in scenes, that's where it's going to help.


PostPosted: 06.01.2009, 19:33 

Joined: 02.01.2009, 20:45
Posts: 11
swiftcoder wrote:
I'd try turning:
Code:
Modules::sceneMan().registerType( SceneNodeTypes::Emitter, "Emitter",
         EmitterNode::parsingFunc, EmitterNode::factoryFunc, Renderer::drawParticles );

into a version with 0x0 in place of Renderer::drawParticles, and then seeing if they go away. That would indicate a problem either in the scenegraph (bogus emitters) or in the registration system.

@swiftcoder : Thanks for the tip, setting the function pointer to NULL added another bump to the performance

marciano wrote:
@abhinavgolas: Did you profile the loading process? If yes, where is most of the time spent?

@marciano : The majority, in fact all, of the time is being spent in addNodes. Since I have optimizations turned on, it's not showing me the time spent in parseNodes, though I checked that even one iteration of addNodes makes multiple recursive calls to parseNodes, which I assume is one for each of the model joints. You mentioned that addNodes looks for the next available handle; would it not be straightforward to add a variable containing the next available free handle?

@AcidFaucet : I'm not able to use your patch; I get compilation errors in Renderer.cpp about an undefined getRenderableQueue(). I tried fixing it myself but couldn't, some help please :)


PostPosted: 06.01.2009, 20:21 

Joined: 19.11.2007, 19:35
Posts: 218
In which function(s) (or line #s) does it come up? I may be a chunk out of date with SVN.

Regardless of how many characters I put in the chicago demo, the init time isn't perceivably different. What I do notice is that the crowd simulation eats the framerate. Disabling it once the characters get into position brings framerates to an acceptable level during runtime. That makes sense, since with large numbers of characters everything goes crazy. Are you sure that your crowd sim init code isn't a processor-eating monster?


Last edited by AcidFaucet on 06.01.2009, 20:50, edited 1 time in total.

PostPosted: 06.01.2009, 20:40 

Joined: 02.01.2009, 20:45
Posts: 11
AcidFaucet wrote:
In which function(s) (or line #s) does it come up? I may be a chunk out of date with SVN.

I get an error : class "SceneManager" has no member "getRenderableQueue"
on lines 696,698,805,807,1429,1431,1439,1660,1662,1664,1895,1897,1899
and the same for getLightQueue on lines 1095,1098,1200,1203,1441,1482,1484
all in egRenderer.cpp

I also get an error that MeshNode has no member "setVariable" in egMain.cpp

I'm using version 1.0.0Beta2, so dunno whether things have changed in the SVN checkout. Should I be using that?


PostPosted: 06.01.2009, 22:38 

Joined: 19.11.2007, 19:35
Posts: 218
Whoops, my zip is bad: it should have contained egRenderer.cpp instead of the egRendererBase.cpp that was stuck in it.

The MeshNode problem is that I've got some other code that enables per-object-instance variables.
You can just comment out the sections producing those errors for now. I'll get a clean .patch file against the current version out shortly; in the meantime, here's egRenderer.cpp.
Attachment:
egRenderer.zip [14.55 KiB]


PostPosted: 06.01.2009, 23:14 

Joined: 02.01.2009, 20:45
Posts: 11
The patch is performing worse than swiftcoder's fix of commenting out particles. I agree that won't work in the general case, but in my case where I only have agents, I'm getting a performance of ~35-40fps using swiftcoder's method, versus ~25-30 fps with the patch, all for 1k agents.


PostPosted: 07.01.2009, 01:32 

Joined: 19.11.2007, 19:35
Posts: 218
Whatever you're profiling with, will it let you dump a text file, take a 'screenshot', or whatever with the information in it, including profiling of your application and its routines?

NOTE: The patch wasn't meant to solve your problem. While looking into your problem I noticed a hang-up for large-scale projects, and spent an hour quickly prototyping a possible solution. It makes a difference in scenes with 10 emitters, 100 models, and a terrain, but it can definitely be improved and may be the wrong approach.

