Horde3D

Next-Generation Graphics Engine
 Post subject: Performance
PostPosted: 15.01.2009, 19:33 

Joined: 14.01.2009, 18:27
Posts: 3
Hello,

As this is my first post, maybe it would be best to introduce myself. I'm Alexander... a fellow programmer who just found out about Horde a few days ago! I haven't had much time to actually look at the engine, so I've just been combing through the forums, and my first impression is pretty good! You guys have a very nice community here with a clear / reasonable set of goals, which is great!

However, it seems that you have a lot of performance problems with Horde. One would guess that a project that claims to be a next-generation graphics engine with a focus on crowd rendering would be very fast and lightweight, but that doesn't seem to be the case.

What I'm interested in is the performance problems you guys have with the engine... from what I can tell from the forums, you have problems with memory consumption, memory access, utilising multiple cores, making good use of the vector unit, and things like that. All good candidates for optimisation, I think. But still, these are just general observations.

I would greatly appreciate it if someone familiar with these concepts and the engine could give me a more detailed overview of the problems you are experiencing. I may be able to help solve some part of them. I'm really looking forward to joining the development effort, and since I'm not planning to use Horde for anything at the moment, I thought this could be a great way to get familiar with the engine and contribute to the project.

I'm not sure if this is the most appropriate place for a post like this. I hope I didn't frustrate anyone :)

Best wishes,
Alexander


 Post subject: Re: Performance
PostPosted: 16.01.2009, 02:00 
Engine Developer

Joined: 10.09.2006, 15:52
Posts: 1217
Hello Alexander,
thanks for offering your help with optimization.
shaderdev wrote:
However, it seems that you have a lot of performance problems with Horde. One would guess that a project that claims to be a next-generation graphics engine with a focus on crowd rendering would be very fast and lightweight, but that doesn't seem to be the case.

I think putting it like that is a bit of an exaggeration; I don't think we have a performance problem. The current performance of Horde is not bad, but it is still far from what can be considered optimal. The memory problems you are talking about happen when you throw in a hundred thousand objects, which is honestly quite a crazy amount. But of course our goal is to make things as optimal as possible, so any help is very much appreciated. I don't think there is one specific area that needs special consideration. Everything can be made faster: the high-level scene graph processing and render queue handling as well as the low-level vector math. So if you want to help, the best thing is probably to start your profiler and see what eats the most time...
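
If a full profiler isn't handy, even a crude scoped timer dropped around the main stages gives a first picture. A minimal sketch, nothing Horde-specific (the stage names in the usage comment are just examples):

Code:
// Minimal RAII timer for a quick first look; not part of the engine.
#include <chrono>
#include <cstdio>

struct ScopedTimer
{
    const char *name;
    std::chrono::steady_clock::time_point start;

    explicit ScopedTimer( const char *n )
        : name( n ), start( std::chrono::steady_clock::now() ) {}

    ~ScopedTimer()
    {
        using namespace std::chrono;
        double ms = duration_cast< duration< double, std::milli > >(
            steady_clock::now() - start ).count();
        std::printf( "%s: %.3f ms\n", name, ms );
    }
};

// Usage inside the frame loop, e.g.:
// { ScopedTimer t( "scene graph update" ); updateSceneGraph(); }
// { ScopedTimer t( "render queue" );       buildRenderQueue(); }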


 Post subject: Re: Performance
PostPosted: 16.01.2009, 03:04 

Joined: 08.11.2006, 03:10
Posts: 384
Location: Australia
marciano wrote:
Hello Alexander,
thanks for offering your help with optimization.
shaderdev wrote:
However, it seems that you have a lot of performance problems with Horde. One would guess that a project that claims to be a next-generation graphics engine with a focus on crowd rendering would be very fast and lightweight, but that doesn't seem to be the case.

I think putting it like that is a bit of an exaggeration; I don't think we have a performance problem.
Yeah, in most cases that I've seen, the bottleneck of Horde applications is not the CPU, but probably the shaders on the GPU ;)

That said, there are some parts of the renderer that could be optimised for GPU usage:
1) Forward rendering could be modified to allow 'grouping' of multiple lights into one render pass
 - e.g. create a spotlight shader that has uniforms for 2, 3 or more spot-lights at once.
 - This would make the lighting shaders a lot more complex (the permutations can be managed with proper use of functions and includes), but it could drastically reduce the number of passes required when using lots of lights (rough sketch at the end of this post).
2) Deferred rendering could be modified to allow light-volumes to be defined by a mesh/model (instead of rendering lights as a screen-space quad), drastically reducing the number of pixels to be shaded. Also, the stencil buffer could be used to again reduce the number of pixels shaded by a large amount.


I just noticed that #1 might reduce CPU usage as well (fewer scene traversals?).
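
To make #1 a bit more concrete, here is a minimal CPU-side sketch of the batching idea (the struct and function names are just placeholders, not existing Horde3D code):

Code:
// Placeholder sketch, not existing Horde3D code: split the visible lights into
// fixed-size batches so that each forward pass shades up to MAX_LIGHTS_PER_PASS
// lights instead of a single one.
#include <algorithm>
#include <cstddef>
#include <vector>

struct Light { float pos[3]; float color[3]; float radius; };

const std::size_t MAX_LIGHTS_PER_PASS = 4;  // assumed batch size

void renderForwardBatched( const std::vector< Light > &visibleLights )
{
    for( std::size_t first = 0; first < visibleLights.size(); first += MAX_LIGHTS_PER_PASS )
    {
        std::size_t count = std::min( MAX_LIGHTS_PER_PASS, visibleLights.size() - first );

        // Upload 'count' lights as uniform arrays, bind the shader permutation
        // compiled for that light count and draw the geometry once per batch
        // (both calls below are placeholders):
        // bindLightUniforms( &visibleLights[first], count );
        // drawGeometryPass( count );
    }
}

With 8 visible lights and a batch size of 4, this would cut 8 lighting passes (and 8 traversals of the lit geometry) down to 2.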


 Post subject: Re: Performance
PostPosted: 16.01.2009, 07:06 

Joined: 21.08.2008, 11:44
Posts: 354
Alexander wrote:
What I'm interested in is the performance problems you guys have with the engine... from what I can tell from the forums, you have problems with memory consumption, memory access, utilising multiple cores, making good use of the vector unit, and things like that. All good candidates for optimisation, I think. But still, these are just general observations.
IMHO we can't get any noticeable vectorization [SSE] wins in the main body of the engine, because that code is simple and already fast. But by vectorizing the engine's vector math library [utMath.h] we could gain a lot of performance [at least 40%], because it is shared between all parts of the engine. I've performed some optimizations on utMath, but there are still problems: currently I can't compile the engine with the vectorized utMath under MSVC, while with MinGW/GCC it compiles without problems but the resulting build won't run.
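
To give an idea of the direction, here is a rough SSE sketch of a 4-component vector (illustrative names only, not the actual utMath.h classes):

Code:
// Illustrative only -- not the real utMath.h types. Shows the kind of SSE
// intrinsics a vectorized math class could be built on.
#include <xmmintrin.h>  // SSE1 intrinsics

struct Vec4SSE
{
    __m128 v;  // 16-byte aligned by definition of __m128

    Vec4SSE( float x, float y, float z, float w ) : v( _mm_set_ps( w, z, y, x ) ) {}
    explicit Vec4SSE( __m128 m ) : v( m ) {}

    Vec4SSE operator+( const Vec4SSE &rhs ) const { return Vec4SSE( _mm_add_ps( v, rhs.v ) ); }
    Vec4SSE operator*( float s ) const { return Vec4SSE( _mm_mul_ps( v, _mm_set1_ps( s ) ) ); }
};

// Dot product using only SSE1 shuffles, so it builds on older compilers too
inline float dot( const Vec4SSE &a, const Vec4SSE &b )
{
    __m128 m = _mm_mul_ps( a.v, b.v );                    // ax*bx, ay*by, az*bz, aw*bw
    __m128 s = _mm_add_ps( m, _mm_movehl_ps( m, m ) );    // lane0 = p0+p2, lane1 = p1+p3
    s = _mm_add_ss( s, _mm_shuffle_ps( s, s, 0x55 ) );    // lane0 = p0+p1+p2+p3
    float result;
    _mm_store_ss( &result, s );
    return result;
}

One thing worth checking regarding the MSVC/MinGW differences: __m128 members require 16-byte alignment, and aligned SSE loads/stores on objects that end up unaligned (e.g. heap-allocated scene data or elements of containers without an aligned allocator) typically crash at runtime, which could explain a build that compiles but won't run.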

As for multithreading, I think it's better to do it at the game engine level, to run audio, networking, AI and so on in parallel.

Thanks for the help, dear Alexander :wink:


 Post subject: Re: Performance
PostPosted: 17.01.2009, 17:09 
Engine Developer

Joined: 10.09.2006, 15:52
Posts: 1217
DarkAngel wrote:
2) Deferred rendering could be modified to allow light-volumes to be defined by a mesh/model (instead of rendering lights as a screen-space quad), drastically reducing the number of pixels to be shaded. Also, the stencil buffer could be used to again reduce the number of pixels shaded by a large amount.

I would rather go in that direction, since I think deferred/partly deferred rendering has a bright future.
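
For reference, a minimal sketch of the stencil-tested light volume approach in plain OpenGL 2.x calls (the draw and shader-binding calls are placeholders, not Horde3D's renderer code):

Code:
// Sketch only: classic two-pass stencil light volume for deferred shading.
// Assumes a GL 2.0 context and an extension loader (e.g. GLEW) for
// glStencilOpSeparate; drawLightVolumeMesh()/bindDeferredLightShader() are
// placeholders for the actual mesh draw and shader setup.
#include <GL/glew.h>

void drawDeferredLight()
{
    glEnable( GL_STENCIL_TEST );
    glClear( GL_STENCIL_BUFFER_BIT );

    // Pass 1: mark the pixels where scene geometry lies inside the volume.
    // No color/depth writes; back faces that fail the depth test increment
    // stencil, front faces that fail decrement it, so covered pixels end up
    // with stencil != 0.
    glColorMask( GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE );
    glDepthMask( GL_FALSE );
    glEnable( GL_DEPTH_TEST );
    glDisable( GL_CULL_FACE );
    glStencilFunc( GL_ALWAYS, 0, 0xFF );
    glStencilOpSeparate( GL_BACK, GL_KEEP, GL_INCR_WRAP, GL_KEEP );
    glStencilOpSeparate( GL_FRONT, GL_KEEP, GL_DECR_WRAP, GL_KEEP );
    // drawLightVolumeMesh();

    // Pass 2: shade only the marked pixels, drawing the volume's back faces
    // so the light still works when the camera is inside it.
    glColorMask( GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE );
    glStencilFunc( GL_NOTEQUAL, 0, 0xFF );
    glStencilOp( GL_KEEP, GL_KEEP, GL_KEEP );
    glEnable( GL_CULL_FACE );
    glCullFace( GL_FRONT );
    glDisable( GL_DEPTH_TEST );
    // bindDeferredLightShader();
    // drawLightVolumeMesh();

    // Restore common state
    glCullFace( GL_BACK );
    glDepthMask( GL_TRUE );
    glDisable( GL_STENCIL_TEST );
}

Compared to a full-screen quad, only the pixels actually covered by the light volume (and with geometry inside it) run the expensive lighting shader.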


 Post subject: Re: Performance
PostPosted: 18.01.2009, 16:47 

Joined: 14.01.2009, 18:27
Posts: 3
marciano wrote:
Hello Alexander,
thanks for offering your help with optimization.
shaderdev wrote:
However, it seems that you have a lot of performance problems with Horde. One would guess that a project that claims to be a next-generation graphics engine with a focus on crowd rendering would be very fast and lightweight, but that doesn't seem to be the case.

I think putting it like that is a bit of an exaggeration; I don't think we have a performance problem. The current performance of Horde is not bad, but it is still far from what can be considered optimal. The memory problems you are talking about happen when you throw in a hundred thousand objects, which is honestly quite a crazy amount. But of course our goal is to make things as optimal as possible, so any help is very much appreciated. I don't think there is one specific area that needs special consideration. Everything can be made faster: the high-level scene graph processing and render queue handling as well as the low-level vector math. So if you want to help, the best thing is probably to start your profiler and see what eats the most time...


Hello,

Excuse me if the above sounded a bit exaggerated, that wasn't what I had in mind. And yeah, definitely, the best thing is to just run the profiler and see what turns up. The thing is that I've had some problems with my PCs lately and I only just managed to get Vista onto my laptop (the desktop probably burned out or something... I don't have time to deal with it at the moment). What I wanted in the meantime was to learn a bit more about Horde's performance, because I'm generally interested in performance optimizations and hardware. But now that I have a running PC I will be able to actually look at the code and do some captures. The only thing that concerns me is the OpenGL support in Vista... I'm not sure if the engine will even run. Actually I don't have any experience with how OpenGL applications perform on Vista. But we will see...

About changing the shaders to support multiple lights in one pass... well, this is a must! I wouldn't even consider an implementation that does one light per pass (or anything other than all lights per pass :) ). Not sure why you decided to take this approach...

I'm a bit rusty with OpenGL and GLSL... is there any tool for static analysis / scheduling of the output from the GLSL compiler, like nvshaderperf for Cg? The last time I used OpenGL on Win32 was 2-3 years ago...

Greetings,
Alexander


 Post subject: Re: Performance
PostPosted: 18.01.2009, 17:59 
Engine Developer

Joined: 10.09.2006, 15:52
Posts: 1217
shaderdev wrote:
The only thing that concerns me is the OpenGL support in Vista... I'm not sure if the engine will even run.

My working machine has Vista 64 installed, so no worries. Driver performance used to be rather bad in the first days but that seems to have improved a lot, especially for NVidia.

shaderdev wrote:
About changing the shaders to support multiple lights in one pass... well, this is a must! I wouldn't even consider an implementation that does one light per pass (or anything other than all lights per pass :) ). Not sure why you decided to take this approach...

Well, think about it... the multiple-lights approach brings up many problems, of which the most obvious are (assuming up to 4 lights per pass): shader permutations (how do you handle 1, 2, 3 or 4 lights in the shader?), heavily increased memory usage (you need up to 4 shadow maps at the same time if you don't do something special like a deferred shadow mask pass), and texture sampler limitations (if you have complex lights with cookies they can exceed the 16 available samplers). And finally, I'm more of a deferred shading fan...

shaderdev wrote:
I'm a bit rusty with OpenGL and GLSL... is there any tool for static analysis / scheduling of the output from the GLSL compiler, like nvshaderperf for Cg?

nvshaderperf supports GLSL, although I never tried it. There is also gDEBugger for OpenGL profiling and debugging.


 Post subject: Re: Performance
PostPosted: 18.01.2009, 18:21 

Joined: 14.01.2009, 18:27
Posts: 3
Quote:
My working machine has Vista 64 installed, so no worries. Driver performance used to be rather bad in the first days but that seems to have improved a lot, especially for NVidia.


Yeah, just tested the samples and they work... however, I now have the ATI driver problem... not sure why it's not fixed yet.
I don't get any rendering of the skinned meshes on a Mobility Radeon HD3470 with the 8.12 drivers... annoying! I guess there is no solution to this?

Quote:
Well, think about it... the multiple-lights approach brings up many problems, of which the most obvious are (assuming up to 4 lights per pass): shader permutations (how do you handle 1, 2, 3 or 4 lights in the shader?), heavily increased memory usage (you need up to 4 shadow maps at the same time if you don't do something special like a deferred shadow mask pass), and texture sampler limitations (if you have complex lights with cookies they can exceed the 16 available samplers). And finally, I'm more of a deferred shading fan...


Well, yeah... actually this is a problem if you want to support shadow maps within the engine, but the same problem exists for nearly all graphics effects. By adding new effects natively to the engine you increase the number of permutations, and eventually you reach a point where you can't add any new effects, or you have to sacrifice performance with a more data-driven shader system. I personally think that a graphics engine should just expose the required data and let the user do whatever he wants with it, because it's much easier for the user to work around the permutations than it is for the engine. For example, my application may need just a single shadow pass from a single light source; then I won't have any problem with complex / multi-pass shaders.

Quote:
nvshaderperf supports GLSL, although I never tried it. There is also gDEBugger for OpenGL profiling and debugging.


I didn't know it supports GLSL... I've only used it with Cg. That's great. gDEBugger is not a very good tool unfortunately, but I guess that's the most you can get out of the driver... the PS3 has the best profiling tools!


 Post subject: Re: Performance
PostPosted: 18.01.2009, 18:45 

Joined: 02.12.2008, 21:25
Posts: 8
Have you used the patches suggested in here? If you've got a beefy GPU they should help considerably in getting better fps.


 Post subject: Re: Performance
PostPosted: 18.01.2009, 23:44 
Engine Developer

Joined: 10.09.2006, 15:52
Posts: 1217
shaderdev wrote:
Quote:
My working machine has Vista 64 installed, so no worries. Driver performance used to be rather bad in the first days but that seems to have improved a lot, especially for NVidia.


Yeah, just tested the samples and they work... however, I now have the ATI driver problem... not sure why it's not fixed yet.
I don't get any rendering of the skinned meshes on a Mobility Radeon HD3470 with the 8.12 drivers... annoying! I guess there is no solution to this?

Ahh, that's really annoying, I thought that had all been fixed. Can someone else confirm whether the problem still exists on some ATI hardware, even with the latest drivers?


 Post subject: Re: Performance
PostPosted: 19.01.2009, 16:35 

Joined: 23.06.2008, 10:23
Posts: 21
Location: Sweden
marciano wrote:
shaderdev wrote:
Quote:
My working machine has Vista 64 installed, so no worries. Driver performance used to be rather bad in the first days but that seems to have improved a lot, especially for NVidia.


Yeah, just tested the samples and they work... however, I now have the ATI driver problem... not sure why it's not fixed yet.
I don't get any rendering of the skinned meshes on a Mobility Radeon HD3470 with the 8.12 drivers... annoying! I guess there is no solution to this?

Ahh, that's really annoying, I thought that had all been fixed. Can someone else confirm whether the problem still exists on some ATI hardware, even with the latest drivers?


The samples render just fine on my 4850 with 8.12 drivers.

