Horde3D
http://horde3d.org/forums/

Redline Rush
http://horde3d.org/forums/viewtopic.php?f=4&t=1885
Page 1 of 1

Author:  attila [ 20.05.2013, 15:08 ]
Post subject:  Redline Rush

Our new game Redline Rush is now available on the iOS App Store. We will soon publish it on Google Play and the Amazon app store, too.

Of course we are using the excellent Horde3D for rendering; Bullet is used for physics.

More info about the game: http://www.dogbytegames.com/redline_rush.html


Author:  Volker [ 20.05.2013, 15:34 ]
Post subject:  Re: Redline Rush

Looks awesome

Author:  Siavash [ 20.05.2013, 19:33 ]
Post subject:  Re: Redline Rush

Looks interesting. Do you mind posting some information about the horsepower of the target hardware? And which part of the engine is slow?

Author:  attila [ 20.05.2013, 20:50 ]
Post subject:  Re: Redline Rush

Siavash wrote:
Looks interesting. Do you mind posting some information about the horsepower of the target hardware? And which part of the engine is slow?


Minimum hardware requirements: iPhone 3GS+ and iPad 1+.
Post effects (radial motion blur, glow, coloring, vignette) are enabled on iPad 2+ and iPhone 4S+.
On iOS, textures are compressed to PVRTC.

On Android, OpenGL ES 2.0+ devices are supported. Post effects are currently disabled, but we may try to support them (maybe auto-enabling them based on a benchmark at startup).
On Android, textures are compressed to ETC1. Textures with alpha are stored as RGBA8.
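
The startup benchmark idea could look roughly like the sketch below. It is only an illustration: the function name, the frame-time samples and the budget value are assumptions, not code from the game.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper: decide from frame times measured at startup
// whether post effects should be switched on.
// frameTimesMs: frame durations in milliseconds from a short benchmark scene.
// budgetMs:     the longest average frame time we are willing to accept.
bool shouldEnablePostEffects(const std::vector<double>& frameTimesMs, double budgetMs) {
    if (frameTimesMs.empty()) return false;   // no measurements: stay on the safe side
    const double total = std::accumulate(frameTimesMs.begin(), frameTimesMs.end(), 0.0);
    const double average = total / static_cast<double>(frameTimesMs.size());
    return average <= budgetMs;
}
```

The benchmark scene would have to be rendered with post effects enabled for the measurement to be meaningful.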

We optimized culling and material setup in h3d based on profiling in Xcode.
In our pipeline there are separate opaque, alpha-test and alpha-blend render commands. In svn trunk, h3d culls the whole scene three times; we cull only once, then sort the queue for each command.
We are using a pipeline stage material. Trunk h3d sets uniforms/samplers twice in this case; we modified it to set them only once.
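
The cull-once idea can be sketched roughly as follows. This is not the actual h3d code or our patch: all type and function names are hypothetical, the visibility flag stands in for the real frustum test, and the sort orders (front to back for opaque and alpha-test, back to front for alpha-blend) are an assumption.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration; these are not Horde3D's actual classes.
enum class Pass : std::uint8_t { Opaque, AlphaTest, AlphaBlend };

struct RenderItem {
    int   nodeId;
    Pass  pass;       // which render command draws this item
    float viewDepth;  // distance from the camera along the view direction
    bool  visible;    // result of the frustum test, assumed precomputed here
};

struct RenderQueues {
    std::vector<RenderItem> opaque, alphaTest, alphaBlend;
};

// Walk the scene once, bucket visible items by pass, then sort each bucket.
RenderQueues buildQueues(const std::vector<RenderItem>& scene) {
    RenderQueues q;
    for (const RenderItem& item : scene) {
        if (!item.visible) continue;  // the single culling pass
        switch (item.pass) {
            case Pass::Opaque:     q.opaque.push_back(item);     break;
            case Pass::AlphaTest:  q.alphaTest.push_back(item);  break;
            case Pass::AlphaBlend: q.alphaBlend.push_back(item); break;
        }
    }
    auto frontToBack = [](const RenderItem& a, const RenderItem& b) {
        return a.viewDepth < b.viewDepth;
    };
    auto backToFront = [](const RenderItem& a, const RenderItem& b) {
        return a.viewDepth > b.viewDepth;
    };
    std::sort(q.opaque.begin(), q.opaque.end(), frontToBack);
    std::sort(q.alphaTest.begin(), q.alphaTest.end(), frontToBack);
    std::sort(q.alphaBlend.begin(), q.alphaBlend.end(), backToFront);
    return q;
}
```

Each render command then only iterates its own already-sorted queue instead of triggering another scene traversal.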

I think these optimizations are worth implementing in the official horde3d, but our implementation is still somewhat messy.

more info about changes: http://horde3d.org/forums/viewtopic.php?f=8&t=1737

Author:  Siavash [ 21.05.2013, 04:26 ]
Post subject:  Re: Redline Rush

Sounds like you have really pushed Horde3D to its limits on those devices. I'm wondering if using software occlusion culling will make any difference for you and improve things.

Author:  attila [ 21.05.2013, 05:41 ]
Post subject:  Re: Redline Rush

Siavash wrote:
Sounds like you have really pushed Horde3D to its limits on those devices.

The changes I mentioned were done in 2 weeks. There is still room for improvement: we haven't done any low-level math optimization like in (https://github.com/attilaz/horde3d-x/tree/fastmath), and h3d currently uses only 1 thread/core for rendering. Parallelization of rendering could help to push the limits further on more modern dual-core/quad-core CPUs.

Siavash wrote:
I'm wondering if using software occlusion culling will make any difference for you and improve things.


Interesting article, thanks. In this particular project I don't think it would help a lot: the road is nearly straight, so there is not much occlusion. We culled objects based on size to avoid rendering small objects.
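
A size-based cull could be sketched like this. It assumes that size means the approximate size as seen from the camera (bounding sphere radius divided by distance); the struct, the function name and the threshold are hypothetical and not taken from the game.

```cpp
#include <cmath>

// Hypothetical bounding sphere; names and the threshold are illustrative assumptions.
struct Sphere { float x, y, z, radius; };

// Returns true when the sphere's approximate apparent size (radius / distance)
// falls below minRatio, i.e. the object is too small on screen to be worth drawing.
bool isTooSmall(const Sphere& s, float camX, float camY, float camZ, float minRatio) {
    const float dx = s.x - camX;
    const float dy = s.y - camY;
    const float dz = s.z - camZ;
    const float dist = std::sqrt(dx * dx + dy * dy + dz * dz);
    if (dist <= s.radius) return false;  // camera inside or touching the sphere: always draw
    return (s.radius / dist) < minRatio;
}
```

With minRatio = 0.01, for example, a sphere of radius 0.5 is skipped once it is more than 50 units away.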

Author:  Siavash [ 21.05.2013, 07:30 ]
Post subject:  Re: Redline Rush

attila wrote:
Siavash wrote:
Sounds like you have really pushed Horde3D to its limits on those devices.

The changes I mentioned were done in 2 weeks. There is still room for improvement: we haven't done any low-level math optimization like in (https://github.com/attilaz/horde3d-x/tree/fastmath), and h3d currently uses only 1 thread/core for rendering. Parallelization of rendering could help to push the limits further on more modern dual-core/quad-core CPUs.

I've not tried the fastmath branch on a real-world game yet, but there are some small improvements on the Chicago sample, which takes less time to finish its animations. During my experiments I constantly compared the changes against the disassembly of the original code to make sure the generated code is small and efficient on all hardware (ARM and desktop CPUs) and compilers (ICC, GCC and Clang).

I've also tried to multithread Horde3D using OpenMP and Intel TBB by converting hot-spot loops into parallel for loops, but that didn't scale very well because of false sharing issues and ... . So I guess the best option is to leave Horde3D single-threaded and run it in the main thread, rendering frame n while you simulate the physics of frame n+1 in a separate thread.
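
A minimal sketch of that frame pipelining, using only the C++ standard library. The simulate and render functions are hypothetical stand-ins (no Horde3D or physics library calls are made), and launching a task per frame with std::async is just the simplest way to show the idea; a real game would more likely keep a persistent worker thread.

```cpp
#include <future>
#include <vector>

// Hypothetical game state, copied between threads so neither side shares data.
struct WorldState { int frame; float carPosition; };

// Stand-in for a physics step; a real game would call its physics engine here.
WorldState simulate(const WorldState& prev, float dt) {
    return WorldState{prev.frame + 1, prev.carPosition + 30.0f * dt};
}

// Stand-in for submitting a frame to the renderer; records which frame was drawn.
void render(const WorldState& state, std::vector<int>& renderedFrames) {
    renderedFrames.push_back(state.frame);
}

std::vector<int> runFrames(int frameCount, float dt) {
    std::vector<int> renderedFrames;
    WorldState current{0, 0.0f};
    for (int i = 0; i < frameCount; ++i) {
        // Physics for frame n+1 runs on a worker thread, on its own copy of the state.
        std::future<WorldState> next =
            std::async(std::launch::async, simulate, current, dt);
        render(current, renderedFrames);  // main thread renders frame n meanwhile
        current = next.get();             // sync point before the next iteration
    }
    return renderedFrames;
}
```

Because the worker only touches its own copy of the state, the renderer itself stays single-threaded and needs no locking.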

Regarding SIMD optimizations using SSE or NEON units, an Array of Structures data layout cancels out all of the gains. A massive change is required to convert most of the data structures to a Structure of Arrays layout to minimize cache misses and memory loads/stores. I believe that would fix the multicore scaling issue too and make more room for optimizations.
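
To illustrate the difference between the two layouts, here is a small sketch. The node fields and function names are made up for the example and do not correspond to Horde3D's real scene node classes.

```cpp
#include <cstddef>
#include <vector>

// Array of Structures: each node's fields are interleaved in memory, so a loop
// over positions also drags materialId and visible through the cache.
struct NodeAoS { float x, y, z; int materialId; bool visible; };

// Structure of Arrays: each field is stored contiguously, so a loop over x/y/z
// touches only the data it needs and is easier to vectorize with SSE or NEON.
struct NodesSoA {
    std::vector<float> x, y, z;
    void add(float px, float py, float pz) {
        x.push_back(px); y.push_back(py); z.push_back(pz);
    }
    std::size_t size() const { return x.size(); }
};

void translateAoS(std::vector<NodeAoS>& nodes, float dx, float dy, float dz) {
    for (NodeAoS& n : nodes) { n.x += dx; n.y += dy; n.z += dz; }
}

void translateSoA(NodesSoA& nodes, float dx, float dy, float dz) {
    const std::size_t count = nodes.size();
    // Three simple loops over contiguous floats; each is a natural SIMD candidate.
    for (std::size_t i = 0; i < count; ++i) nodes.x[i] += dx;
    for (std::size_t i = 0; i < count; ++i) nodes.y[i] += dy;
    for (std::size_t i = 0; i < count; ++i) nodes.z[i] += dz;
}
```

Both functions compute the same result; only the memory layout, and therefore the cache and SIMD behaviour, differs.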

Author:  attila [ 21.05.2013, 12:09 ]
Post subject:  Re: Redline Rush

Siavash wrote:
I've not tried the fastmath branch on a real-world game yet, but there are some small improvements on the Chicago sample, which takes less time to finish its animations. During my experiments I constantly compared the changes against the disassembly of the original code to make sure the generated code is small and efficient on all hardware (ARM and desktop CPUs) and compilers (ICC, GCC and Clang).

I've also tried to multithread Horde3D using OpenMP and Intel TBB by converting hot-spot loops into parallel for loops, but that didn't scale very well because of false sharing issues and ... . So I guess the best option is to leave Horde3D single-threaded and run it in the main thread, rendering frame n while you simulate the physics of frame n+1 in a separate thread.

Regarding SIMD optimizations using SSE or NEON units, an Array of Structures data layout cancels out all of the gains. A massive change is required to convert most of the data structures to a Structure of Arrays layout to minimize cache misses and memory loads/stores. I believe that would fix the multicore scaling issue too and make more room for optimizations.


I think we need some near-real-world example for CPU profiling/optimizing, as the current samples are way too simple for this. On iOS we have approximately 1000 nodes in the scene and issue 200-300 draw calls. In a desktop app these numbers could be 5x-10x higher.

Author:  Siavash [ 21.05.2013, 13:53 ]
Post subject:  Re: Redline Rush

attila wrote:
I think we need some near-real-world example for CPU profiling/optimizing, as the current samples are way too simple for this. On iOS we have approximately 1000 nodes in the scene and issue 200-300 draw calls. In a desktop app these numbers could be 5x-10x higher.

Yes, a mini game for finding and profiling hot spots inside Horde3D is definitely required. Profiling or testing different rendering backends with the currently bundled samples is totally pointless.

Author:  mikel [ 18.01.2015, 21:23 ]
Post subject:  Re: Redline Rush

This looks so beautiful. Hopefully I can make something like that (plus my own ideas) someday. For desktop, because C# for Android (Xamarin) is too expensive.

All times are UTC + 1 hour