I have been working with Horde on a scene of medium complexity (30,000 polys, 30 materials, 50 textures and 12 shaders). However, my loading times seemed quite long (over 2 minutes in debug), so I decided to investigate. The vast majority of the time is spent loading textures, so I experimented with adding texture compression. I have found this to be a big win, and it allows me to iterate on changes much more quickly. I thought it might be useful to share what I've found, and it would be great to discuss how this could be implemented in the main source.
My current scene contains 50 textures, roughly evenly divided between .tga and .png. They take just over 90 seconds to load on a debug build and just under 45 seconds on a release build. The majority of the textures are 512x512; the biggest are 2048x1024. The size of the textures on disc is a surprisingly large 100 MB. Most of this data comes from the .tga files, although more heavily compressed formats like .png generally take longer to load, as their loading is bound by the decompression process. My system is a 1.66 GHz dual-core Intel with an 8600 GT graphics card running Vista. I have tested the same and similar scenes on a variety of machines with similar results; predictably, loading gets proportionally faster in line with processor speed.
I replaced the stbi loading code with the SOIL library, which is based on the same stbi texture loading code Horde uses but adds support for loading .dds textures. I also added a script that compresses all the textures referenced in a .dae file and copies them into the correct directory. DirectDraw Surface (.dds) textures are designed for use with DirectX, but are compatible with OpenGL. However, OpenGL's coordinate scheme stores textures bottom to top along the y axis, rather than top to bottom, so the images have to be flipped. Compressing the textures to .dds took the data size from 100 MB down to 17.5 MB, and that figure includes all mip-maps. Initially I loaded the .dds files and flipped them at run-time before uploading them. This roughly halved the texture loading time, from 45 seconds to 21 seconds on a release build (results under debug were similar). I then took the Nvidia texture compressor and added the ability to flip the textures as they were being compressed. Now that the textures are loaded directly, without flipping them, they load in 9 seconds on a debug build, down from 90 seconds, which makes a huge difference!
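To illustrate what flipping a compressed texture involves, here is a minimal C++ sketch of a vertical flip for a single mip level. It is not the code I used, and it makes assumptions that are not stated above: the data is DXT1 (8 bytes per 4x4 block) and the width and height are non-zero multiples of 4. The function name `flipDxt1Vertical` is made up for the example. Small mip levels (height below 4) and DXT3/DXT5 alpha blocks need extra handling that is left out here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Flips one DXT1 (BC1) mip level vertically, in place.
// Assumptions (not from the original post): the data is DXT1, i.e. 8 bytes
// per 4x4 block, and width and height are non-zero multiples of 4.
// Returns false and leaves the data untouched if those assumptions fail.
bool flipDxt1Vertical(std::vector<std::uint8_t>& data,
                      std::size_t width, std::size_t height)
{
    const std::size_t blockBytes = 8;
    if (width == 0 || height == 0 || width % 4 != 0 || height % 4 != 0)
        return false;

    const std::size_t blocksX = width / 4;
    const std::size_t blocksY = height / 4;
    const std::size_t rowBytes = blocksX * blockBytes;
    if (data.size() != rowBytes * blocksY)
        return false;

    std::uint8_t* base = data.data();

    // Step 1: reverse the order of the block rows (top swaps with bottom).
    for (std::size_t top = 0, bottom = blocksY - 1; top < bottom; ++top, --bottom)
        std::swap_ranges(base + top * rowBytes,
                         base + (top + 1) * rowBytes,
                         base + bottom * rowBytes);

    // Step 2: inside every block, reverse the four rows of 2-bit colour
    // indices. Bytes 0..3 hold the two endpoint colours and stay as they
    // are; bytes 4..7 hold one pixel row each.
    for (std::size_t i = 0; i < data.size(); i += blockBytes) {
        std::swap(base[i + 4], base[i + 7]);
        std::swap(base[i + 5], base[i + 6]);
    }
    return true;
}
```

Because the flip only shuffles bytes, it never decompresses anything, but it still touches every block of every texture at load time. Doing the same work once in the offline compressor removes that cost entirely.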
At the resource loading stage I hacked in a check for texture resources that looks for the same filename with a .dds extension before trying to load the actual file. To implement this in a more maintainable way that could be used in the main branch of Horde, there are a few options:
Create a custom file format at run-time: Horde could load a custom compressed file format (similar to .dds, or just a flipped .dds file) if it exists on disc; otherwise it could create this file whenever it loads any other image type. Whilst this doesn't require an offline tool and is platform independent, it means that loading times will be unpredictable - something that loads in a flash and doesn't even need a loading screen on my machine will hang for 2 minutes on somebody else's machine as 50 .tgas are loaded, compressed and written to disc.
Just support loading of .dds: Moving over to SOIL seems like a good idea, as doing so will give support for .dds textures. Combining this with a simple run-time flip can halve the loading time of a typical scene. This is the simplest solution, but it gives less than half the possible speed boost. I also don't know how good support for .dds file creation is on other platforms. Another downside is that, in general, artists don't use .dds files - really, compression is best supported by the engine, not the content.
Add a cross-platform texture compression tool: Support an external tool that pre-compresses textures, or add our own. This is quite easy to make using something like SOIL or FreeImage. Horde could then either support .dds plus a tool that compresses and pre-flips textures, or use a new texture format that is similar to .dds but already flipped.
If there's a general consensus on how best to implement this change, I would be happy to do so and ideally integrate it into the main branch.