Thanks
Quote:
Another tricky issue is signaling shutdown / g_Abort: the while(!g_Abort) could be optimized away. But some kind of memory barrier should help here, at least gcc provides primitives for that. No volatile tricks, please.
A memory barrier isn't needed, as the CPU's ordering of reads/writes from/to the abort flag is unimportant - we just need to stop the compiler from re-ordering the accesses or optimising the flag read away entirely.
Making that variable volatile should ensure that the compiler actually reads the value of the flag each iteration.
Unfortunately, newer versions of MSVC actually do insert memory barriers automatically around volatile accesses - that isn't required in our case, so it's overkill, but hopefully not a performance-killer.
Quote:
We can implement a threadsafe queue with atomic operations but the pop operation would have to block if there is nothing to do. Implementing that blocking efficiently needs some OS support. The same is true for a waitUntilEmpy function - but here we could at least help processing the work.
I don't like the idea of pop blocking - IMO it should just return false if there is nothing for it to pop. In my example, this would result in a tight loop, so you could write something like this to cause it to block:
Code:
if( g_Work.pop_front( work ) )
{
    work.first( work.second );
}
else
{
    g_Work.wait_until_not_empty();
}
In my engine, I wouldn't want it to ever block - if the queue becomes empty then I will immediately re-use the thread to perform other tasks.
As for the efficient pausing/signalling/blocking - we can probably 'borrow' some code from GLFW, which implements cross-platform pause/resume.
Quote:
A way around this would be if the user not only creates the threads for us, but also a queue implementation with efficient waiting.
In the example that I gave, the queue is implemented in the utility library, which means the user doesn't have to use it if they don't want to - they're free to use their own implementation.
Quote:
But this again can be more complicated than it sounds: My current implementation supports having independent thread pools, so that fire and forget tasks don't interfere with other sections of the engine were we have to wait for the results.
I don't use this feature at the moment, but it could be useful later. I'll think more about this.
If the user has their own work-queuing system, they should be able to correctly schedule horde tasks by passing their own function to Horde3D::SetNewTaskCallback. For example, within your callback you could put the tasks into a high-priority queue so they don't end up waiting behind F&F tasks.