Automatic batching is far from trivial. It can require reordering or merging index buffers and generating texture atlases. Skinned meshes complicate the matter even more. To draw several characters in a single batch, their skeletons would need to be merged, which in turn means updating the joint indices in the vertex data. Doing that per frame is dangerous because it could introduce pipeline stalls. And of course only a limited number of registers is available for the joints.
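To illustrate why merging index buffers is more than a simple concatenation, here is a minimal sketch. The Mesh struct and appendMesh function are hypothetical names made up for this example and are not Horde's actual data layout; the sketch assumes plain triangle lists with 32-bit indices and three floats per vertex position.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal mesh type for illustration only.
struct Mesh {
    std::vector<float> positions;        // x, y, z per vertex
    std::vector<std::uint32_t> indices;  // triangle list
};

// Appends src to dst so that both can be drawn with a single call.
// Every index from src has to be shifted by the number of vertices
// that were already stored in dst before the append.
void appendMesh(Mesh& dst, const Mesh& src) {
    const std::uint32_t baseVertex =
        static_cast<std::uint32_t>(dst.positions.size() / 3);

    dst.positions.insert(dst.positions.end(),
                         src.positions.begin(), src.positions.end());

    dst.indices.reserve(dst.indices.size() + src.indices.size());
    for (std::uint32_t idx : src.indices) {
        dst.indices.push_back(idx + baseVertex);
    }
}
```

The same kind of rebasing would be needed for the joint indices of skinned meshes once skeletons are merged, which is the part that gets expensive when it has to happen every frame.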
I think batching should be addressed in a layer above Horde where more domain/application-specific information is available.
Drawcall performance is mainly a D3D9 problem and does not exist to that degree in OpenGL (or D3D10). The main reason is that parts of the D3D9 driver run in kernel mode and other parts in user mode, and switching between the two is expensive. Of course, there is also some general API overhead in OpenGL, and changing states can require a lot of internal reconfiguration in the driver. That's why we should enhance the render state sorting in Horde to help improve performance. On the other hand, if you sort by state only, you can get more overdraw and become fillrate limited, since you benefit less from the early-z rejection mechanisms, which work best with geometry sorted by depth. And there are even more things to consider, so all in all it is not trivial.
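One common way to balance the two goals is to pack both criteria into a single sort key. The following is only a sketch under my own assumptions and not how Horde currently sorts: DrawItem, makeSortKey and sortDrawItems are hypothetical names, stateId stands for whatever identifies a shader/material combination, and maxDepth is assumed to be greater than zero.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical draw record for illustration only.
struct DrawItem {
    std::uint32_t stateId;  // ID of a shader/material/texture combination
    float viewDepth;        // distance from the camera, expected in [0, maxDepth]
};

// State goes into the high bits, quantized depth into the low 16 bits.
// Items are therefore grouped by state and ordered front to back
// inside each state group.
std::uint64_t makeSortKey(const DrawItem& item, float maxDepth) {
    float t = item.viewDepth / maxDepth;
    t = std::min(std::max(t, 0.0f), 1.0f);
    const std::uint64_t depthBits = static_cast<std::uint64_t>(t * 65535.0f);
    return (static_cast<std::uint64_t>(item.stateId) << 16) | depthBits;
}

void sortDrawItems(std::vector<DrawItem>& items, float maxDepth) {
    std::sort(items.begin(), items.end(),
              [maxDepth](const DrawItem& a, const DrawItem& b) {
                  return makeSortKey(a, maxDepth) < makeSortKey(b, maxDepth);
              });
}
```

With this layout state changes are minimized first and early-z only helps within a state group. Putting a coarse depth bucket into the high bits instead would shift the balance towards less overdraw at the cost of more state changes, so which layout is better depends on whether the scene is CPU or fillrate limited.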
Can you give some more details about your scene (number of static and animated objects, light sources, etc.) and the current performance (FPS on which hardware)? Do you think you are GPU limited at the moment, or is the problem rather the animation, scene management and drawcall overhead?