Working on this I realized our artists have switched pretty much all of our meshes to multiple surfaces with different albedo colors instead of using textures (makes it much easier for me).
So using the forum post above - I was able to get it all working fairly easily - I can scan the surfaces, find which one has a face matching the collision point, and read off the color.
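For anyone wanting to try the same thing, here is a minimal sketch of that scan in plain Python rather than engine code. The `Surface` class, `point_on_face`, and `color_at` are hypothetical names made up for illustration, and the tolerance `eps` is an assumed value. It assumes each surface carries a single albedo color and that the hit point is in the same space as the mesh vertices.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


@dataclass
class Surface:
    """Hypothetical stand-in for one mesh surface with a solid albedo color."""
    albedo: Vec3
    vertices: List[Vec3]
    faces: List[Tuple[int, int, int]]


def point_on_face(p: Vec3, a: Vec3, b: Vec3, c: Vec3, eps: float = 1e-4) -> bool:
    """True if p lies within eps of the triangle's plane and inside the triangle."""
    ab, ac, ap = sub(b, a), sub(c, a), sub(p, a)
    n = cross(ab, ac)
    n2 = dot(n, n)
    if n2 == 0.0:
        return False  # degenerate triangle, skip it
    d = dot(ap, n)
    if d * d > eps * eps * n2:
        return False  # too far from the plane of the face
    # Barycentric weights of b and c for p = a + s*ab + t*ac
    s = dot(cross(ap, ac), n) / n2
    t = dot(cross(ab, ap), n) / n2
    tol = 1e-6
    return s >= -tol and t >= -tol and s + t <= 1.0 + tol


def color_at(surfaces: List[Surface], p: Vec3) -> Optional[Tuple[Vec3, int, int]]:
    """Scan every face of every surface; return (albedo, surface index, face index)."""
    for si, surf in enumerate(surfaces):
        for fi, (i, j, k) in enumerate(surf.faces):
            if point_on_face(p, surf.vertices[i], surf.vertices[j], surf.vertices[k]):
                return surf.albedo, si, fi
    return None
```

This is a brute-force linear scan, so the cost grows with the total face count of the mesh.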
For those who may be interested, I was initially losing about 200-300 µs on the whole calculation, which requires the MeshTool.
But after optimizing so that I cache the last mesh, face and surface (which usually don't change), this was reduced to about 50 µs, which is quite acceptable. I can have maybe 10-20 of these sensors running at 30 Hz without much impact on performance.
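The caching idea can be sketched roughly like this, again in plain Python and not the actual engine code. `LastHitCache` and the `contains` predicate are hypothetical names. `contains(face, point)` stands for whatever exact face test is in use, and `full_scans` is only there to show how often the slow path runs.

```python
from typing import Any, Callable, Optional, Sequence


class LastHitCache:
    """Remember the face that matched last time and test it first.

    Assumption: consecutive sensor readings usually land on the same face,
    so the full scan is only needed when the cached face stops matching.
    """

    def __init__(self, faces: Sequence[Any], contains: Callable[[Any, Any], bool]):
        self.faces = faces
        self.contains = contains
        self.last: Optional[int] = None  # index of the last matching face
        self.full_scans = 0              # how many times the slow path ran

    def lookup(self, point: Any) -> Optional[int]:
        # Fast path: the previous face still contains the point.
        if self.last is not None and self.contains(self.faces[self.last], point):
            return self.last
        # Slow path: scan every face and remember the new match (if any).
        self.full_scans += 1
        self.last = None
        for i, face in enumerate(self.faces):
            if self.contains(face, point):
                self.last = i
                break
        return self.last
```

The same pattern extends to keeping the per-mesh face data around between readings, so it is only rebuilt when the sensor hits a different mesh.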
We often use simple primitives with a solid color - obviously those are much faster. Mesh size will be a factor in the cost as well.
So for now I am running well - in the next few weeks I will solve two additional problems:
(1) Instead of making an exact point-to-face match, I'll search for the nearest best-fit face. This will allow me to get a decent color approximation if I use a simple collision shape such as a cube but the visual mesh has small differences. Eventually I think I can solve this exactly by running a second ray trace against the exact collision once I detect a hit on the approximate collision. I keep the simple collision because the objects are part of rigid bodies.
(2) Make it work for a textured mesh (the original goal). If I get this working I'll post some performance numbers here.
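For item (1), one possible way to do the nearest best-fit search is to compute the closest point on each triangle and keep the face with the smallest distance. The sketch below is plain Python and only an illustration of that idea, not the implementation I will end up with. `closest_point_on_triangle` follows the standard region-based closest-point test, `nearest_face_color` is a hypothetical helper, and surfaces are assumed to be simple `(albedo, vertices, faces)` tuples with non-degenerate triangles.

```python
import math
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
# Assumed layout: (albedo color, vertex list, list of vertex index triples)
Surface = Tuple[Vec3, List[Vec3], List[Tuple[int, int, int]]]


def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def add(a, b): return (a[0] + b[0], a[1] + b[1], a[2] + b[2])
def scale(a, k): return (a[0] * k, a[1] * k, a[2] * k)
def dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def closest_point_on_triangle(p: Vec3, a: Vec3, b: Vec3, c: Vec3) -> Vec3:
    """Closest point to p on triangle abc (assumes a non-degenerate triangle)."""
    ab, ac, ap = sub(b, a), sub(c, a), sub(p, a)
    d1, d2 = dot(ab, ap), dot(ac, ap)
    if d1 <= 0 and d2 <= 0:
        return a                                  # vertex region a
    bp = sub(p, b)
    d3, d4 = dot(ab, bp), dot(ac, bp)
    if d3 >= 0 and d4 <= d3:
        return b                                  # vertex region b
    vc = d1 * d4 - d3 * d2
    if vc <= 0 and d1 >= 0 and d3 <= 0:
        return add(a, scale(ab, d1 / (d1 - d3)))  # edge ab
    cp = sub(p, c)
    d5, d6 = dot(ab, cp), dot(ac, cp)
    if d6 >= 0 and d5 <= d6:
        return c                                  # vertex region c
    vb = d5 * d2 - d1 * d6
    if vb <= 0 and d2 >= 0 and d6 <= 0:
        return add(a, scale(ac, d2 / (d2 - d6)))  # edge ac
    va = d3 * d6 - d5 * d4
    if va <= 0 and (d4 - d3) >= 0 and (d5 - d6) >= 0:
        w = (d4 - d3) / ((d4 - d3) + (d5 - d6))
        return add(b, scale(sub(c, b), w))        # edge bc
    denom = 1.0 / (va + vb + vc)                  # interior of the face
    return add(a, add(scale(ab, vb * denom), scale(ac, vc * denom)))


def nearest_face_color(surfaces: List[Surface], p: Vec3) -> Optional[Tuple[Vec3, float]]:
    """Return (albedo, distance) for the face closest to p, or None if no faces."""
    best: Optional[Tuple[Vec3, float]] = None
    for albedo, verts, faces in surfaces:
        for i, j, k in faces:
            q = closest_point_on_triangle(p, verts[i], verts[j], verts[k])
            diff = sub(p, q)
            dist = math.sqrt(dot(diff, diff))
            if best is None or dist < best[1]:
                best = (albedo, dist)
    return best
```

Returning the distance along with the color makes it possible to reject matches that are too far from the visual mesh to be trusted.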