Loving Godot, but I have a requirement that currently seems somewhere between difficult and impossible:

Rendering a 3D Stereoscopic 360 degree view of the scene.

Cubemap

We can do a monocular projection by fixing six square 90-degree cameras, one for each axis, rendering each into a texture, then using a shader loaded with those six source textures to figure out which pixel should be mapped where in the final output.

That's the approach taken by Cykrios with his Godot 360 system, which emulates many other kinds of lens too. Amazingly fast considering all the intermediate steps, and with no obvious loss of detail.
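The heart of that trick is just turning each output pixel's equirectangular coordinate into a view direction and then sampling whichever of the six face textures that direction points into. A minimal sketch of the mapping, written as GDScript rather than shader code (the function name is mine, not anything from Godot 360):

    # Map an equirectangular UV (0..1 on both axes) to a world-space view direction.
    # A cubemap-to-equirect shader does this lookup once per output pixel.
    func equirect_dir(uv: Vector2) -> Vector3:
        var yaw := (uv.x - 0.5) * TAU    # full 360 degrees around the vertical axis
        var pitch := (0.5 - uv.y) * PI   # +90 degrees at the top row, -90 at the bottom
        return Vector3(
            cos(pitch) * sin(yaw),
            sin(pitch),
            -cos(pitch) * cos(yaw)       # -Z is forward in Godot
        )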

You might think that you could just do two of those: left eye on top, right eye on the bottom, bish bosh, there's your stereo 3D map.

But no.

As explained very well in this amazing page by shubhmehta, what you'll get if you do that is a view from two eyes swiveling in their sockets:

eye-socket rotation

when what we need is a map from a rotating head carrying two eye-cams:

head-rotation

When you look backwards, your eyes are the wrong way round! When you look left, your right eye is pointed directly at your left eye.

We definitely need a different kind of projection than just rotation about the origin; it also needs a position translation that's different for each column of pixels in the output image.
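Put another way: each output column has its own yaw angle, and the ray origin for that column is offset sideways by half the pupil distance, rotated along with that yaw. A rough sketch of the per-column ray (all names are mine; ipd is the eye separation, eye is -1 for the left eye and +1 for the right):

    # Origin and direction of the view ray for one output column of a
    # head-turning stereo panorama. Illustrative only.
    func column_ray(column: int, width: int, ipd: float, eye: float) -> Array:
        var yaw := (float(column) / width - 0.5) * TAU
        var forward := Vector3(sin(yaw), 0.0, -cos(yaw))  # where this column looks
        var right := forward.cross(Vector3.UP)            # baseline between the two eyes
        var origin := right * eye * ipd * 0.5             # the translation that changes per column
        return [origin, forward]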

Monstrous Camera Rig

Okay, we can think of extending the method though. Instead of six cameras, each pointed along one axis, we can have one camera for every vertical column of pixels on the output image.

Four thousand and ninety-six cameras.

Well, for each eye. So... Eight thousand one hundred and ninety-two cameras!

Well, actually you can't cover 180 degrees with one camera, so each column will need two, one pointed up at 45 degrees and one down. So... Sixteen thousand three hundred and eighty-four cameras!

Monstrous!

It can be built, though. I built a monoscopic one, a 4096-pixel-wide image filled from 2048 cameras, and it doesn't even crash Godot.
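The rig-building script was along these lines; a trimmed-down sketch rather than the exact code I ran, with one 1-pixel-wide SubViewport and Camera3D per column, each yawed to its column angle and offset by half the eye distance:

    const WIDTH := 2048
    const HEIGHT := 1024
    const IPD := 0.064

    func build_rig(eye: float) -> void:  # eye: -1.0 for left, +1.0 for right
        for column in WIDTH:
            var yaw := (float(column) / WIDTH - 0.5) * TAU
            var vp := SubViewport.new()
            vp.size = Vector2i(1, HEIGHT)
            vp.render_target_update_mode = SubViewport.UPDATE_ALWAYS
            add_child(vp)
            var cam := Camera3D.new()
            cam.fov = 90.0                          # one of the two cameras each column needs
            cam.keep_aspect = Camera3D.KEEP_HEIGHT  # fov sets the strip's vertical angle
            vp.add_child(cam)
            cam.rotation = Vector3(0.0, -yaw, 0.0)
            cam.position = Vector3(cos(yaw), 0.0, sin(yaw)) * eye * IPD * 0.5

(For the monoscopic test the per-eye offset is just zero.)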

It takes about five minutes for the script to add them all though, so you really do think it's crashed. And then it's about half a minute per frame.

For a 1/16th size model.

Too slow by far. I don't mind a minute per frame in
a batch-render but this is not gonna come close to that,
even with a nearly empty scene.

And also the resulting render is only good for maybe half the frame. The rest is just streaks; memory issues, I guess?

So that ain't gonna work.

Would have been surprised if it did.

Edit Godot!?

I looked at the Godot source-code.
It builds pretty easily.

Camera projection code seems to be mostly in

scene/3d/camera_3d.cpp

And there's a function there called project_ray_origin
that seems like it could maybe be altered to give
that 360 view?

Though that'd still just be a socket-spinning projection
through 360 degrees, not a head-turning projection.

I changed all these files to try and invent a new
camera projection mode PROJECTION_PANORAMA, just as
a copy/paste from the "perspective" functions for now,
to maybe do some experiments on it at least.

	modified:   core/math/projection.cpp
	modified:   core/math/projection.h
	modified:   doc/classes/Camera3D.xml
	modified:   doc/classes/RenderingServer.xml
	modified:   editor/plugins/gizmos/camera_3d_gizmo_plugin.cpp
	modified:   editor/plugins/node_3d_editor_plugin.cpp
	modified:   scene/3d/camera_3d.cpp
	modified:   scene/3d/camera_3d.h
	modified:   scene/main/viewport.h
	modified:   servers/rendering/renderer_scene_cull.cpp
	modified:   servers/rendering/renderer_scene_cull.h
	modified:   servers/rendering/rendering_method.h
	modified:   servers/rendering/rendering_server_default.h
	modified:   servers/rendering_server.cpp
	modified:   servers/rendering_server.h

And while the new projection mode's name does show up in the editor's panel for the camera, selecting it makes no actual change.

I guess I missed some place that needs to know about
the extended range of the ProjectionType enum?

I'm very lost in a very big codebase, and it seems like there's a good chance it'd need more than just changes to the project_ray functions anyway, because the camera origin needs to be translated differently for each screen X, as well as rotated.

So I'm not even sure it'd be possible even if I got it to notice my new projection mode.

Forks

@BastiaanOlij off of this forum right here did a few posts in 2016 about the topic, with code for V3, but it's not clear whether that really went anywhere, and I can't tell if their fork still has any changes like that or what else it might have.

I saw someone with a system to add a 4x4 transform matrix to the camera's projection properties, but I don't know enough to tell whether the kind of mapping needed will fit into a single transform matrix or not.

Or indeed if that code is ever going to make it into main?

Questions

Is there anyone working on enabling that kind of
camera projection with a plugin or a patch or
something?

If not, is there somewhere I can chip in on a bounty?

If not, is there anyone who can at least reassure
me it's actually possible to do if I keep bashing
my head against the Godot code trying to add a
new projection mode myself?

Or if I just chill for a few months is it coming
in like version 4.5 anyway?

Or will that 4x4 transform matrix actually let you do it?
I suspect it'd have to be a different matrix for
each pixel column but I don't really understand
how the matrix multiplication magic works.

Image

Top-left: Desired stereo projection
Top-right: Monocular 360 via Cykrios method
Bottom-left: Monster-Rig, 10 columns
Bottom-right: Monster-Rig, 2048 columns

    pre If you're rendering a single image why do you need a realtime game engine? Maybe use a raytracer instead.

      xyz I am not rendering a single image; I am rendering thousands and thousands of them as the output from my VR project, to enable editing those animations in VR.

        pre Same thing. If the output is not interactive, a raytracer is the way to go, as you can cast/project rays per pixel in whatever panorama projection you wish. With the 4x4 matrix-based projections that realtime engines do, you are limited to linear projections that can only vary per vertex, not per pixel, and are typically constant for the whole frame. So the only way to do it will be by patching together a number of such projections, which is what the cubemap does, as well as your monster approach. The difference is only in the number of patches.

        pre On second thought, you could make each such patch the size of a pixel, and still render it with a small number of cameras by sending the per-pixel projection matrix to the shader via 4 pre-rendered textures. This may even run in realtime.

        ^ Forget it. Not going to work. It'll have to be raytracing.

          xyz I mean it won't be; if I can't make it work in Godot then I'd just go back to Unity, where it's working fine.

            pre Which way did you do it in Unity? The multitude of cameras? Whatever you did in Unity can probably be replicated in Godot, but if you already have it set up in Unity and you're happy with it, why move to Godot? Just produce what you need there.

            However, raytracing is the optimal approach for solving this problem in general, imho.

            Looking through the Unity code that I got from somewhere too long ago to remember.

            Looks like it's basically just calling Camera.RenderToCubemap twice, then doing a convert to equirect.

            Confusing. Perhaps it can indeed just be done with 12 squares somehow?

            Though they do pass the render function a MonoOrStereoscopicEye parameter. So I guess that makes the cubemap renderer head-shape aware?

            Can't look at the Unity source code and see what it really does I guess.

              pre Looks like it's basically just calling Camera.RenderToCubemap twice, then doing a convert to equirect.

              Then it does precisely what you said you don't want to do in your initial post, i.e. it rotates each eyeball around its center. So, are the results from this satisfactory in terms of how the stereoscopy looks or not?

                xyz I really think passing in that parameter as StereoscopicEye.Right or StereoscopicEye.Left affects what the render does, so that it's not just projected from the origin the way it is if you pass StereoscopicEye.Mono.

                But I won't have access to test with it till the weekend really.

                Seems like it is indeed going to take some hacking on Godot's source-code which is very likely beyond my immediate skills.

                Wonder if I'd get a bite if I posted a month's wages on Replit Bounties.

                  I might be wrong, but I'm pretty sure what you're asking for is logically impossible.
                  In the second gif in your original post, you can see that the position of the cameras changes when you rotate. Because of that, the image that they see is different, so you can't combine those images.
                  Here's a quick example of what I mean:

                  In the left image, the green point will be obstructed by the red point, but once the head turns (the right image), the green point will no longer be obstructed.
                  You would need to have a set of images for each head angle. Maybe you could get away with a finite number of them and do some blending/interpolation, but I'm not sure how well it would work.

                    pre It probably renders each cubemap image as a regular stereo image. The problem is in stitching them together. Because of the slightly different camera position in each cubemap face (if we assume it rotates both cameras around the same pivot), the faces won't fit together seamlessly. Maybe it just does some edge blending when sampling the cubemap to make the final equirectangular projection. But it's hard to tell without seeing the actual results.

                    pre Seems like it is indeed going to take some hacking on Godot's source-code

                    Why? What could you possibly hack in that you can't get from the regular build?

                    I found this article, which describes a one-pixel-columns approach in Unity. This can be done in Godot without source code interventions.
                    But this is still an approximation compared to what a raytracer would produce. So again: spare yourself the trouble and do it the easy way with a raytracer. Use the right tool for the job.

                    Also why do you need environment maps? If you already have a scene that can be rendered by a game engine, just run it in real time.

                    LoipesMas You would need to have a set of images for each head angle. Maybe you could get away with a finite number of them and do some blending/interpolation, but I'm not sure how well it would work.

                    Yeah that's exactly what the OP is proposing. It could be done with a programmable ray tracer because you can alter the view ray origin for each rendered pixel. With realtime rendering your rays are pretty much set in stone for the entire rendering pass. So you'll need to stitch a large number of small images, each rendered from a slightly different camera position/angle. Ideally, those images should be only one pixel in size.

                    Looks like vray already has a built-in option to render stereo cubemaps. Not sure if Blender's Cycles could be harnessed to do it without too much hassle.

                      xyz Interesting article. Could maybe do something like that in Godot if all else fails. Though presumably a C++ function built into the engine, like Unity has there, would be faster.

                      The project is a VR project. How are you gonna do VR in a ray-tracer at 90 frames a second?

                      Then when things are edited, the user may render to a 360 stereo video for upload to a VR video-hosting platform; that's what the output-to-cubemap function is for.

                      I guess output to FBX for import into Blender could be the proposal here? But I don't think my target users are going to be able to do things like that.

                      We'll know more about what the Unity render is doing when I can experiment with it more tomorrow.

                        pre The project is a VR project. How are you gonna do VR in a ray-tracer at 90 frames a second?

                        Ok. So you need it to run in realtime and capture it into 360 stereo video? You should have mentioned that in the first post. Then the pixel-columns approach probably won't be fast enough, as your rendering shaders would need to process the whole scene's vertex data as many times per frame as there are columns.

                        The likely bottleneck here is on the GPU side, so doing it in native code wouldn't make much difference. The whole feature may not really be feasible in realtime, except maybe for very light (vertex-wise) scenes. But I could be wrong in this estimate. Go and implement it using viewports, shaders and GDScript, then profile it and see where the bottlenecks are. If they are indeed on the CPU side, then you can think of porting what you have into native code. On the other hand, if they are on the GPU side, there's not much you can do about it other than try a different approach.

                        Do you know if somebody else managed to implement this in a project or a commercial product?

                        I need the editing to run in real-time, but the export to 3d-video doesn't need to be real-time.

                        I am not aware of anyone trying to do what I am trying to do, but what I had been doing in Unity, until a couple of months ago when I decided to try Godot, was working. It has made all the movies at starshipsd.com, essentially by acting out the parts in VR, editing the results, and then rendering to 360/3D video when the export button is pressed.

                          pre I need the editing to run in real-time, but the export to 3d-video doesn't need to be real-time.

                          Then it should be doable.

                            LoipesMas

                            You're technically correct. However, in practice it turns out that the blend the OP is proposing (for each angle, render only one vertical slice directly ahead of the camera) works well enough. It's what the Cardboard Camera app does, and in fact this is how all 3D 360 images work today.

                            So here's the difference between Unity rendering an equirectangular output for Mono vs Left-Eye.
                            Mono at the bottom.

                            You can see that the edging on the floor gets closer to the camera in the Left version, presumably because the origin is offset by my exaggerated 0.5m pupil-distance.

                            The technique in that article seems pretty similar to the monstrous system I described, but instead of setting up eight thousand cameras they move one camera around and call a render() function for it in each position. Which I guess would be easier on memory.

                            Neither Godot's Camera3D nor its Viewport class seems to have a render function. But you could always just do one pixel column per game frame or something, I guess. Spin the cameras and fill in the texture. Maybe the frame rate can get really high if you're only rendering a 1x4096-pixel frame.
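                            Something along these lines, maybe; an untested sketch with names of my own, assuming a single 1-pixel-wide SubViewport (set to update every frame) with one camera inside it:

                                func capture_eye(vp: SubViewport, cam: Camera3D, eye: float, path: String) -> void:
                                    var width := 4096
                                    var height := vp.size.y
                                    var ipd := 0.064
                                    var output := Image.create(width, height, false, Image.FORMAT_RGBA8)
                                    for column in width:
                                        var yaw := (float(column) / width - 0.5) * TAU
                                        cam.rotation = Vector3(0.0, -yaw, 0.0)
                                        cam.position = Vector3(cos(yaw), 0.0, sin(yaw)) * eye * ipd * 0.5
                                        await RenderingServer.frame_post_draw         # let this column render
                                        var strip := vp.get_texture().get_image()     # a 1 x height image
                                        strip.convert(Image.FORMAT_RGBA8)
                                        output.blit_rect(strip, Rect2i(0, 0, 1, height), Vector2i(column, 0))
                                    output.save_png(path)

                            At one column per frame that's 4096 game frames per exported video frame per eye, but since the export doesn't have to be real-time that might be tolerable.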
