LODding Terrain for 3D in Godot

Pixophir · Jun 23, 2022

Yawn, another terrain thing ? Don't we have enough of those ? And aren't these days even over ?

Well, probably. But then again, simulation games have a certain appeal and some guys need their hobbies. So, I give it a try. I have made an OpenGL copycat of CDLOD which allows rendering of terrain heightmaps with high performance.

Will show off more soon(tm) once I have a Vulkan version.

Megalomaniak · Jun 23, 2022

Actually a robust and flexible terrain plugin would be great to have!

Pixophir · Jun 23, 2022

Megalomaniak

That's motivating, thanks ! But takes time.

Replying to your comment in the welcome thread:

The engine is probably not the problem, but the GPU is (nerfed fp64 performance on consumer-grade graphics cards). Afaik we can have multiple Vulkan instances, or at least devices, so the terrain part will certainly do its own thing and interface somehow with the engine.
After the geometry stage, when the camera position has been subtracted, things can go their normal way down the pipeline. So my rough plan is to do a module that brings its own environment, which may go as far as also doing very fast frustum checks for the render selection of terrain nodes, maybe even collision detection, at least for medium and large range, or at high speed when there's a long swept volume. You know, the guys always want to go at ludicrous speed (and brake for no one) ;-)

Need to define more clearly what the interface could look like once I know more details.

Pixophir · Jun 27, 2022

Ok, can show off preliminary results, and a description of how things work. Would very much like to get a realistic estimate of whether this can be reasonably well integrated into the Godot structure.
A disclaimer: this is not my own invention ! Credit must be given to the inventor F. Strugar (see first post), who was even kind enough to offer a reference implementation under the MIT license. Here's the theory.

Won't repeat the paper, so just a superficial (how fitting :-)) overview: based on a heightmap image (e.g. 16 bit grayscale png for low storage usage) a quad tree is built, with each level being half the size of the previous one. Each node of the quadtree, though, has the same size in terms of triangles to render. This way, a single square mesh (in my case limited to a power of two along each side, reason for this comes later down the line) can be used to render each of the nodes. The transition between positions is calculated over an area, so that positions move gradually between lod levels.

Image data from SRTM V3 90m spatial resolution, converted from ASCII to png, just to have something

Each frame, a selection of nodes from the quad tree to render this frame is being made. The selection only depends on the camera position and the distance to the node in question. Thus, the transition areas move with the camera. Selection is done by a fast frustum check against a bounding box of the node in question.
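A minimal sketch of the distance part of such a per-frame selection, in C++17. The types `Vec3`, `Node`, `Selected` and the function names are hypothetical and not the author's actual code; the frustum test and the cap on selected nodes are left out to keep it short, and `ranges[l]` is assumed to hold the maximum camera distance at which level `l` is still used.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical minimal types for illustration only.
struct Vec3 { float x, y, z; };

struct Node {
    float min_x, min_z, size;   // square footprint in the xz-plane
    float min_y, max_y;         // height bounds of the covered terrain
    int level;                  // 0 = finest level
    const Node* children[4];    // all nullptr for a leaf
};

struct Selected { const Node* node; int level; };

// Squared distance from point p to the node's axis-aligned bounding box.
inline float sq_distance_to_box(const Vec3& p, const Node& n) {
    const float cx = std::clamp(p.x, n.min_x, n.min_x + n.size);
    const float cy = std::clamp(p.y, n.min_y, n.max_y);
    const float cz = std::clamp(p.z, n.min_z, n.min_z + n.size);
    const float dx = p.x - cx, dy = p.y - cy, dz = p.z - cz;
    return dx * dx + dy * dy + dz * dz;
}

// Returns true if the node's area ended up in 'out' (possibly via its children).
// A frustum test against the same bounding box would go right after the range test.
inline bool select_nodes(const Node& n, const Vec3& cam,
                         const std::vector<float>& ranges,
                         std::vector<Selected>& out) {
    const float d2 = sq_distance_to_box(cam, n);
    const float r = ranges[n.level];
    if (d2 > r * r) return false;                 // too far away for this level

    const bool leaf = n.children[0] == nullptr;
    const bool finer_in_range =
        n.level > 0 && d2 <= ranges[n.level - 1] * ranges[n.level - 1];
    if (leaf || !finer_in_range) {                // draw the node at its own level
        out.push_back({&n, n.level});
        return true;
    }
    for (const Node* c : n.children)
        if (!select_nodes(*c, cam, ranges, out))
            out.push_back({c, n.level});          // child area drawn at parent resolution
    return true;
}
```

Because the ranges are measured from the camera, the selected set (and with it the transition areas) moves along as the camera moves.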

All data necessary for rendering is calculated during loading of the heightmap image. In the example, a 2048*2048 png, this takes around half a second. A larger heightmap (8 or 16k) takes several seconds, and the resulting data structure has considerable size (several MB), so this will certainly have an impact on overall performance then because it fits in no cache any more. But smaller sizes (2 or 4k) can easily be rendered on less powerful devices than my PC. Frame rate single threaded, not overclocked, AMD 5800X, Radeon RX 6700XT.

At the moment it is still rather buggy. Once it runs sufficiently well I will put it on GitHub.

Most of the parameters can theoretically be changed on the fly, maybe requiring rebuilding of the data structure. This at least for a limited size of the landscape, think a bunch of worlds of floating discs. But even a planet sized object could hypothetically be represented, with the help of a large data structure (e.g. spherical/ellipsoid cube map with six quad trees) and intense streaming.

Parameters:
static const int NUMBER_OF_LOD_LEVELS{ 5 };
Self-explanatory, the depth of the quadtree.

static const int MAX_NUMBER_SELECTED_NODES{1024};
The selection should be limited to a size that can be handled easily during a frame without busting the stack. This should be balanced against quad tree node sizes, resolution, and the resulting number of render calls.

static const int LEAF_NODE_SIZE{64};
The size of a leaf node. Together with the size of the heightmap and the number of LOD levels this determines the size in memory of the quad tree, and the available space for LOD transitions. Shouldn't be too small or popping artefacts appear and the data structure gets really big.

static const omath::uvec2 TILE_SIZE{2048,2048};
The size of a heightmap image.

static const float LOD_LEVEL_DISTANCE_RATIO{2.5f};
The space between LOD levels gets greater with the distance from the camera. In order to keep the number of triangles to render roughly equal, this is a tweak to shift 'density' more to the foreground, or to the background.

static const float MORPH_START_RATIO{ 0.7f };
Morphing, in Strugar's terms, refers to the gradual transition between LOD levels. With this value, 70% of a node's range stays at its own resolution and the remaining 30% is used for 'morphing' between positions.

static const int RENDER_GRID_RESULUTION_MULT{1};
Simply a multiplier to increase resolution on large or spacey heightmaps.

static const int GRIDMESH_DIMENSION{ LEAF_NODE_SIZE * RENDER_GRID_RESULUTION_MULT };
Clear, right ?

static const bool SHADOW_MAP_HIGH_QUALITY{false};
static const int SHADOW_MAP_RESOLUTION{SHADOW_MAP_HIGH_QUALITY ? 4096 : 2048 };
Ideally, a shadow map's cascades follow the LOD level transitions. I have not implemented that yet.

Sorry, must cut and run now, love to hear from you what you think about the feasibility. A lot of conceptual things are still missing, like how to render it nicely, and how to do physics with it.

cybereality · Jun 27, 2022

That's pretty nice.

Pixophir · Jun 27, 2022

Come on, we can speak freely. Know what valgrind just told me ? "Go fix your program !"
Was an uninitialized pointer in the selection loop, called many many times per frame (rolls eyes)

Seriously, all remarks welcome if it's worth going on with. I can stand it. You probably know the technique. Can be extended to real 3D as well.

Pixophir · Aug 11, 2022

Because of multiple requests :-) there shall be wings ... err caves. Well, evolution has wings for cave-dwellers too :-)

Certainly I will need tunnels, too. But I must see how this all fits into the Godot structure. A cave or any building is probably a scene change anyway, so there will be some transition between what's inside and what's outside. Will also have to see how to integrate that thing into Godot.

I haven't said so yet, but my goal is continent or small planet size terrain rendering. The renderer, as planned, needs many posts to render its levels, as a level has a stable range where detail is fixed, a transitional area where positions are morphed to the next level, and then the next level with half the resolution and so on. So, it really shines when there are huge open spaces to fly along or travel through.

Of course, the same technique could be used for a small terrain, a valley maybe, and a high density of posts.

Tomcat · Aug 11, 2022

Pixophir A cave or any building is probably a scene change anyway, so there will be some transition between what's inside and what's outside.

Would it help?

Pixophir · Aug 11, 2022

Certainly any help or suggestions are welcome ! But I can't guarantee anything

Haystack · Aug 24, 2022

This is my jam. Nicely done.

Pixophir · Aug 27, 2022

Currently:

The transition between two lod levels is done continuously, and if there isn't enough depth available then cracks show up. This is either because the visibility distance is too small, or because there are not enough raster posts available to perform the transition.

Take a look at the screen shot, you can see a lod level transition right in the front where two grid lines from the lowest level merge to the next highest. This goes over >10 posts, enough for a smooth, gradual transition, on a patch of terrain of 4096 by 4096. That was an old version.

Extreme case, assume a raster size of 8 by 8 and three lod levels. No spatial data structure and selection algorithm needed for such a simple thing. Let's say for simplicity the raster size coincides with world distance units, so each post is a meter from the next. Of course in the end these will be distinct things, with conversions in between, trivial on a flat world, challenging on a sphere or even ellipsoid. But back to the example. It is easy to show that 3 lod levels will have a hard time.

The first (highest, the most far away) level would go to a little less than a half of the distance (4 posts), the second (nearer) to a little less than a third. Even if we allow the whole depth of a level to be used for continuous lodding, between the second and the third (closest) there'd only be 1 post or two or even less to perform the actual lodding, which results in cracks in some cases. We could (and actually do) apply a multiplier for the grid, to insert posts linearly between two adjacent posts, but that doesn't solve the problem, and looks strange on a very structured terrain.

Solutions coming up :-)

In the end, world distance units and raster posts are separate things. In the first version one can set a linear factor in x and z to scale the raster to world distances. Like, you have a post every meter, or every ten meters, or so. Configuration must be such that the lodding can actually be performed, or at least a warning is shown when too many levels are being squeezed into too little depth of view. Eventually, camera view distance must be separated from lodding distance, though the lodding is performed based on camera view distance alone. So that sounds like a contradiction.
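
One way such a configuration warning could be derived is sketched below. Everything here is an assumption for illustration: the function name `first_starved_level`, the idea that post spacing doubles with each coarser level, and the rule that a level's morph zone is the last `(1 - morph_start_ratio)` share of its depth.

```cpp
#include <cstddef>
#include <vector>

// Returns the first level whose morph zone spans fewer than 'min_posts' posts,
// or -1 if every level has enough room for a gradual transition.
// ranges[l]     : far end of level l in world units (level 0 is the finest)
// post_spacing  : world units between adjacent posts at level 0
inline int first_starved_level(const std::vector<float>& ranges,
                               float post_spacing,
                               float morph_start_ratio,
                               float min_posts) {
    float level_start = 0.0f;
    float spacing = post_spacing;
    for (std::size_t l = 0; l < ranges.size(); ++l) {
        const float zone = (ranges[l] - level_start) * (1.0f - morph_start_ratio);
        if (zone / spacing < min_posts) return static_cast<int>(l);
        level_start = ranges[l];
        spacing *= 2.0f;   // assumed: each coarser level has half the resolution
    }
    return -1;
}
```

For the 8 by 8 example with bands ending at 2, 4 and 8 meters, one post per meter and the whole band used for morphing, the second level already has only a single post to work with, so a minimum of two posts would trigger the warning there.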

I will certainly have to write a huge help document for this because there's more.

Pixophir · Sep 2, 2022

Question for the cracks: pointers or handles ?

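    struct node {
        unsigned int m_x;              // position of the node in the raster
        unsigned int m_z;
        unsigned int m_level;          // LOD level of the node
        unsigned int m_size;           // sanity check only
        omath::vec2 m_min_max_height;  // sanity check only

        node *m_tl{nullptr};           // children: top left, top right,
        node *m_tr{nullptr};           // bottom left, bottom right
        node *m_bl{nullptr};
        node *m_br{nullptr};
    };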

So this is the current node struct I use to build the quad tree for my terrain renderer. X, z and level are vital, size, min and max values are there for sanity checks, will be eliminated.

So that leaves me with 44 bytes for a node. Not much, but with a huge terrain it can still be unwieldy, possibly gigabytes of memory for the quad tree of a continent sized terrain (example: 1.4 GB for a terrain of size 1048576 by 1048576, leaf node size 256*256, 7 lod levels, nodes have 60 bytes with size and min/max values).
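
A back-of-the-envelope check of that order of magnitude, as a small C++14 sketch. It assumes (this is not stated in the post) that the finest level tiles the raster with nodes of the leaf size and that each coarser level has half as many nodes per axis; `quadtree_node_count` is a made-up helper name.

```cpp
#include <cstdint>

// Total number of quad tree nodes over all levels, under the assumptions above.
constexpr std::uint64_t quadtree_node_count(std::uint64_t raster_size,
                                            std::uint64_t leaf_size,
                                            int levels) {
    std::uint64_t per_axis = raster_size / leaf_size;  // nodes per side, finest level
    std::uint64_t total = 0;
    for (int l = 0; l < levels && per_axis > 0; ++l) {
        total += per_axis * per_axis;
        per_axis /= 2;                                 // next coarser level
    }
    return total;
}

// Memory estimate in bytes for a given node size.
constexpr std::uint64_t quadtree_bytes(std::uint64_t raster_size,
                                       std::uint64_t leaf_size,
                                       int levels,
                                       std::uint64_t bytes_per_node) {
    return quadtree_node_count(raster_size, leaf_size, levels) * bytes_per_node;
}
```

Under these assumptions, `quadtree_node_count(1048576, 256, 7)` gives 22,368,256 nodes, and at 60 bytes each that is about 1.34 billion bytes, the same ballpark as the figure above. Almost all of it sits in the finest level (16,777,216 nodes).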

The biggest block is the four pointers. If I reduced them to 32 bit int values, that would reduce the structure size (and thus improve performance) considerably and I may end up with "just" 650 MB for the same structure.

I stumbled upon this:
https://floooh.github.io/2018/06/17/handles-vs-pointers.html

and several discussions around the topic.

My question asks for an opinion (inadequate for stackexchange): has anyone actually done a project with handles, like written a class "handle" based on 32 bit integers that replicates the behaviour of classic raw pointers, and did it have the hoped-for impact on performance ?
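
For reference, a minimal sketch of what the handle variant could look like, along the lines of the linked article: nodes live in one contiguous array and children are referenced by 32 bit indices. `NodeHandle`, `HandleNode` and `NodePool` are hypothetical names, and whether this actually improves performance would have to be measured.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using NodeHandle = std::uint32_t;
constexpr NodeHandle INVALID_NODE = 0xFFFFFFFFu;   // plays the role of nullptr

// Same vital fields as the pointer-based node, children stored as indices.
struct HandleNode {
    std::uint32_t x, z, level;
    NodeHandle tl{INVALID_NODE};
    NodeHandle tr{INVALID_NODE};
    NodeHandle bl{INVALID_NODE};
    NodeHandle br{INVALID_NODE};
};
static_assert(sizeof(HandleNode) == 28, "seven 32 bit fields, no padding");

// Owns all nodes in one array; a handle is simply an index into it.
class NodePool {
public:
    NodeHandle create(std::uint32_t x, std::uint32_t z, std::uint32_t level) {
        nodes_.push_back(HandleNode{x, z, level});
        return static_cast<NodeHandle>(nodes_.size() - 1);
    }
    // References may be invalidated by a later create(); handles stay valid.
    HandleNode& get(NodeHandle h) { return nodes_[h]; }
    const HandleNode& get(NodeHandle h) const { return nodes_[h]; }
    std::size_t size() const { return nodes_.size(); }

private:
    std::vector<HandleNode> nodes_;
};
```

Usage is `pool.get(root).tl = pool.create(x, z, level);` instead of assigning a raw pointer. On a typical 64 bit platform the pointer version of the same node (three ints plus four pointers) comes out at 48 bytes including padding, against 28 bytes here.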

Megalomaniak · Sep 2, 2022

Is/does this all have to be loaded in memory all at once or could some lazy/JIT loading be done? Basically runtime streaming of the data into/out-of memory when needed? Memory might be relatively cheap these days but disk space is cheaper still I would think. I would imagine this would likely have to involve some multiprocessing/threading for all I know.

Note: not a programmer, a "script kiddie" at best far as I myself am concerned. Not to say that what you are asking about is not a worthwhile optimization/consideration. I just can't offer much(well, anything) in that regard.

Pixophir · Sep 2, 2022

Thanks, no worries.

The spatial data structure (quad tree) is an element distinct from the data, used for selection of the visible parts without defining what they actually contain. It is in this case and for real time best kept in memory for optimal performance, while the data at the patches selected for display, height map and other textures, other surface elements, models of things, can be streamed in and out as needed. It doesn't take long to build the quad tree (around 2 seconds for the really big one), which doesn't beg parallelization as it only happens once, and couldn't be done lock free.

So it is best to keep that as small as possible. The question "handles or pointers" arises with really large terrains. Such environments may be nice to behold but would be pretty dull when not filled with things. So I asked myself is it worth the effort.

Maybe I wait 'till the first reactions.

GooGooGaGa · Sep 2, 2022

Having a Vulkan implementation of this is quite impressive ^^
Does this allow for non-planar terrain ? Like if I put 6 of these together, can I have a planet terrain based off a heightmap ?

Pixophir · Sep 2, 2022

It is not Vulkan (yet), I'm doing it in OpenGL 4.5 first. This isn't for mobiles anyway.

And yes, my plan is to expand this to a spheroid or even ellipsoid cube map with a quad tree per face. For this to be doable with reasonable effort I need double precision on the graphics card in the vertex stage, which I hope will be there when I am there :-)
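
The simplest form of such a cube map, for a sphere only, can be sketched on the CPU in double precision as below. This is an illustration under assumptions: the face orientation convention, the names `Face`, `Vec3d` and `cube_to_sphere`, and plain normalization as the projection are all choices made here, and an ellipsoid would need more than this.

```cpp
#include <cmath>

struct Vec3d { double x, y, z; };

enum class Face { PosX, NegX, PosY, NegY, PosZ, NegZ };

// Maps face coordinates u, v in [-1, 1] to a point on a sphere of the given
// radius, displaced outwards by the terrain height at that post.
inline Vec3d cube_to_sphere(Face face, double u, double v,
                            double radius, double height) {
    Vec3d p{};
    switch (face) {                       // point on the unit cube (assumed layout)
        case Face::PosX: p = { 1.0,  v,  -u }; break;
        case Face::NegX: p = {-1.0,  v,   u }; break;
        case Face::PosY: p = {  u, 1.0,  -v }; break;
        case Face::NegY: p = {  u, -1.0,  v }; break;
        case Face::PosZ: p = {  u,   v, 1.0 }; break;
        case Face::NegZ: p = { -u,   v, -1.0 }; break;
    }
    const double len = std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z);
    const double r = radius + height;     // push the point out to the surface
    return { p.x / len * r, p.y / len * r, p.z / len * r };
}
```

Each of the six faces would carry its own quad tree over (u, v), and the selection would work on the mapped positions.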

Just to make that clear, I will not do the data part because it is a bit too much for a part time hobbyist. If you want fractal or noisy or even terrain based on physical processes like erosion and tectonics you'd have to help out ;-)

I will ponder an interface to place models of things, buildings, caves, and scatterings like plants and rocks. Wouldn't make much sense without.

But first the basis, sorting out the parts to display and finding the visible surface.

Megalomaniak · Sep 2, 2022

Pixophir So it is best to keep that as small as possible. The question "handles or pointers" arises with really large terrains. Such environments may be nice to behold but would be pretty dull when not filled with things. So I asked myself is it worth the effort.

Well, in that case may I raise the question: if you were considering supporting really large terrains in a future version, say "2.0", can this be also handled then or is it better to deal with this now even if the actual/official support for really large^tm terrains would come at a later TBD date?

Pixophir · Sep 2, 2022

Size of the terrain is a bother when streaming data, filling the buffers for drawing, and wrapping them around, I think. So if you turn around too quickly you might realize, for a fleeting glimpse, that the Langoliers have already been at work.

But that, I fear, cannot be delayed much ... would be like selling a car and charging extra for ... oh, bad example

... will stay with pointers.

Pixophir · Sep 4, 2022

I may have a very early alpha by the end of the next week.

May I be so bold as to ask a colleague if they could lend me a hand and try to compile it under Windows ? I plan to push the source code to sourceforge.net. It is stand alone for now (no Godot extension yet), just flat shading and no streaming of data yet. I'd just be interested in hearing if it compiles on Windows, maybe with the MS compiler if that's what most people use.

Dependencies are GLFW3, ImGui, stb_image, libpng, glad and OpenGL development/header files. I can add glad, so that's one low hurdle less, but the rest came from the repositories of my OS (Debian Bookworm), so on Windows there may be some prior configuration efforts necessary.

I don't expect many problems beyond some GNU extensions to C++ (like sincos()), and GLFW should take care of the windowing/input handling in a transparent manner.
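
For the sincos() case, a portable stand-in is straightforward, since sincos() is a glibc extension and not part of standard C++. The wrapper name `sincos_portable` is made up for this sketch.

```cpp
#include <cmath>

// Computes sine and cosine of 'angle' (radians) using only standard functions.
// Compilers are free to fuse the two calls, but that is not guaranteed.
inline void sincos_portable(double angle, double& s, double& c) {
    s = std::sin(angle);
    c = std::cos(angle);
}
```

Calling this everywhere instead of sincos() removes that particular dependency on the GNU toolchain.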

Tomcat · Sep 4, 2022

Pixophir May I be so bold as to ask a colleague if they could lend me a hand and try to compile it under windows ?

What does it take to do that? In my whole life I have only compiled ArmorPaint from source.