Procedurally creating complex meshes without stutter

xyz Posts: 203 Member
edited October 6 in General Support

I have a scene containing a MeshInstance node. This node creates its own mesh data (via ArrayMesh) in its _enter_tree() callback. The mesh is complex and takes some time to build.

At some point I need to instantiate a lot of such scenes in a single frame. When I do so, the whole execution stutters, waiting for the meshes to create themselves. The stutter happens even if I instantiate a single such scene, but to a lesser degree. So spreading them out a bit in time is still not a good enough solution.

Is there a way to create meshes asynchronously in the background? I tried launching a low priority builder thread in the _enter_tree() callback but the stutter is still there. Any suggestions on how to handle this?

Answers

  • cybereality Posts: 2,089 Moderator

    I would guess Thread would be the answer, but I assume that was what you tried?
    https://docs.godotengine.org/en/stable/tutorials/threads/using_multiple_threads.html

    Maybe it's not creating the mesh but adding it to the tree that causes the slowdown (since this would happen on the main thread, even with the building on its own thread).

  • xyz Posts: 203 Member
    edited October 6

    Yeah, I'm using Thread. I'll play around some more to try to pinpoint the bottleneck precisely. But I don't see what else could cause it. If I don't call my mesh creation function the pause is minimal when a large number of scenes get instantiated at once. And that creation function does a lot of computationally expensive stuff.

  • Calinou Posts: 961 Admin, Godot Developer

    You could try changing the rendering server thread model to Multi-Threaded in the project settings, but this can introduce bugs as it's not fully reliable in 3.x.

  • xyz Posts: 203 Member

    @Calinou
    Ok. Tried to switch from Single-Safe to Multi-Threaded in Rendering->Threads->Thread Model. Project freezes upon start even if I don't use threads at all.

    Hmm, so threads are actually disabled by default? Do I have any other options to run this mesh generation "in background"?

  • xyz Posts: 203 Member

    So I tested some more and mesh generation is definitely the bottleneck. I don't see any way around it other than to use threads. And that's causing a crash. This considerably hinders Godot's ability to handle game concepts that include "streaming" procedural meshes. Hopefully, threads will be fully functional in 4.0.

    Can I do this in C++? Or does the thread problem persist there too?

  • Bimbam Posts: 115 Member

    I don't see any way around it other than to use threads. And that's causing a crash.

    Just to confirm, is it crashing or is it stuttering? Because the stutter I have witnessed also.
    Essentially whenever you create/start a thread there is micro stutter and is even present in the 'Background Thread Loading Demo'. Capturing this on camera is obviously a nightmare, so here's the debugger:

    [Image: micro-stutter as it appears in the debugger, 3 clicks]

    A tip I just found that the docs don't tell you can be found scouring github comments:

    While I am still very much trying to get my head around threading in Godot, my janky reimplementation of the BG loading demo using semaphores did result in observably less stutter:

    [Image: semaphore stutter blends in with normal usage, 4 clicks]

    The debugger's auto-scaling and lack of axis labels make it a bit hard to compare them directly, so I have tried to scale them comparably to illustrate this:

    Couldn't say if this is situational at this point. The github comments did mention this issue wasn't observed on Linux so could be platform specific (I'm a Win11 & WSL boi as I got sick of dual booting).
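
    A minimal sketch of the semaphore approach mentioned above, assuming Godot 3.x. The idea is to start one worker thread once and wake it with a Semaphore for each job, so the thread creation cost is not paid on every request. The names jobs, queue_job, _worker_loop and _do_job are hypothetical and not taken from the demo:

    ```gdscript
    extends Node

    var thread = Thread.new()
    var semaphore = Semaphore.new()
    var mutex = Mutex.new()
    var jobs = []      # pending job payloads, guarded by mutex
    var quit = false   # set on exit so the worker loop can return

    func _ready():
        # Start the thread once; later jobs only post to the semaphore.
        thread.start(self, "_worker_loop")

    # Call from the main thread to hand work to the worker.
    func queue_job(job):
        mutex.lock()
        jobs.push_back(job)
        mutex.unlock()
        semaphore.post()  # wake the worker

    # Runs on the worker thread (Godot 3.x passes one userdata argument).
    func _worker_loop(_userdata):
        while true:
            semaphore.wait()  # sleeps until post() is called
            mutex.lock()
            var should_quit = quit
            var job = jobs.pop_front() if not jobs.empty() else null
            mutex.unlock()
            if should_quit:
                return
            if job != null:
                _do_job(job)

    func _do_job(job):
        # Placeholder for the expensive work (hypothetical).
        pass

    func _exit_tree():
        mutex.lock()
        quit = true
        mutex.unlock()
        semaphore.post()          # wake the worker so it can see quit
        thread.wait_to_finish()   # join before the node is freed
    ```

    Whether this removes the stutter on a given platform is not something the sketch can promise; it only avoids repeated thread creation.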

  • Calinou Posts: 961 Admin, Godot Developer
    edited October 9

    The github comments did mention this issue wasn't observed on Linux so could be platform specific (I'm a Win11 & WSL boi as I got sick of dual booting).

    Indeed, creating threads is much more expensive on Windows than it is on UNIX-like platforms.

    PS: I opened an issue on the documentation repository to track this: https://github.com/godotengine/godot-docs/issues/5310
    Feel free to open a pull request if you'd like :)

  • xyz Posts: 203 Member
    edited October 9

    @Bimbam said:

    I don't see any way around it other than to use threads. And that's causing a crash.


    Just to confirm, is it crashing or is it stuttering? Because the stutter I have witnessed also.
    Essentially whenever you create/start a thread there is micro stutter ...

    No, my stutter is not related to threads, but the crash is. The stutter is caused (I suppose) by too much calculation (mesh generation via ArrayMesh) in the frame. I tried to remedy that by delegating this calculation to threads. However it all still behaved like I was calling normal functions instead of threads. Then Calinou suggested I need to enable visual server multi-threading in project settings. After I did that, the project crashes on startup. I didn't inspect further but it must be threads causing it, as they're called on startup too.

    So now instead of one problem I ended up with two :)

    I still don't know how to handle a lot of calculation at once without freezing. I thought threads would help but they crash in my case. The only other approach I see is bruteforcing it in native code...

    Here's a snapshot from the profiler. The massive spike I want to eliminate happens when I generate meshes. No threads. The more meshes I generate, the larger the spike. All I want to do is decouple this calculation from drawing.

  • cybereality Posts: 2,089 Moderator

    What happens if you create a new project and enable threading? Does it still crash?

  • xyz Posts: 203 Member

    @cybereality said:
    What happens if you create a new project and enable threading? Does it still crash?

    Works as expected. Even if rendering multi-threading is not enabled:

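    extends MeshInstance

    var t
    var running = true  # cleared on exit so the worker loop can end

    # Runs on the worker thread; "custom" is the userdata passed to start().
    func workerThread(custom):
        while running:
            print("working for ", custom)

    func _ready():
        t = Thread.new()
        # Godot 3.x signature: start(instance, method, userdata, priority)
        t.start(self, "workerThread", self.name, Thread.PRIORITY_LOW)

    func _process(delta):
        print("process ", self.name)

    func _exit_tree():
        # Stop the loop and join the thread before the node is freed.
        running = false
        t.wait_to_finish()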

    Don't know. I'll try to isolate the thread problem in the actual project...

  • xyz Posts: 203 Member
    edited October 10

    I somewhat narrowed down the problem. Apparently, instancing a scene that contains a viewport causes a complete freeze-up when multi-threading is enabled 😧

    I can't reproduce this in a fresh project. Could be that some other thing is affecting it

  • xyz Posts: 203 Member
    edited October 11

    Ok. I found the reason for freeze-up.
    I'm preloading the scene that gets instanced using preload(). But if threading is enabled it looks like my instancing code executes before scene (pre)loading is finished. If the code waits half a second before it instances the scene, it all works fine.

    Any suggestions on how to "properly" handle loading/preloading when threading is enabled?

    The stutter still persists though. Continuing to narrow down the cause...

  • xyz Posts: 203 Member
    edited October 11

    So I removed all the geometry generation and thread stuff to see to what degree mere instancing affects performance. The spikes are still there. The profiler reports that the majority of the spike time falls under Idle Time. Script Functions time is negligible. Don't know what to make of it.

    Here's profiler snapshot. Larger 125 ms spike at the cursor is caused by instancing 50 scenes (of 2 different types) in a single frame. Smaller spike (ca. 90 ms) happened when instancing 30 scenes (again of 2 types)

    Is instancing a scene that expensive?

  • cybereality Posts: 2,089 Moderator

    So the instance() call has pretty low overhead in my test. I just made a sample, with a simple cube with a texture, and I can easily call instance and make 1024 cubes appear with random scale and position with no lag. If I jump to 2048 instances, then there is a slight pause, but that is acceptable for that many objects in my opinion. So I don't think it is really instance that is causing it, but whatever model data is inside those scenes. In my case I just had a cube, but if those scenes have detailed models, then that would probably be the culprit. My guess is that the vertex/index buffers are being sent to the GPU and this is causing the stall. If you use MultiMeshInstance, this only sends the vertex buffer once, and should be very fast, but it won't help you with procedural geometry.
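
    For reference, a minimal sketch of the MultiMeshInstance route mentioned above, assuming Godot 3.x. The count, the position range and the use of a CubeMesh are illustrative assumptions, and as noted this only fits cases where every instance shares one mesh:

    ```gdscript
    extends MultiMeshInstance

    export var count = 1024  # hypothetical instance count

    func _ready():
        var mm = MultiMesh.new()
        mm.transform_format = MultiMesh.TRANSFORM_3D
        mm.mesh = CubeMesh.new()    # one mesh shared by all instances
        mm.instance_count = count   # must be set after transform_format
        for i in range(count):
            # Random position and uniform scale per instance.
            var pos = Vector3(rand_range(-50, 50), rand_range(-50, 50), rand_range(-50, 50))
            var basis = Basis().scaled(Vector3.ONE * rand_range(0.5, 2.0))
            mm.set_instance_transform(i, Transform(basis, pos))
        multimesh = mm
    ```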

  • xyz Posts: 203 Member

    Hm... could you try one additional thing on your test case @cybereality?
    Assign a shader material to that cube (plus a simple shader) and make that material local to scene. How does that perform when you make 1024 instances?

  • cybereality Posts: 2,089 Moderator
    Accepted Answer

    So, with a local to scene shader material, it is slightly slower: there is a slight pause at 1024 instances, but 512 is no problem. The first time the shader is compiled, though, there is a significant pause. After that I can delete the cubes and recreate them and it is fast (up to 512; 1024 is still slightly slower than with the default spatial material). I would recommend watching the video in this thread, as the shader compilation trigger may be your problem.
    https://godotforums.org/discussion/27677/sharing-my-shader-compile-script

  • xyz Posts: 203 Member
    edited October 12

    Yep! Shader compilation is most likely the offender.

    I need a unique shader for each instance. So "local to scene" must be enabled. This causes the same shader source to be compiled every time a new instance is created, causing cumulative lag. The more scenes I instance at once, the more shaders are compiled and the larger the lag.

    The solution in that video suggests creating all shaders at startup to avoid runtime compilation. In their case it's just a couple of shaders. But I need several hundreds of instances that gradually appear/disappear during execution. That would mean creating all instances I'd ever need at startup, and then adding them to main scene as needed.

    Too bad a new shader with exactly the same source code is compiled for each instance just because uniform configuration is different per instance.

    I pushed a lot of mesh animation onto the vertex shader to avoid having thousands of animated spatial nodes. But as it turns out, thousands of nodes might actually be better performance-wise.

    EDIT: Hm... what I said above may not be entirely true. I did some tests and it looks like the shader is recompiled only if both the material and the shader are set local to scene. If the material is local to scene but the shader is not, then the lag is significantly smaller.
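
    The same arrangement can also be sketched in code, assuming Godot 3.x: each instance gets its own ShaderMaterial for per-instance parameters, while all of them point at one shared Shader resource. The shader path and the "phase" uniform are hypothetical:

    ```gdscript
    extends MeshInstance

    # Assumption: a single Shader resource on disk, shared by every instance.
    const SHARED_SHADER = preload("res://my_shader.shader")  # hypothetical path

    func _ready():
        # One material per instance, so parameters can differ per instance.
        var mat = ShaderMaterial.new()
        # Every material references the same Shader object, which per the
        # tests above should avoid a recompile for each new instance.
        mat.shader = SHARED_SHADER
        mat.set_shader_param("phase", randf())  # "phase" is a hypothetical uniform
        material_override = mat
    ```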

  • xyz Posts: 203 Member
    edited October 12 Accepted Answer

    Alright. Finally managed to iron it all out. It runs smooth as clockwork now :)

    To sum it up and answer my own question:
    Two things to watch out for when instancing a lot of procedural geometry:

    1) Vertex data crunching
    If there is a lot of data to be generated/calculated in a single frame, delegate the crunching to worker threads so the calculation is spread over the following frames. The thread code can be a function in the script of the generated mesh.

    2) Shader compilation
    If you're using shader materials, always disable the "local to scene" flag for the shader, but keep the flag enabled for the material resource itself if you need to animate shader params separately per instance. This ensures that all materials use the same shader object (compiled only the first time), avoiding costly shader recompilation on every instance creation.
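
    A minimal sketch of point 1, assuming Godot 3.x: the vertex arrays are built on a worker thread and handed back to the main thread with call_deferred, where the ArrayMesh is created. The function names and the simple disc geometry are hypothetical stand-ins for the real mesh generation:

    ```gdscript
    extends MeshInstance

    var thread = Thread.new()

    func _ready():
        # The third argument is the userdata passed to the thread function.
        thread.start(self, "_build_arrays", 64)

    # Runs on the worker thread: number crunching only, no scene-tree access.
    func _build_arrays(segments):
        var verts = PoolVector3Array()
        for i in range(segments):
            var a = TAU * i / segments
            var b = TAU * (i + 1) / segments
            # One triangle of a flat disc per segment.
            verts.push_back(Vector3.ZERO)
            verts.push_back(Vector3(cos(b), 0, sin(b)))
            verts.push_back(Vector3(cos(a), 0, sin(a)))
        var arrays = []
        arrays.resize(ArrayMesh.ARRAY_MAX)
        arrays[ArrayMesh.ARRAY_VERTEX] = verts
        # Hand the result to the main thread.
        call_deferred("_apply_arrays", arrays)

    # Runs on the main thread once the worker has queued its result.
    func _apply_arrays(arrays):
        thread.wait_to_finish()  # join the finished thread
        var am = ArrayMesh.new()
        am.add_surface_from_arrays(Mesh.PRIMITIVE_TRIANGLES, arrays)
        mesh = am
    ```

    The surface creation itself still happens on the main thread here, so this only moves the calculation out of the frame. A real script would also need to join the thread if the node is freed before the build finishes.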

    Thanks everyone for the help!

  • Megalomaniak Posts: 4,034 Admin

    Yeah, shader caching has been a point of discussion for a long while now and AFAIK is supposed to be coming with 4.0

    Not sure how much it will help with your specific case, but it might be worth checking out.

  • cybereality Posts: 2,089 Moderator

    Awesome! We figured it out.

  • xyz Posts: 203 Member
    edited October 12

    @Megalomaniak said:
    Yeah, shader caching has been a point of discussion for a long while now and AFAIK is supposed to be coming with 4.0

    Not sure how much it will help with your specific case, but it might be worth checking out.

    Yeah, it's only tangentially related but I'll check it out. Shader caching is always a good thing.

    Thinking about it, the only real problem was my noobness :) Godot handles things here in quite a logical way: if you want a unique shader for each instance, you specify it via the "local to scene" flag and the engine makes it. However, I don't see any benefit in creating unique shaders for instances. It just causes lag. Maybe there should be some sort of warning.

    Speaking from couple of months experience hanging here on forums, people tend to get confused about what "local to scene" actually does in general. More so when resources refer to other resources like in this material/shader case. I think it'd be good to add a page or two dedicated to this into documentation.

  • cybereality Posts: 2,089 Moderator

    Well I've been using Godot for 2 years and I just found out what "local to scene" even did. So yeah.

  • Megalomaniak Posts: 4,034 Admin

    @xyz said:

    Speaking from couple of months experience hanging here on forums, people tend to get confused about what "local to scene" actually does in general. More so when resources refer to other resources like in this material/shader case. I think it'd be good to add a page or two dedicated to this into documentation.

    Yes, not sure if a clearer name for it could be found but certainly, the documentation is open for editing/commits by the community AFAIK.

  • xyz Posts: 203 Member
    edited October 13

    @Megalomaniak said:
    Yes, not sure if a clearer name for it could be found but certainly, the documentation is open for editing/commits by the community AFAIK.

    I'll see what I can do.
    The name is not bad but it is somewhat misleading. It suggests that less will happen if you enable it, when in fact a lot more happens - the resource gets duplicated at runtime.

  • Megalomaniak Posts: 4,034 Admin

    Well then a better name for it might be make local copy or something like that, perhaps. If a label/name is misleading then perhaps it's not a great one. ;)

  • xyz Posts: 203 Member
    edited October 13

    The name is ok once you know what the flag does. But if you don't know, it's kind of unintuitive. Full name of the property is resource_local_to_scene.

    The state of this flag is only relevant if you use instanced scenes and need to mess with the resource itself (e.g. animating shaders). That may never be the case in a typical game/workflow where you just use pre-made resources. It is more likely to be needed if the project is inclined towards procedural geometry and animation. My projects typically gravitate to that, so I stumbled upon this early.

    Since this property is tied to instancing, it'd be better if the word instance appeared in the name, rather than just scene. Maybe something like resource_unique_in_scene_instance or resource_local_to_scene_instance. But perhaps this is not succinct enough. The tooltip gives a decent description of what it does though.

    As for the docs, I searched around a bit, and the only explanation of this property appears in the Resource class reference. It's only a single sentence, identical to the GUI tooltip. The introductory section that explains the concept of scene instancing is just an overview that doesn't go into the details of resource handling.

    A short section named Resources and scene instancing could probably find its way into docs. Maybe some place under Getting Started / Step by Step / Resources / Nodes and resources.

  • cybereality Posts: 2,089 Moderator

    Yeah, after using Godot for 2 years I've found it does pretty much everything. So far mostly everything I want to do is supported, but most people using the engine don't know like 10% even of the features. The documentation is generally okay (if you are an advanced user), but there are lots of gotchas and you basically have to figure out a lot on your own.

  • cybereality Posts: 2,089 Moderator

    Looks like this will be fixed in Godot 3.5. Maybe try out the beta and see how it is.

  • Megalomaniak Posts: 4,034 Admin

    @xyz said:
    The name is ok once you know what the flag does. But if you don't know, it's kind of unintuitive. Full name of the property is resource_local_to_scene.

    Yes, but once you know what the flag does, any label would be "ok", even a random mishmash of numbers and letters, since just knowing that the one thing in that exact place does this thing is enough, if and once you know.

    My point was precisely that a good descriptive name can help with discoverability for those new to it. They might not stumble upon it immediately, and it's still best to also have it well documented, but a good descriptive name does enable people to actually stumble upon it and remember it before they might even need it. ;)

  • xyz Posts: 203 Member
    edited October 15

    @Megalomaniak said:
    Yes, but once you know what the flag does, any label would be "ok", even a random mishmash of numbers and letters, since just knowing that the one thing in that exact place does this thing is enough, if and once you know.
    My point was precisely that a good descriptive name can help with discoverability for those new to it. They might not stumble upon it immediately, and it's still best to also have it well documented, but a good descriptive name does enable people to actually stumble upon it and remember it before they might even need it. ;)

    Agreed. But this particular thing is not that easy to name in a way that would signify the whole concept. Whoever named it probably chose this after considering other options. I'd bet there was some serious pondering behind it, as there is with most things in Godot.

    To tie in with what @cybereality said: Godot is an extremely well designed tool. I've been tinkering with it for a couple of months now and it constantly surprises me with what it can do. It has almost everything covered. And to be honest, I was sizing it up from a distance for about a year prior, thinking the whole time it was a toy.

    As for our beloved "local to scene" :) My actual problem was not even caused by not knowing about it or what it does, but rather by confusing what it means to duplicate a material resource vs a shader resource. I was already using it with full understanding for mesh resources. The confusion might not have happened had the "local to scene" flag had better coverage in the docs. And I'm not dissing the docs in general. They're very well done. But then again I'm probably not a typical user, as it seems that most newcomers learn exclusively from tutorials and trial/error...

    To conclude, "local to scene" looks like an obscure flag that always appears at the bottom of the Inspector :) and does nothing most of the time :) When in fact it's a flag that can make a significant difference in performance and storage allocation for projects with extensive scene instancing. So yeah I wouldn't mind a better name (if that's even an option in respect to back-compatibility) and more "promotion" in the documentation.
