
Large World System #11728

Open
jonbonazza opened this issue Feb 9, 2025 · 14 comments


jonbonazza commented Feb 9, 2025

Describe the project you are working on

A 3D open world RPG

Describe the problem or limitation you are having in your project

For open world games, where a single scene contains most or all of the game's environment, you end up with a massive number of Nodes in the scene. Each node has a non-zero performance cost in terms of both memory and CPU, and with the sheer number of nodes in a large, complex scene like this, performance becomes a major issue--often to the point that the game is unplayable even on the highest-end hardware.

Describe the feature / enhancement and how it helps to overcome the problem or limitation

Goals

For this proposed solution, the following goals were defined:

  1. The solution's implementation should be simple. It shouldn't be intrusive or touch large swaths of the code base; it should be as self-contained as possible in order to promote maintainability.
  2. The solution should "just work" without much setup on the user's end.
  3. The solution should support both 3D and 2D games.
  4. The feature should work both at runtime and in-editor.

Scope

In an attempt to constrain the problem space and keep the feature succinct, this feature is not concerned with dicing meshes, terrains, or the like. The entire Node is treated as a single unit. Mesh dicing could be added later if there's interest.

Solution

Note

The term "Node" is used throughout this solution to describe nodes that are tracked by the LWServer. In this context, "Node" refers to either a Node3D or a Node2D (or their descendants -- basically anything with a position), depending on whether you are working in 3D or 2D, respectively.

A logical grid will be overlaid onto the scene at Y0 (for 2D games, this grid will be at Z0, since the 2D renderer is constrained to a 2D plane). The grid's bounds will be calculated automatically based on what is in the scene and the configured cell size. A new LWServer will be created to track which nodes are in which cells of the grid and determine which Nodes should be loaded at any given time. Whenever a Node's position is updated, it will call LWServer to update its cell in the grid. Node3Ds can be marked as "Large World Seeds" by checking a box on the node. In the background, LWServer will load in Nodes (and their associated resources) that are in cells adjacent to each of the "Seeds" and not already loaded, and add them to the scene tree. Similarly, Nodes that are at least two cells away from each seed will be unloaded and removed from the scene tree.
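To make the cell rules concrete, here is a minimal sketch (in C++ for illustration; `Cell`, `world_to_cell`, `should_load`, and `should_unload` are hypothetical names, not proposed engine API) of mapping a position to a grid cell and applying the adjacent-cell load rule and the two-cells-away unload rule for a single seed:

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// Hypothetical sketch: world position -> grid cell, plus the load/unload
// rules described above for a single seed. In the full design a node would
// be unloaded only when it is >= 2 cells away from *every* seed.
struct Cell { int x; int z; };

Cell world_to_cell(double wx, double wz, double cell_size) {
    // floor() so that negative coordinates land in the correct cell
    return { (int)std::floor(wx / cell_size), (int)std::floor(wz / cell_size) };
}

// Chebyshev distance: adjacent cells (including diagonals) are distance 1.
int cell_distance(Cell a, Cell b) {
    int dx = std::abs(a.x - b.x);
    int dz = std::abs(a.z - b.z);
    return dx > dz ? dx : dz;
}

bool should_load(Cell node, Cell seed)   { return cell_distance(node, seed) <= 1; }
bool should_unload(Cell node, Cell seed) { return cell_distance(node, seed) >= 2; }
```

With 1024-unit cells, for example, a node at (1500, -10) lands in cell (1, -1), and a seed one cell away in any direction keeps it loaded.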

In-Editor

In the editor, the editor viewport's camera position will be used as the "seed". As the user "flies" around the scene, nodes will be loaded and unloaded dynamically. This does have the unfortunate downside of nodes being added to and removed from the SceneTree as you fly around the scene. I'm not sure if there's a way to keep a reference to the node in the SceneTree UI when the actual node is gone from the scene. One idea is to replace unloaded nodes with a lightweight Placeholder node; it would still consume some memory, but far less than a full node.

Configuration

The following configuration will be available in the project preferences:

  1. Enable LWS - Allows for disabling the LWS entirely. Defaults to true.
  2. LWS Cell Size - The size of each cell in the grid, in positional units. Cells are always square, so a cell size of 1024 means a 1024x1024 cell. Default: ?

The following configuration will be available in the editor preferences:

  1. Enable LWS in Editor - Allows for disabling LWS in the editor only. Defaults to true.

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

A new LWServer Server will be created for managing nodes.

class LWServer {
    Dictionary grid;          // grid cell -> nodes currently in that cell
    Array seeds;              // nodes marked as "Large World Seeds"
    Dictionary positionStore; // last known positions of unloaded nodes

public:
    // This requires a reference to the SceneTree, but I believe it can be
    // obtained as a singleton without passing it in.
    void update_grid_shape();
    void add_seed(Node *p_node);
    void remove_seed(Node *p_node);
    void update_node_grid_position(Node *p_node);
};

On scene ready, and whenever a Node (Node3D or Node2D only) is added to or removed from the scene, update_grid_shape will be called, which will recalculate the grid bounds and implicitly recalculate all known Nodes' grid cells.

Similarly, whenever a Node's position is updated, it will call LWServer::update_node_grid_position as well.

Node3D and Node2D will have a new property called isLargeWorldSeed that will be exposed via the inspector and default to true for convenience. When this property is updated, the Node will be added to or removed from LWServer's seed list via add_seed and remove_seed, respectively.

On the Process notification, the seeds list will be iterated, and for each seed, Nodes will be added to or removed from the scene based on the seed's grid cell and the rules outlined in the Solution section above. The act of adding and removing nodes may implicitly cause Resources that the nodes depend on to be loaded or unloaded as needed.

When a node is removed from the scene, its last known position will be stored in the positionStore cache. When the node is added back, the cache will be checked for a position; if one exists, the node's position will be set from the cache, and the cached entry will then be removed.
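A minimal sketch of that cache behaviour (the `std::map`, `NodeId`, `Pos`, and function names stand in for the Dictionary-based positionStore and are assumptions for illustration only):

```cpp
#include <cassert>
#include <map>

// Illustrative sketch of positionStore: on unload, the node's last
// position is cached by node ID; on reload it is restored once and the
// cache entry is evicted, matching the behaviour described above.
using NodeId = unsigned long long;
struct Pos { double x, y, z; };

std::map<NodeId, Pos> position_store;

void on_node_unloaded(NodeId id, Pos last_pos) {
    position_store[id] = last_pos; // remember where the node was
}

// Returns true (and writes out_pos) if a cached position was restored.
bool on_node_loaded(NodeId id, Pos *out_pos) {
    auto it = position_store.find(id);
    if (it == position_store.end())
        return false;
    *out_pos = it->second;
    position_store.erase(it); // restore once, then drop the cache entry
    return true;
}
```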

Warning

While the Node's position is saved, the rest of the node's state will not be persisted (i.e. current property values, animation state, etc.). If other state needs to be persisted, this can be done via script by overriding the _enter_tree and _exit_tree methods.

Note

If performance is not satisfactory, we could also offer a multithreaded mode, where the processing happens on a background thread.

If this enhancement will not be used often, can it be worked around with a few lines of script?

No

Is there a reason why this should be core and not an add-on in the asset library?

In order for this to be as hands-off as possible, it requires changes to some of the base node types. It also requires a new Server to be implemented.

Implementation aside, this is such a fundamental feature for certain genres nowadays that having something built in is expected by many users.

jonbonazza (Author) commented Feb 9, 2025

With this design, there are still some holes around multiplayer games. For instance, in an authoritative server setup, the decision to add/remove nodes should only happen on the server, and the server should tell the clients that control the seed in question whether to add or remove those nodes from their local game. Coincidentally, this is the same thing as the notion of "interest management" in MMO games. I'd really like @Faless's feedback on this matter.


KeyboardDanni commented Feb 9, 2025

Somewhat tangential, but I think it'd be a good idea to also find some way to do "long-running initialization" of newly-spawned world cells asynchronously without causing the game to stutter. For example, setting up a complex collision mesh, or loading resources that might be needed for monsters in a certain area. And if going this route, the developer should have control over what happens if the load takes longer than expected. Jak & Daxter for the PS2 amusingly causes the player to trip if they somehow outrun the disc loading.

Probably a topic for a separate proposal, but something that might need consideration at some point.

@badsectoracula

I'll repeat what i mentioned when this was brought up in rocket chat: having the streaming system modify the world structure (nodes, in this case) is a very bad idea and will result in a cascade of bugs. It sounds like a simple system, but in practice what it does is spread the complexity almost everywhere nodes are touched (and certainly to everything that needs to enumerate node children). I worked in an engine that did that ~12 years ago and it was a massive pain from the moment it was added to the engine to the end of the project (and at least after the game was done it could be revamped into something better - this is not something that Godot can do as easily, as it has stricter backwards compatibility constraints).


Let's consider a simple case in the editor (something that was a real bug that affected the engine i mentioned): you create a node, make it streamable so that it is unloaded when the camera moves away and reloaded when it comes back, and that seems to work fine. But then later you select that node, delete it, move the camera far away and press undo. What happens then? If the solution is to somehow have the undo system "know" about streaming, then that itself would be a seed for the issues i mentioned. BTW, deletion isn't the only issue; another case would be: you move the node, move away, press undo. What now? And then you come back so that the node would be loaded in - but the node is now in the previous position, since the position was loaded from the cache.

This isn't just about the undo system; that is just an example. Another would be gameplay - i.e. stuff outside the editor. As an example, assume you have a node representing a breakable tree which is set as streamable because the tree mesh is very heavy. The player destroys the tree, moves away, comes back - and suddenly the tree is back, because the developer forgot to store the tree state. (BTW: i have seen a similar gameplay bug in a different engine for a different game i worked on: crafting components in the world, like herbs, would be unloaded and loaded as the player moved away and came back, causing the herbs to respawn. The herbs were meant to respawn after some time, so the bug went unnoticed until i watched how things load/unload in that engine's streaming system - which, BTW, had a hierarchical node-based system like Godot's and worked by loading/unloading nodes - and realized what was going on.) You may think that developers shouldn't set things up like that, but if 12 years ago i had a cent for every time i told someone not to use anything stateful with the streaming system and they did it anyway, i'd still be living off that money. People will do that, waste time on the bugs arising from it, and blame the streaming system for those bugs. It is better to make those bugs impossible in the first place.


There are other issues that can happen, but i think it is clear that this is a bad idea. Fortunately it is easy to make some changes to the idea without making things complicated (if anything it'd make things simpler, as the complexity will only affect the systems that explicitly need to worry about streaming and won't spread to practically all systems that work with nodes, as it would with this approach).

Here are the changes i suggest:

  1. Instead of loading and unloading nodes keep them in memory and instead notify them to load and unload their heavy resources (e.g. call a virtual load_streamed_data and unload_streamed_data or something like that). The nodes themselves aren't big in memory (about a KB - ok, that is big TBH, but relatively speaking they are tiny), you'd need more than a million nodes to have a real memory impact (and you'll hit other engine and editor limits before reaching that anyway - limits that themselves should be addressed and not have the streaming system be used as a workaround for them). Most nodes do not need to be streamed anyway, it is things like meshes, long audio clips, etc and calculated things like the LightmapGI data that need streaming and the actual heavy part isn't the node that references those meshes/clips/etc but the meshes/clips/data/etc themselves. So the load/unload virtual calls for a mesh will simply ref/deref the mesh. Any other state will be kept in memory together with the node, eliminating a large swath of potential bugs.
  2. Instead of treating everything as streamable, only nodes that register themselves with the large world server will be streamed. Not only this will allow nodes to opt-in during their development (which will avoid bugs from unloading things that existing code expected to always be there) but also will speed up the process of handling those nodes.
  3. Nodes should provide the large world server with their streaming range when they get registered, with some call to update their state (range and position). It should not be the server that goes and enumerates all nodes to figure this out, as there may be many thousands of them. When a node's position changes, it should notify the server. Node subclasses should use node-specific logic for these ranges - for example, meshes could use their visibility range (if set; if not, the mesh node would not be considered streamable and won't be registered with the large world server).
  4. Instead of a fixed 2D grid, a hierarchical 3D grid (with cell size halving at every hierarchy level) should be used instead to speed up both insertions and queries - essentially, when a node is inserted/updated, it should be placed into the hierarchy level where a single cell would contain it. Then, when it is time to figure out which nodes are to be loaded, you only need to check the seeds' (cameras', etc.) positions at each hierarchy level and load the nodes registered in the cell each seed/camera is in (you could also do a range check at that point, at least for the higher hierarchy levels, as below some level it'd be pointless). Being a 3D grid, it'd also help with games that have vertical worlds. Cell size is still useful for specifying the minimum cell size. BTW: the "3D grid" bit is conceptual; the actual implementation could use a single map of (x, y, z, l) (l is the hierarchy level) to store the cells, so there won't be any need to calculate actual world dimensions nor spend memory on empty cells.
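Under the assumptions of point 4 (a minimum cell size, cell size doubling per level going up the hierarchy, a sparse (x, y, z, l) map), the level selection and key computation might be sketched like this; all names and the exact containment rule are illustrative, not a proposed design:

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <tuple>
#include <vector>

// Illustrative sketch: a node with streaming range `range` is placed at
// the first hierarchy level whose cell (min_cell * 2^level) spans the
// node's full range, and cells live in a sparse map keyed by (x, y, z, l).
int level_for_range(double range, double min_cell) {
    int level = 0;
    double size = min_cell;
    while (size < range * 2.0) { // cell must span the node's full diameter
        size *= 2.0;
        ++level;
    }
    return level;
}

using CellKey = std::tuple<int, int, int, int>; // x, y, z, level

CellKey cell_key(double wx, double wy, double wz, double range, double min_cell) {
    int level = level_for_range(range, min_cell);
    double size = min_cell * std::pow(2.0, level);
    return { (int)std::floor(wx / size), (int)std::floor(wy / size),
             (int)std::floor(wz / size), level };
}

// Sparse storage: only cells that actually hold nodes exist in the map.
std::map<CellKey, std::vector<int>> grid_hierarchy;
```

A small prop (range 100 at a 256-unit minimum cell) stays at level 0, while something castle-sized (range 300) ends up two levels up in a 1024-unit cell.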

There are other things to keep in mind. For example, when collecting the nodes to load/unload, loading should always happen immediately, but unloading should only happen once the seeds have moved further than some threshold distance from the last load they caused - this avoids streaming things in and out due to small camera movements like shaking, or the player orbiting the camera around a character in 3rd-person games. But at the core, i think the above are the most important points (that i can think of right now, at least).
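The hysteresis idea in the last paragraph can be sketched as follows; `update_streaming`, the `margin` parameter, and the numeric values are illustrative assumptions, not a proposed API:

```cpp
#include <cassert>

// Illustrative load/unload hysteresis: load as soon as a node is in
// range, but unload only once the seed has moved an extra `margin` past
// the range, so camera shake or orbiting doesn't thrash the streaming.
struct StreamState {
    bool loaded = false;
};

// distance: seed-to-node distance; range: node's streaming range;
// margin: extra distance the seed must travel before unloading again.
void update_streaming(StreamState &s, double distance, double range, double margin) {
    if (!s.loaded && distance <= range) {
        s.loaded = true;  // load immediately when in range
    } else if (s.loaded && distance > range + margin) {
        s.loaded = false; // unload only well past the range
    }
    // Inside (range, range + margin] the state is left as-is (hysteresis).
}
```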

jonbonazza (Author) commented Feb 9, 2025

having the streaming system modifying the world structure (nodes in this case) is a very bad idea and will result in a cascade of bugs. It sounds like a simple system but in practice what it does is to spread the complexity almost everywhere nodes are touched (and certainly to everything that needs to enumerate on node children). I worked in an engine that did that ~12 years ago and it was a massive pain from the moment it was added to the engine to the end of the project (and at least after the game was done it could be revamped to something better - this is not something that Godot can do as easily as it has stricter backwards compatibility constraints).

I'm not sure what engine you were using before, but adding and removing nodes from the scene is exactly what Godot is designed for. The entire node system is designed around this use case, so I don't see this being a problem. I'd rather frame this discussion around Godot than around hypotheticals from some engine you used 12 years ago.

Let's consider a simple case in the editor (something that was a real bug that affected the engine i mentioned): you create a node, make it streamable so that is unloaded when the camera goes away and then goes back and that seems to be working fine. But then later you select that node, delete it, move the camera far away and you press undo. What happens then? If the solution is to somehow have the undo system "know" about streaming then that itself would be a seed for the issues i mentioned. BTW just deletion isn't the issue, another case would be things like: you move the node, move away, press undo. What now? And then you come back so that the node would be in - but the node now is in the previous position since the position was loaded from the cache.

I agree that the editor use case does make this more complicated, but I don't think it's impossible to solve. And it can be solved in isolation.

Instead of loading and unloading nodes keep them in memory and instead notify them to load and unload their heavy resources (e.g. call a virtual load_streamed_data and unload_streamed_data or something like that). The nodes themselves aren't big in memory (about a KB - ok, that is big TBH, but relatively speaking they are tiny), you'd need more than a million nodes to have a real memory impact (and you'll hit other engine and editor limits before reaching that anyway - limits that themselves should be addressed and not have the streaming system be used as a workaround for them). Most nodes do not need to be streamed anyway, it is things like meshes, long audio clips, etc and calculated things like the LightmapGI data that need streaming and the actual heavy part isn't the node that references those meshes/clips/etc but the meshes/clips/data/etc themselves. So the load/unload virtual calls for a mesh will simply ref/deref the mesh. Any other state will be kept in memory together with the node, eliminating a large swath of potential bugs.

This would require bespoke changes to every node type that Godot offers. That is frankly not feasible, or even desirable.

Instead of treating everything as streamable, only nodes that register themselves with the large world server will be streamed. Not only this will allow nodes to opt-in during their development (which will avoid bugs from unloading things that existing code expected to always be there) but also will speed up the process of handling those nodes.

What problem would this solve, and what use case would warrant this? I can't think of a single reason why any game would not want all nodes with a position to be streamable. If there's a use case, I'd consider it, though. It wouldn't be difficult to implement.

Nodes should provide the large world server with their streaming range when they get registered with some call to update their state (range and position). It should not be the server that goes and enumerates all nodes to figure out that stuff as there may be many thousands of them. When a node's position changes it should notify the server. Node subclasses should use node-specific logic for these ranges - for example meshes could use their visibility range (if set, if not the mesh node would not be considered as streamable and wont be registered with the large world server).

Nodes cannot and should not know about other nodes. This is the reason Servers exist. The design already has each node tell the server when its position changes and update its grid position on demand. The Server is only responsible for adding/removing nodes that are in adjacent cells; it doesn't enumerate every node in the scene in most cases. If performance is a concern (and I'd like to see data supporting that before increasing complexity - in general, we should design for the simplest solution possible, not for hypothetical problems), I'd imagine that doing it immediately whenever a node is moved would be even more costly. I'm not against it, but I'd like to test to see which solution is more feasible. As the "note" I put in place mentions, we could also offload this to another thread if performance turns out to be an issue.

Instead of a fixed 2D grid, a hierarchical 3D grid (with halving size at ever hierarchy level) should be used instead to speed up both insertions and queries - essentially when a node is inserted/updated, it should be placed into the hierarchy level where a single cell would contain it. Then when it is time to figure out which nodes are to be loaded, you only need to check the seeds (cameras, etc) positions at each hierarchy and load the nodes registered in the cell each seed/camera is in (you could also do a range check at that point, at least for the higher hierarchy levels as below some level it'd be pointless). Being a 3D grid it'd also help with games that have vertical worlds. Cell size is still useful for specifying the minimum cell size. BTW: the "3D grid bit" is conceptual, the actual implementation could use a single map of x,y,z,l (l is the hierarchy level) to store the cells so that there wont be any need to go and calculate actual world dimensions nor spend too much memory on empty cells.

I'm not sure what problem this solves. A hierarchical grid is wholly unnecessary. The only things that matter are which cell the seed node is in and what is in the adjacent cells. This is extremely trivial math to compute. What value would a hierarchical grid provide? That said, a 3D grid could be useful if verticality is something we think matters.

There are other things to keep in mind (e.g. collecting all nodes to load/unload where loading should always happen immediately but unloading should only happen if the seeds moved further than some threshold distance from the last loading they caused to avoid streaming stuff in and out from small camera animations like shaking or even the player rotating the camera around a character for 3rd person games)

The design already accounts for this. Only nodes in immediately adjacent cells are loaded, but nodes that are up to 2 cells away are unloaded. I can add some diagrams to illustrate this, but this is the exact approach used by "interest management" in MMORPGs. It's a solved problem.

The only proposed change that I'm wholly against is the first one. It would simply require too many changes to the code base to be feasible, and I'm not convinced the issues you bring up are actual issues.

If you can get buy-in from the rest of the team though, I'd concede, as it would certainly simplify some aspects of things.

@fire fire self-assigned this Feb 9, 2025
@badsectoracula

I'm not sure what engine you were using before, but adding and removing nodes from the scene is exactly what godot is designed for. The entire node system is designed around this use case, so I don't see this being a problem. I'd rather frame this discussion around Godot rather than hypotheticals from some engine you used 12 years ago.

I'm just sharing my experience working on a project that did pretty much the same thing described here.

I agree that the editor use case does make this more complicated, but I don't think this it's impossible to solve. And it can be solved in isolation.

The problem is that it will not be "in isolation": almost everything that has to deal with nodes can be affected, not just the editor. In fact, you already brought up one issue yourself when you mentioned node state in the proposal's warning box.

This would require bespoke changes to every node that Godot offers. This is frankly not feasible or even desired.

It will only require changes to the nodes that deal with large resources - i.e. meshes, etc. Most nodes are not like that and will not need to be touched.

What problem would this solve and what use case would warrant this? I can't think of a single reason why any game would not want all nodes with a position to be streamable. If tehre's a use case, I'd consider it though. It wouldn't be difficult to implement.

As i already wrote, better performance and less developer burden. Most node types do not need to be streamed.

Nodes cannot and should not know about other nodes. This is the reason Servers exist.

Yes, no part of what i wrote mentions anything about nodes knowing about other nodes. The nodes that have streamable data will register themselves with the server, i didn't mention any node-to-node communication.

The design already has each node tell the server when its position changes and updates its grid position on demand. The Server is only responsible for adding/removing nodes that are in adjacent cells. It doesn't enumerate every node in the scene.

The use of update_grid_shape has that implication, and the original proposal already mentions that moving a node may need to recalculate the grid cells for all known nodes.

As the "note" that I put in place mentions, we could also offload this to another thread if performance turns out to be an issue.

Threading isn't an automatic solution for performance issues with an algorithm; placing a slow algorithm on a thread only puts those issues in the background. They are still there, taking CPU time from other threads/tasks that could be using it instead, thus affecting the rest of the engine too.

I'm not sure what problem this solves. A hierarchal grrid is wholly unnecessary. The only things that matter are what cell the seed node is in and what are in adjacent cells. This is extremely trivial math to compute. What value would a hierarchal grid provide?

Not all nodes will have the same streaming distance - they should not have the same streaming distance. A rock, a barrel, a fence, a tree, a house, a statue and a castle have different ranges from the camera at which they need to be streamed in. Treating everything as if it needs the same distance essentially forces the worst-case distance onto the smaller items. Loading a rock and a barrel in from hundreds of meters before they'd actually be needed, just because the castle needs a large streaming distance, is not a good idea, as that will waste both CPU and memory resources (essentially everything within the castle's streaming distance will be loaded in memory whether it's necessary or not).

Once there is position and range there is a need to quickly find the nodes whose range is within the camera. A single fixed grid would need to store the same node in multiple cells, meaning that for nodes that span multiple cells they'd need to be added and removed from a bunch of lists every time their position changes and a camera will need to find all the cells in its range and go through all of them and all of their nodes. A hierarchical grid avoids this by using the appropriate hierarchy level based on the node's range and so the code will only have to worry about four cells at most. This may not seem like a big deal but the more parts that need to be updated, the slower the system will be since these stack up and will show once there are many thousands of nodes (i.e. when a streaming system is needed in the first place).

Finally, since this uses a map of coordinates there is no reason to worry about grid shape at all so that entire logic goes away.

Of course, the structure used to store the data isn't really that important, as it can be changed without affecting anything else; a hierarchical grid is just something i've worked with in the past exactly for the purpose mentioned here. The important part is that both position and distance are needed to do streaming properly.

The only proposed change that I'm wholly against is the first one. It would simply require too much changes to the code base to be feasible

Not really - if anything, having nodes come and go based on camera distance will require many more changes across the entire codebase, whereas what i describe will only need changes to the nodes that need streaming and nothing else. The system i describe is actually fairly simple to implement and, considering that very little will need to deal with streaming (since systems won't have to deal with nodes coming and going beneath their feet), i'd say it is overall the simpler approach.

In any case, these are my concerns and the solutions i propose to avoid them. As i wrote in rocketchat last time, at the end whoever writes the code gets the last say (or well, second to last, considering merging and all :-P), and personally i won't be writing the streaming system anyway.

fire (Member) commented Feb 9, 2025

How do we determine what node properties are "heavy"?

Here's my attempted interpretation of the @badsectoracula design

Persistent Nodes

  • Nodes remain in memory; only heavy resources (meshes, audio, lightmaps) are streamed via load_streamed_data()/unload_streamed_data() virtual methods
  • Maintains node state consistency while managing memory via ref-counting of heavy assets

Opt-In Registration System

  • Explicit node registration prevents unexpected unloading
  • Nodes provide initial position and streaming range
  • Range calculation delegated to node subclasses (e.g., mesh visibility range)

Hierarchical 3D Grid

  • Multi-level grid with cells doubling in size at each level (2^L)
  • Node placement logic: each node is inserted at the hierarchy level whose single cell can contain it
  • Storage as a sparse coordinate map:
    Map<tuple<int, int, int, int>, Cell> grid_hierarchy;
    (x, y, z coordinates + level)

Update Pipeline

  • Position updates trigger cell re-evaluation

Visibility Determination

  • Per-frame process for active viewers (cameras)

Comments

@badsectoracula

  • Agrees this is a representative summary
  • a Map<uint32_t, Cell> should be fine for the grid hierarchy
  • The XYZL parts can be stored as 8-bit uints; with a cell size of 256m, an 8-bit value gives a grid covering 64k by 64k, which should be more than enough, as that is 100+ times larger than Skyrim 😛
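The packed-key idea can be sketched as follows (the particular bit layout is an assumption; with 8 bits per axis and a 256 m minimum cell, 256 cells per axis covers the 64k-by-64k area mentioned):

```cpp
#include <cassert>
#include <cstdint>

// Illustrative sketch: four 8-bit fields (x, y, z, level) packed into one
// uint32_t map key, giving a 256x256x256-cell sparse grid per level.
// The shift layout here is an assumption for illustration.
uint32_t pack_cell_key(uint8_t x, uint8_t y, uint8_t z, uint8_t level) {
    return (uint32_t)x | ((uint32_t)y << 8) | ((uint32_t)z << 16) |
           ((uint32_t)level << 24);
}

void unpack_cell_key(uint32_t key, uint8_t &x, uint8_t &y, uint8_t &z,
                     uint8_t &level) {
    x = key & 0xFF;
    y = (key >> 8) & 0xFF;
    z = (key >> 16) & 0xFF;
    level = (key >> 24) & 0xFF;
}
```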

fire (Member) commented Feb 9, 2025

Here's my solution to the heavy-property problem. It's inspired by Godot's MissingNode: we replace heavy data (textures, meshes, binary arrays) with lightweight placeholders that mimic the original type to prevent errors.

For resources, placeholders store file paths and class names; non-resource data (like large byte arrays) uses a hidden cache. Changes to placeholders are tracked and reapplied on restoration, integrating via an inspector-exposed toggle on Node.


KeyboardDanni commented Feb 9, 2025

Let's consider a simple case in the editor (something that was a real bug that affected the engine i mentioned): you create a node, make it streamable so that is unloaded when the camera goes away and then goes back and that seems to be working fine. But then later you select that node, delete it, move the camera far away and you press undo. What happens then? If the solution is to somehow have the undo system "know" about streaming then that itself would be a seed for the issues i mentioned. BTW just deletion isn't the issue, another case would be things like: you move the node, move away, press undo. What now? And then you come back so that the node would be in - but the node now is in the previous position since the position was loaded from the cache.

Are we talking about deleting the parent node that loads the cell, or something inside that cell? If the former, that's a pretty trivial case. If the latter, I think what Unity does is it tracks which prefabs are currently "locked". In this case it could keep track of what streamable nodes are loaded, and if an undo/redo operation handles them, it could just load that back in. Or just have undo/redo history isolated to that particular streamable node, and/or give a warning when an undo/redo operation involves something that might be unloaded.

This isn't just the undo system that'd be affected, that is just an example. Another would be gameplay - i.e. stuff outside the editor. As an example, assume you have a node representing a breakable tree which is set as streamable because the tree mesh is very heavy. The player goes and destroys the tree, moves away, comes back and suddenly the tree is back because the developer forgot to store the tree state (BTW: i have seen a similar gameplay bug in a different engine for a different game i worked on: crafting components in the world like herbs would be unloaded and loaded when the player moved away and came back, causing the herbs to respawn - the herbs were meant to respawn after some time, so the bug went unnoticed until i watched how things load/unload based on the streaming system in that engine - which BTW had a hierarchical node-based system like in Godot and worked by loading/unloading nodes - and realized what was going on). You may think that developers shouldn't setup things like that but if 12 years ago i had a cent for every time i told someone to not use anything stateful with the streaming system but they did it anyway, i'd still be living off that money. People will do that, waste time on bugs rising from that and blame the streaming system for these bugs. It is better to make those bugs impossible in the first place.

You're probably gonna have to save and load some amount of state. This seems like something that'd be pretty hard to avoid. Instead I think there should be focus here on having a useful facility for the user to handle loading/saving of this state. The proposal already mentions _enter_tree() and _exit_tree() which ought to be enough for most scenarios, though it might help if the nodes knew a bit more about the large world system so they could more intelligently handle spawning/despawning behavior within the context of the LWS.

  1. Instead of loading and unloading nodes, keep them in memory and instead notify them to load and unload their heavy resources (e.g. call virtual load_streamed_data and unload_streamed_data methods or something like that). The nodes themselves aren't big in memory (about a KB; that is big in absolute terms, but relatively speaking they are tiny). You'd need more than a million nodes to have a real memory impact (and you'll hit other engine and editor limits before reaching that anyway, limits that should themselves be addressed rather than using the streaming system as a workaround for them). Most nodes do not need to be streamed anyway; it is things like meshes, long audio clips, etc. and computed data like the LightmapGI data that need streaming, and the actual heavy part isn't the node that references those meshes/clips/etc. but the meshes/clips/data themselves. So the load/unload virtual calls for a mesh will simply ref/deref the mesh. Any other state will be kept in memory together with the node, eliminating a large swath of potential bugs.
  2. Instead of treating everything as streamable, only nodes that register themselves with the large world server will be streamed. Not only will this allow nodes to opt in during their development (avoiding bugs from unloading things that existing code expected to always be there), it will also speed up the processing of those nodes.
  3. Nodes should provide the large world server with their streaming range when they register, along with a call to update their state (range and position). The server should not enumerate all nodes to figure that out, as there may be many thousands of them. When a node's position changes, it should notify the server. Node subclasses should use node-specific logic for these ranges; for example, meshes could use their visibility range (if set; if not, the mesh node would not be considered streamable and won't be registered with the large world server).

The problem with this is that you also need to consider the game logic overhead of keeping things spawned in. There is a very good reason why 8-bit and 16-bit games only spawned in enemies as you got close to them, and aggressively despawned them when they went outside the view. The systems of the time were just barely powerful enough CPU-wise to handle the logic for nearby moving objects (and in a lot of cases it wasn't enough to keep everything running at 60 FPS).

Even today this is an issue. Every house, every structure, every enemy, every tree, every apple you see has to live inside the physics system, where it takes up space in the BVH tree. Each physics area and body means physics lookups are that much slower because the tree has that much more depth in order to contain everything. Imagine if Breath of the Wild kept everything spawned in at once! First of all, I'm not even sure it'd all fit in memory, and even if it did, the game would run like a slideshow from all the required bookkeeping. That's half the reason why all these small detailed objects are streamed in and out as you move through the world.

The crux of the issue seems to be "despawned objects forget their state" which is an easy enough problem to solve, and the proposed solution is "keep all nodes available" which isn't going to scale well, and is one of the reasons that large world systems exist in the first place.

  1. Instead of a fixed 2D grid, a hierarchical 3D grid (with cell size halving at every hierarchy level) should be used, to speed up both insertions and queries. Essentially, when a node is inserted/updated, it is placed into the hierarchy level where a single cell would contain it. Then, when it is time to figure out which nodes are to be loaded, you only need to check the seed (camera, etc.) positions at each level and load the nodes registered in the cell each seed/camera is in (you could also do a range check at that point, at least for the higher hierarchy levels, as below some level it'd be pointless). Being a 3D grid, it would also help games with vertical worlds. Cell size is still useful for specifying the minimum cell size. BTW: the "3D grid" bit is conceptual; the actual implementation could use a single map of (x, y, z, l) (where l is the hierarchy level) to store the cells, so that there won't be any need to calculate actual world dimensions nor spend too much memory on empty cells.

This is a good point though. But I'd like to extend this further. I don't think the LWS should make any assumptions about the layout of the cells, just that there are cells. Ideally the user should be able to choose whether these cells are arranged in a 2D or a 3D grid. After all, a lot of games don't need a 3D grid (and indeed a 3D grid could cause unnecessary issues), yet other games need more than a 2D grid if they take advantage of vertical space. And then there's dungeon crawlers where you want to split things up by rooms rather than have things be on a grid.

@KeyboardDanni

Here's my solution to the heavy-property problem. Inspired by Godot's MissingNode, we replace heavy data (textures, meshes, binary arrays) with lightweight placeholders that mimic the original type to prevent errors.

For resources, placeholders store file paths and class names; non-resource data (like large byte arrays) uses a hidden cache. Changes to placeholders are tracked and reapplied on restoration, integrated via an inspector-exposed toggle on Node.

I think it could be useful to extend this concept to spawning/despawning entire nodes. When a node goes out of range, we could despawn it and replace it with an inactive "dummy", maybe like a DummyNode or InactiveNode or DespawnedNode or something. Its whole purpose would be to store custom data when an object despawns. So for a tree, you could just have it store whether the tree was chopped or not. You'd be able to query this information while the node is despawned, and then the tree would load the information when spawned back in. Until then, the tree wouldn't have to load any resources like meshes, textures, etc. or take up space in the BVH tree, among other things.

@fire

fire commented Feb 9, 2025

I wanted an optimization on the sparse coordinate map Map<tuple<int, int, int, int>, Cell> grid_hierarchy (x, y, z coordinates plus level) via brickworks. Brickworks is a game-world partitioning method that arranges cells in a staggered grid pattern (like bricks in a wall) and divides each cell into a 3x3 grid of virtual subcells. It optimizes network efficiency by syncing only the relevant subcells during player movement and simplifies neighbor management by reducing adjacency complexity.

Reference: https://pure.bond.edu.au/ws/portalfiles/portal/18275482/Efficient_Methods_for_Improving_Scalability_and_Playability_of_Massively_Multiplayer_Online_Game_MMOG_.pdf

@KeyboardDanni

Having said the above though, I do still have concerns about having so many Nodes in the scene tree in general causing issues with performance or memory usage. Every node in the tree means that much more bookkeeping, and the larger that information gets, the sooner you start having cache misses, and then things can really start to slow down when performing common operations on the scene tree and contents.

@badsectoracula

badsectoracula commented Feb 9, 2025

Are we talking about deleting the parent node that loads the cell, or something inside that cell?

Deleting a node (or an ancestor) that would be unloaded by the streaming system.

I think what Unity does is it tracks which prefabs are currently "locked". In this case it could keep track of what streamable nodes are loaded

Right, and that is the issue: every system that deals with nodes will have to deal with nodes being loaded/unloaded beneath their feet by locking/unlocking them, streaming their child nodes in whenever they need to be enumerated, and other boilerplate housekeeping tasks outside the core task each subsystem is meant to handle. Again, as I wrote, "This isn't just the undo system that'd be affected, that is just an example. Another would be gameplay - i.e. stuff outside the editor".

You're probably gonna have to save and load some amount of state. This seems like something that'd be pretty hard to avoid.

This is only necessary if the streaming system adds/removes nodes instead of asking the nodes to load/unload the heavy data (what is heavy is what uses a lot of memory - i.e. the actual meshes, textures, audio, lightmap data, etc, not the nodes themselves - and that can be determined through memory profiling).

which ought to be enough for most scenarios

What i suggest is to eliminate the need for most of those scenarios in the first place.

The problem with this is that you also need to consider the game logic overhead of keeping things spawned in.

The streaming system should not be used to deal with gameplay logic flaws. It could be abused for that if someone really feels the need for it, but the same can be done using the notifications/virtual calls i mentioned above. In fact it can be done in an even better way since the game will be in control of how the nodes with logic will be handled (e.g. they could be frozen by not running any logic if they are in an "unloaded" state) instead of the brute-force approach of removing them completely - which would lose all state data too and now you have to deal with such data storage (and persistence - for example for savegames).

There is a very good reason why 8-bit and 16-bit games only spawned in enemies as you got close to them, and aggressively despawned them when they went outside the view.

Yes, and the reason is that those systems barely had any memory or CPU cycles to spare. But we're not in the 8-bit and 16-bit era; we're in the 64-bit era, and the restrictions and solutions of 30 years ago are not the same as those we have to deal with today. If anything, what I describe would also have applied in the 32-bit era, at least on systems with adequate memory.

Even today this is an issue. Every house, every structure, every enemy, every tree, every apple you see has to live inside the physics system, where it takes up space in the BVH tree. Each physics area and body means physics lookups are that much slower because the tree has that much more depth in order to contain everything.

There are multiple ways to address this that do not involve having the rest of the codebase deal with nodes being added and removed from the world based on the camera position, starting with optimizing the structure itself so that queries on one side of the world are not affected by modifications on the other side. Nodes whose visual data gets unloaded could also remove their physics data from the physics system; after all, a game that is fine with nodes getting removed based on camera position would have to deal with physics getting removed too. Physics would have to deal with that either way.

Imagine if Breath of the Wild kept everything spawned in at once! First of all, I'm not even sure it'd all fit in memory, and even if it did, the game would run like a slideshow from all the required bookkeeping.

I do not have access to the source code of BotW's system to know how it works, so i can't tell what exactly it keeps in memory and what goes away.

That's half the reason why all these small detailed objects are streamed in and out as you move through the world.

Streaming small detailed objects is not the same as streaming the data needed for those objects. You could unload mesh and textures and keep everything else.

The crux of the issue seems to be "despawned objects forget their state"

No, the crux of the issue is that adding and removing nodes will create all sorts of problems all over the codebase, wherever nodes are handled. That will create a cascade of bugs that may seem simple to fix in isolation but will collectively affect everything, including the game logic people write when making games in Godot (and even reusable plugins that need to deal with nodes, so the complexity will spread even to games that do not need streaming in the first place).

proposed solution is "keep all nodes available" which isn't going to scale well

There is always going to be a scaling point beyond which the engine cannot cope; a streaming system is not a panacea that gets rid of all scaling issues, it just expands the limits - but there are still limits. The goal is to ensure those limits encompass what Godot users will use the engine for.

After all, a lot of games don't need a 3D grid (and indeed a 3D grid could cause unnecessary issues), yet other games need more than a 2D grid if they take advantage of vertical space. And then there's dungeon crawlers where you want to split things up by rooms rather than have things be on a grid.

The map I mentioned (which, BTW, would be a uint32-to-Cell map, not a tuple-keyed one; using 8 bits each for the cell coordinates and the hierarchy level with a cell size of 256 would still allow a world more than 100 times larger than something like Skyrim, which should be fine :-P) should be able to handle both 2D and 3D worlds just fine without any changes (just use 0 for z). Though IMO the data structure isn't really that important at this point, as it can be changed at any time without affecting the rest of the codebase; you could even register all nodes in a single array and iterate over all of them each update as an MVP implementation, to be replaced later. The important part is how the system will work and affect the rest of the codebase.

@fire

fire commented Feb 9, 2025

I would probably use Map<int64_t, Cell>, as it'll be compatible with GDScript, and this will be utter overkill compared to Skyrim sizes.

@KeyboardDanni

I think what Unity does is it tracks which prefabs are currently "locked". In this case it could keep track of what streamable nodes are loaded

Right, and that is the issue: every system that deals with nodes will have to deal with nodes being loaded/unloaded beneath their feet by locking/unlocking them, streaming their child nodes in whenever they need to be enumerated, and other boilerplate housekeeping tasks outside the core task each subsystem is meant to handle. Again, as I wrote, "This isn't just the undo system that'd be affected, that is just an example. Another would be gameplay - i.e. stuff outside the editor".

This is to be expected from a Large World System. It's more productive to try and find scenarios that might need to be handled, than to write this off entirely. The whole point of a proposal is to explore where corner cases might be and try to find solutions to them.

For example, one solution to the issue of "partial despawning", i.e. one related object disappearing when another depends on it, would be to come up with a group system so that everything is despawned together.

You're probably gonna have to save and load some amount of state. This seems like something that'd be pretty hard to avoid.

This is only necessary if the streaming system adds/removes nodes instead of asking the nodes to load/unload the heavy data (what is heavy is what uses a lot of memory - i.e. the actual meshes, textures, audio, lightmap data, etc, not the nodes themselves - and that can be determined through memory profiling).

You neglected one thing in your example of chopping down a tree, one that extends beyond the scope of a Large World System: the user can exit and restart the game to make the tree respawn. So you have to store the tree's state anyway.

You also neglect the fact that spawned objects have more than just memory overhead. There is a significant CPU cost to maintaining and even reading large data structures. The larger your arrays are, the longer it takes to search them to find what you're looking for. The larger your trees are, the deeper they are, and the more comparisons you have to make to search them. This is exacerbated by CPU cache: if a region of memory is larger than 32 kilobytes (which is easy once the structure gets large enough), it no longer fits in L1 cache, and you instantly have a performance penalty. The same goes for running out of space in L2 and L3 cache.

which ought to be enough for most scenarios

What i suggest is to eliminate the need for most of those scenarios in the first place.

What you suggest is based off the assumption that you can keep nodes spawned in. This defeats the purpose of a Large World System.

The problem with this is that you also need to consider the game logic overhead of keeping things spawned in.

The streaming system should not be used to deal with gameplay logic flaws. It could be abused for that if someone really feels the need for it, but the same can be done using the notifications/virtual calls i mentioned above.

Zelda 1 for the NES, Link's Awakening for the Game Boy, and A Link To The Past for the SNES all had rooms with puzzles that could be reset by leaving the room. This is not a mistake but an intentional use of spawning/despawning objects to avoid a softlock.

In fact it can be done in an even better way since the game will be in control of how the nodes with logic will be handled (e.g. they could be frozen by not running any logic if they are in an "unloaded" state) instead of the brute-force approach of removing them completely - which would lose all state data too

Just because a Large World System spawns or despawns nodes, doesn't mean it wouldn't give the developer the opportunity to decide what happens when a node spawns or despawns. This includes persisting data.

and now you have to deal with such data storage (and persistence - for example for savegames).

Again, you would have to deal with that anyway. This is tangential to the Large World System, which isn't introducing any issues you wouldn't already have to contend with here.

There is a very good reason why 8-bit and 16-bit games only spawned in enemies as you got close to them, and aggressively despawned them when they went outside the view.

Yes and the reason is that those systems barely had any memory and CPU cycles to deal with. But we're not in 8bit and 16bit era, we're in 64bit era, the restrictions and solutions one had to deal with 30 years ago are not the same as those we have to deal with today. If anything what i describe would also apply in 32bit era too, at least on systems with adequate memory.

The callback to the 8-bit and 16-bit era was an example for how keeping things spawned in has a CPU cost. That hasn't really changed since then. Yes, our systems are faster now, but there are still practical limits to contend with.

Super Mario World was designed for a system with a 3 MHz CPU. It had a hard limit of 12 fully interactable objects; any more than this, and objects stop spawning in. And by fully interactable, I mean that enemies can bounce off each other. This requires them to check collision with one another, an operation with high time complexity (O(n^2) worst case), meaning up to 12 * 12 = 144 collision checks. There are many locations in the game where it lags, so it's safe to say we can't even handle that many objects at 3 MHz.

But let's just assume that 3 MHz is enough to handle 12 full objects. We scale that up to today's CPUs, which run at 3 GHz, so we multiply that figure of 12 by 1,000 to get 12,000 objects. We also take into account improvements to instructions per cycle, multiply by 8, and get 96,000. That looks like a lot. Except that's way more than fits in L1 cache, so some of it has to be put into L2 cache. So let's divide that by two because of the L2 cache access-time penalty to get 48,000. Great! We have a ballpark estimate of 48,000 objects!

...Except that won't scale well with algorithms that have worse than linear time complexity. Let's multiply 48,000 * 48,000. That's 2,304,000,000 separate collision checks. That's way more than any computer now or in the near future can handle, even if you don't consider the negative performance impact of this not fitting in L1 cache (and the algorithm having to access objects across 4-kilobyte cache line boundaries, which is the real performance killer when it comes to cache locality).

Even if you consider that a Bounding Volume Hierarchy will reduce this so that in the average case, one object will only check collision with log(N) objects (16 in this case), you still have all your objects checking collisions as they move, meaning N * log(N) time complexity. 16 * 48,000 = 768,000. That's still way too many collision checks. Compare this with having 1/1000th as many objects, which even a CPU 1/1000th the speed is able to handle (most of the time). Basically, algorithmic complexity often does not scale well with improvements to CPU performance. And our CPUs haven't even been getting that much faster lately.

And yes, if you need a Large World system, chances are you have at least 50k objects in your game that are part of the physics world in some way.

Even today this is an issue. Every house, every structure, every enemy, every tree, every apple you see has to live inside the physics system, where it takes up space in the BVH tree. Each physics area and body means physics lookups are that much slower because the tree has that much more depth in order to contain everything.

There are multiple ways to address this that do not involve having the rest of the codebase deal with nodes being added and removed from the world based on the camera position, starting with optimizing the structure itself so that queries on one side of the world are not affected by the other side.

This is literally what a BVH tree does.

Nodes that have visual data to be unloaded could also remove the physics data from the physics system - after all if a game is fine with nodes getting removed based on a camera position would also have to deal with physics getting removed too. Physics would have to deal with that either way.

The solution you describe would be far more complex than just "make sure the node knows about getting spawned/despawned". Now we have to consider specific subcomponents of nodes that would have to deal with parts of them being removed. It's much simpler to just remove the entire node. Both accomplish the same thing, but removing the node is easier.

Imagine if Breath of the Wild kept everything spawned in at once! First of all, I'm not even sure it'd all fit in memory, and even if it did, the game would run like a slideshow from all the required bookkeeping.

I do not have access to the source code of BotW's system to know how it works, so i can't tell what exactly it keeps in memory and what goes away.

No, but even without code access, this is a common area of study by the speedrun community. I guarantee that BotW is dynamically spawning and despawning entities in some way. Otherwise the overhead from having all that stuff in the tree would be far too heavy for a world of that size.

That's half the reason why all these small detailed objects are streamed in and out as you move through the world.

Streaming small detailed objects is not the same as streaming the data needed for those objects. You could unload mesh and textures and keep everything else.

The mesh and textures aren't the bottleneck here.

The crux of the issue seems to be "despawned objects forget their state"

No, the crux of the issue is that adding and removing nodes will create all sorts of problems all over the codebase, wherever nodes are handled. That will create a cascade of bugs that may seem simple to fix in isolation but will collectively affect everything, including the game logic people write when making games in Godot (and even reusable plugins that need to deal with nodes, so the complexity will spread even to games that do not need streaming in the first place).

This proposal is about the creation and deletion of nodes in a Large World. If you remove that, you remove the very crux of this proposal. The alternative you're suggesting is closer to a streaming LOD system for assets. That's fine, and it'd be useful to have in the engine, but that's not what this proposal is about.

From the Godot Contributors Chat, @jonbonazza adds:

also, having the node in the tree means its _notification method will be called; even if it doesn't do anything, that's still overhead. Godot is known to have issues with a large number of nodes in general. The Resources consuming memory is only one part of the larger problem; the other part is the performance cost of nodes being in the tree.

And he's right. This has been a pain point in Godot historically. It's a pain point in any engine really. This is why we're trying to find clever solutions. Yes, there are edge cases. Yes, it's more work. But in the end, 99% of the time you don't need to have anything a mile away part of the gameplay simulation at all, so it shouldn't be spawned in in the first place. And by eliminating that overhead, you end up with something that runs much better on less powerful machines (or, more importantly, a game that runs at all on state-of-the-art hardware).
