vsg::Intersector wasted time #1588

AnyOldName3 · 2025-10-08T16:48:25Z

I've been working on making vsg::LineSegmentIntersector faster for a client to allow faster terrain height queries for 3D Tiles data, and have been profiling a test app that does a large number of queries to see where time is spent. The VSG part is available at https://github.com/AnyOldName3/VulkanSceneGraph/tree/fast-intersection-queries, and the test app is available at https://github.com/AnyOldName3/vsgExamples/tree/terrain-height-example. Parts of the implementation are still temporary, so there's no pull request yet.

I've already made things much faster by creating an optional acceleration structure for vsg::VertexDraw, so instead of testing for intersection with every triangle, only the ones with a high chance of being near the line are tested. I've also reduced the amount of time spent allocating intersector members by making it possible to reset an intersector and use it to query a new line and/or scenegraph.

However, there's still a significant amount of time spent on things that I can't optimise without breaking changes of one kind or another, so before jumping in, I thought I'd better mention them here.

Nested allocations

Even if we reuse an intersector, so don't have to redo the direct allocations for intersections at a similar scenegraph depth, we've got to destroy and recreate the storage that some intersector members own.

- arrayStateStack holds multiple ArrayState instances, which contain:
  - localToWorldStack and worldToLocalStack - as far as I can tell, these are only there so that the intersectors can intersect billboarded geometry, but we could get the same result by making these stacks part of vsg::Intersector and passing in the top nodes when asking the array state for the vertex buffer. Is there any reason not to make this change? Also, these stacks aren't set by vsg::ComputeBounds, which probably means that things with custom array state give incorrect bounds.
  - vertices and proxy vertices. These aren't a big problem as they share ownership of storage, and cloning array state just copies pointers.
  - arrays, a vsg::DataList. Maybe this can be allocated in place if it's changed to just be a single ref_ptr<Data> for the basic ArrayState class, two ref_ptr<Data> fields for TranslationArrayState etc. so we only need to track the (fixed number of) attributes we care about.
- intersections can hold multiple Intersection instances, which contain:
  - nodePath, which is important when an app wants to know the node path, so can't be avoided unless there's some way to opt out.
  - arrays, still a DataList. I don't see the point, as applications that need it can compute it from the node path, and applications that don't need it shouldn't have to pay for it.
  - indexRatios, which could be a std::array<IndexRatio, 3> if we only cared about triangles. Vulkan only directly supports triangles (and other topologies made of triangles), except when tessellation is used, in which case you can define patches with arbitrarily many control points (before turning them into triangles in the shader). I guess we could optimise the common case while still letting complicated tessellation-based things do what they want if the field became a std::variant<std::array<IndexRatio, 3>, std::vector<IndexRatio>>, so triangles work in-place but higher-order patches can use the heap.
- For applications that only care about the first intersection, all of these could be made into non-nested allocations by just overwriting a single Intersection instance.
- arrayStateStack and intersections both hold ref_ptrs instead of directly holding the objects, so there's a mandatory allocator invocation each time anything's added to these containers, even if growing the containers avoids allocation. I'm not seeing an obvious reason for Intersection to inherit from vsg::Object, and if it didn't, there'd be no need for the layer of indirection. ArrayState, however, is used elsewhere, and we don't know the specific runtime type anyway, so it has to be a heap-allocated pointer (until such a time as C++ gains the ability to use something like C's variable-length arrays to stack-allocate an object of runtime-knowable type).
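
To make the std::variant idea for indexRatios concrete, here is a minimal C++17 sketch. The IndexRatio struct is a local stand-in whose layout is assumed, and makeIndexRatios, ratioCount and ratioAt are hypothetical helpers rather than anything in VSG.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <variant>
#include <vector>

// Local stand-in for vsg::IndexRatio; the field layout is an assumption.
struct IndexRatio
{
    std::uint32_t index;
    double ratio;
};

using TriangleRatios = std::array<IndexRatio, 3>; // stored in place
using PatchRatios = std::vector<IndexRatio>;      // heap fallback
using IndexRatios = std::variant<TriangleRatios, PatchRatios>;

// Hypothetical helper: use the in-place alternative for exactly three entries,
// and the heap-backed vector for anything else (e.g. tessellation patches).
inline IndexRatios makeIndexRatios(const IndexRatio* first, std::size_t count)
{
    assert(first != nullptr || count == 0);
    if (count == 3) return TriangleRatios{first[0], first[1], first[2]};
    return PatchRatios(first, first + count);
}

// Uniform read access regardless of which alternative is active.
inline std::size_t ratioCount(const IndexRatios& ratios)
{
    return std::visit([](const auto& r) -> std::size_t { return r.size(); }, ratios);
}

inline const IndexRatio& ratioAt(const IndexRatios& ratios, std::size_t i)
{
    return std::visit([i](const auto& r) -> const IndexRatio& { return r[i]; }, ratios);
}
```

One trade-off of this layout is that the variant is at least as large as its biggest alternative plus a discriminator, so each Intersection grows slightly in exchange for avoiding the heap in the triangle case.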

Secondary intersections

A lot of applications will only care about the closest intersection. It wastes time to compute intersections we don't care about and sometimes we can know we don't care because a node's bounds showed the whole thing was beyond the best-known intersection so far. Also, as mentioned above, it means we need to redo any allocations an intersection instance owns.
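
The bounds-based early-out could look roughly like the following sketch. It assumes nodes are bounded by spheres and that hits are ranked by their ratio along the segment. Vec3, Sphere, entryRatio and ClosestHitTracker are all illustrative names, not VSG types.

```cpp
#include <cassert>
#include <cmath>
#include <optional>

struct Vec3 { double x, y, z; };
inline Vec3 operator-(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Sphere { Vec3 center; double radius; };

// Ratio in [0, 1] along start->end at which the segment first touches the
// sphere, or nullopt if the segment misses it entirely.
inline std::optional<double> entryRatio(const Vec3& start, const Vec3& end, const Sphere& s)
{
    const Vec3 d = end - start;
    const Vec3 m = start - s.center;
    const double a = dot(d, d);
    const double b = dot(m, d);
    const double c = dot(m, m) - s.radius * s.radius;
    if (c <= 0.0) return 0.0;          // segment starts inside the sphere
    if (a == 0.0) return std::nullopt; // degenerate segment outside the sphere
    const double disc = b * b - a * c;
    if (disc < 0.0) return std::nullopt;
    const double t = (-b - std::sqrt(disc)) / a;
    if (t < 0.0 || t > 1.0) return std::nullopt;
    return t;
}

// Tracks the closest hit so far and rejects subgraphs that cannot beat it.
struct ClosestHitTracker
{
    double bestRatio = 2.0; // ratios lie in [0, 1], so 2.0 means "no hit yet"

    bool worthVisiting(const Vec3& start, const Vec3& end, const Sphere& bounds) const
    {
        const std::optional<double> entry = entryRatio(start, end, bounds);
        return entry.has_value() && *entry < bestRatio;
    }

    void record(double ratio)
    {
        assert(ratio >= 0.0 && ratio <= 1.0);
        if (ratio < bestRatio) bestRatio = ratio;
    }
};
```

Because only one best ratio is kept, a closest-hit query of this shape could also overwrite a single intersection record instead of appending a new one per hit.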

The particular use case I've been tasked with optimising isn't troubled by multiple intersections due to the shape of the geometry and direction of the line segments, but a considerable amount of time is spent allocating storage for intersection members, even after switching to reusing intersectors so the storage for the intersection instances themselves only needs allocating once.

Other stuff

At the moment, the most expensive part of intersector traversal (other than the intersection maths and cache misses) is all of the mandatory allocations, so there's nothing else worth optimising yet. It just wouldn't make a meaningful difference until the bigger problems are made smaller.

AnyOldName3 · 2025-11-05T17:58:24Z

> arrays, a vsg::DataList. Maybe this can be allocated in place if it's changed to just be a single ref_ptr<Data> for the basic ArrayState class, two ref_ptr<Data> fields for TranslationArrayState etc. so we only need to track the (fixed number of) attributes we care about.

This proposal isn't viable as something deeper in the scenegraph might care about arrays set higher up, so we need to proactively track everything in case something else cares.

The best alternative I've come up with so far is to use a pool for this vector so when we pop an array state from the stack and push a new one so we can traverse a node's sibling, we can reuse the existing allocation. In theory, it's a pretty elegant and obvious way to make things faster, but actually wiring it up is proving un-beautiful so far.
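
A minimal sketch of what such a pool could look like, assuming the pooled storage is a std::vector. VectorPool and its member names are hypothetical, and this is not the wiring used in the branch.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical pool that recycles vector storage between sibling traversals.
template<typename T>
class VectorPool
{
public:
    // Hand out an empty vector, reusing a retired buffer when one is available.
    std::vector<T> acquire()
    {
        if (_free.empty()) return {};
        std::vector<T> v = std::move(_free.back());
        _free.pop_back();
        assert(v.empty()); // release() cleared it
        return v;
    }

    // Retire a vector: clear() destroys the elements but keeps the capacity.
    void release(std::vector<T>&& v)
    {
        v.clear();
        _free.push_back(std::move(v));
    }

    std::size_t available() const { return _free.size(); }

private:
    std::vector<std::vector<T>> _free;
};
```

Popping an array state would call release() on its vector and pushing the next one would call acquire(), so after the first few nodes the traversal stops allocating for this member.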

AnyOldName3 · 2025-11-06T18:53:06Z

I've added an extra bullet point about arrayStateStack and intersections being vectors of heap-allocated pointers. I don't see a way to make them both allocation-free. The closest I can think of would be:

- stack-allocating a C variable-length array with stategroup.prototypeArrayState->sizeofObject()/arrayStateStack.back()->sizeofObject(). This would only work on non-standard-compliant compilers that expose C VLAs in C++ mode as a language extension.
- using placement new to construct the cloned array state into that VLA.
- futzing with its reference count so it doesn't self-destruct.
- adding the pointer to the stack-allocated instance to arrayStateStack.

This would be:

- ugly
- not C++ (unless VLAs became part of a future standard, but that's vanishingly unlikely given that they've been part of C for over a quarter century and the C++ committee still thinks they were a mistake and shouldn't be admitted to C++).

I suppose the only practical way to make this particular aspect faster would be to optimise the VSG allocator itself, e.g. by adding a thread-local FILO free list that memory is released to first before going back into the mutex-protected free memory so code that repeatedly allocated and deallocated same-sized objects wouldn't need to lock a mutex to do so or scan over a bunch of differently-sized memory blocks to find a right-sized one.
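
As a rough illustration of the thread-local free list idea, here is a sketch under stated assumptions. SharedAllocator is a stand-in for a mutex-protected allocator (it is not vsg::Allocator), and ThreadCache and threadCache are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <vector>

// Stand-in for a shared, mutex-protected allocator.
class SharedAllocator
{
public:
    void* allocate(std::size_t size)
    {
        std::lock_guard<std::mutex> lock(_mutex);
        ++_lockedCalls;
        return std::malloc(size);
    }
    void deallocate(void* ptr)
    {
        std::lock_guard<std::mutex> lock(_mutex);
        ++_lockedCalls;
        std::free(ptr);
    }
    std::size_t lockedCalls() const { return _lockedCalls; }

private:
    std::mutex _mutex;
    std::size_t _lockedCalls = 0;
};

// Per-thread last-in-first-out cache of freed blocks placed in front of the shared allocator.
class ThreadCache
{
public:
    explicit ThreadCache(SharedAllocator& shared, std::size_t maxCached = 64) :
        _shared(shared), _maxCached(maxCached) {}
    ThreadCache(const ThreadCache&) = delete;
    ~ThreadCache()
    {
        for (const Block& b : _blocks) _shared.deallocate(b.ptr);
    }

    void* allocate(std::size_t size)
    {
        // Search from the most recently freed block; no lock is taken on a hit.
        for (std::size_t i = _blocks.size(); i-- > 0;)
        {
            if (_blocks[i].size == size)
            {
                void* ptr = _blocks[i].ptr;
                _blocks.erase(_blocks.begin() + static_cast<std::ptrdiff_t>(i));
                return ptr;
            }
        }
        return _shared.allocate(size);
    }

    void deallocate(void* ptr, std::size_t size)
    {
        assert(ptr != nullptr);
        if (_blocks.size() < _maxCached) _blocks.push_back({ptr, size});
        else _shared.deallocate(ptr);
    }

private:
    struct Block { void* ptr; std::size_t size; };
    SharedAllocator& _shared;
    std::size_t _maxCached;
    std::vector<Block> _blocks;
};

// One cache per thread; `shared` must outlive every thread that calls this.
inline ThreadCache& threadCache(SharedAllocator& shared)
{
    thread_local ThreadCache cache(shared);
    return cache;
}
```

The sketch ignores alignment and blocks freed on a different thread from the one that allocated them, both of which a real allocator would have to handle.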

AnyOldName3 Nov 8, 2025

An extra note while I remember - intersections are a bit faster when ArrayState is forced to allocate via std::malloc instead of vsg::Allocator as it handles freeing and then immediately reallocating same-sized regions more quickly.
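
One general C++ way to force a single class onto std::malloc is a class-specific operator new/delete pair, sketched below. PooledBase stands in for a base class that routes allocation through a custom allocator, and both type names are hypothetical. This is not necessarily how the experiment above was done.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Stand-in for a base class whose operator new goes through a custom
// allocator; the counter makes the effect of the override observable.
struct PooledBase
{
    static inline int customAllocations = 0;

    virtual ~PooledBase() = default;

    static void* operator new(std::size_t size)
    {
        ++customAllocations;
        return ::operator new(size);
    }
    static void operator delete(void* ptr) { ::operator delete(ptr); }
};

// The derived class replaces both functions, so its instances bypass the
// base-class allocator entirely.
struct MallocBacked : PooledBase
{
    static void* operator new(std::size_t size)
    {
        assert(size > 0);
        if (void* ptr = std::malloc(size)) return ptr;
        throw std::bad_alloc();
    }
    static void operator delete(void* ptr) { std::free(ptr); }
};
```

Because the destructor is virtual, deleting a MallocBacked object through a PooledBase pointer still selects MallocBacked::operator delete, so the memory goes back to std::free rather than to the base allocator.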
