Cesium3DTilesetStatistics improvements #12974
Thank you for the pull request, @Beilinson! ✅ We can confirm we have a CLA on file for you. |
I can fix the failing tests, but only after confirmation that the updated visitation numbers are desired. Correctness is based on what visited means:
|
@Beilinson nice performance improvement! Just a drive by comment, I haven't reviewed the code
I would stay with option 1 - a tile is visited when it is added to the traversal queue. |
Thanks @lilleyse!
That's fair. I worry this is a minor misdirection, as the count is much smaller than the true number of updated tiles. Someone attempting to improve the performance of either traversal would be wrong to base their performance assumptions on it.
I found this a little hard to actually reproduce in the devtools profiler. Eventually I made this small sandcastle which automates starting and stopping the profiler around running a camera sequence.
I do think this is an improvement but it's also not one of the biggest performance impacts from what I can see. I often see the time way lower than shown in the initial post on my machine.
That's fair. I just wanted to clarify that I tested this with a 20x CPU slowdown to simulate "high resolution" profiling, since the regular performance sample rate can easily miss something like this.
Given that the Usually I cannot really profoundly justify why I didn't use a (The fact that there are probably hundreds of other places that could be changed from |
I actually did some local testing by converting as many |
The performance is one thing. When you talk about a Proxy, then I wonder: If this is a real However, I assume that the performance would generally not become worse. That remains to be confirmed, but from what I've heard, a So regardless of the performance, I think that the clarity of the code is really important. First of all, I don't like the untypedness of JavaScript to begin with. The fact that you can just BTW:
It was exactly two weeks ago (October 1st ) when I asked in our internal chat...
The bottom line: |
Also a drive by comment: I don't think at this point there's reason not to move to For Though there's no reason to prevent this PR from going in with just this one |
And I believe doing so also can have performance implications if "dictionary mode" is triggered. |
@Beilinson I've had a long-running note to myself to get rid of As @ggetz mentioned, we do have to communicate any change clearly, since it's public. Maybe replacing |
Happy to get conversation (re)started about this! Is this PR ready to merge then? |
- result.texturesReferenceCounterById = {
-   ...statistics.texturesReferenceCounterById,
- };
+ result.texturesReferenceCounterById = statistics.texturesReferenceCounterById;
It seems GitHub ate this comment and didn't post it with my original review. This is my main remaining concern with this PR: it's no longer a full clone.
What do you think about cloning the map here instead of just passing it by reference? In some limited testing it seemed like it was still a big improvement over the spread `...` but also stays more true to the meaning of `clone` by actually cloning?
- result.texturesReferenceCounterById = statistics.texturesReferenceCounterById;
+ result.texturesReferenceCounterById = new Map(statistics.texturesReferenceCounterById);
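To make the difference between the two lines of the suggestion concrete, here is a minimal sketch. The `statistics` object and the texture ids are hypothetical stand-ins, not the real `Cesium3DTilesetStatistics`; only the behavior of `Map` itself is shown.

```javascript
// Hypothetical stand-in for the statistics object; only the map matters here.
const statistics = {
  texturesReferenceCounterById: new Map([
    ["texture-a", 2],
    ["texture-b", 1],
  ]),
};

// Option A: pass by reference. Both objects share one Map instance.
const byReference = {
  texturesReferenceCounterById: statistics.texturesReferenceCounterById,
};

// Option B: copy the entries into a new Map. The copy is one level deep,
// which is a full copy here because keys are strings and values are numbers.
const byCopy = {
  texturesReferenceCounterById: new Map(
    statistics.texturesReferenceCounterById,
  ),
};

// Mutate the original after "cloning".
statistics.texturesReferenceCounterById.set("texture-a", 3);
statistics.texturesReferenceCounterById.delete("texture-b");

console.log(byReference.texturesReferenceCounterById.get("texture-a")); // 3
console.log(byCopy.texturesReferenceCounterById.get("texture-a")); // 2
console.log(byCopy.texturesReferenceCounterById.has("texture-b")); // true
```

With the reference, later changes to the source map show through in the "clone"; with `new Map(...)` the copy stays frozen at the moment of cloning, at the cost of iterating every entry.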
I just compared all three options using your sandbox and `CPUPROFILE_FREQUENCY=1000`; the results:
Clone with new Map (this suggestion):
No deep clone (my version):
Don't have anything to show here because it doesn't show up on the performance sample
This change makes it no longer a "deep clone"; it is still a "shallow clone", and I don't think the deep behavior is needed for a completely internal part of this class that has zero effect on any part of the external runtime or on the end result when comparing statistics before/after rendering. IMO, even if it's only 4% of total CPU time, that's still 4% that adds no value to the outcome of the statistics counting.
This element seems to be used only by the new version (the clone) and never by the old object; the actual `texturesByteLength`, which this attribute is used to update, is copied over.
For context, at 4% this unnecessary clone takes more time than `Matrix4.multiplyTransformation` or binding the vertices to WebGL:

4% of render time isn't critical, but I think that making many of these small percentage improvements will add up to a big long-term difference.
Thoughts?
@jjspace You mentioned the cloning aspect in the now-closed PR at #12968 (comment) (but that referred to the credits, so I'm not sure if this is what you meant...?)
I'm also in strong favor of avoiding surprises: `clone` should clone (no shallow copy with unpredictable side-effects).
But... I had a short look at where `Cesium3DTilesetStatistics.clone` is actually called and how it is used.
One call is in `update` (every call is in some `update` 😆 ). It's not really documented what's happening there. But it seems to fill some `tileset._statisticsPerPass` array. This array, in turn, does not seem to be used anywhere.
The other call is in `raiseLoadProgressEvent`. There, it fills some `tileset._statisticsLast`. But after that, it only seems to care about the `numberOfPendingRequests` and `numberOfTilesProcessing` there (and certainly not about that map with the texture IDs).
Sooo... unless I'm overlooking something, the places that are using that `clone` function are doing a lot of unnecessary work to begin with, and on top of that, none of them seems to depend on that `Map`. I'm sure there is some room for improvement.
One point that might be relevant for real performance comparisons: What's in that `Map` after all? The performance will to some extent depend on the size of that map. On the other hand: When there are 10 textures, then the map is small, and when there are 1000 textures, then there is other costly stuff - so I'd expect this to not be a bottleneck either way.
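One way to check how the cost scales with the size of the map is a rough micro-benchmark like the sketch below. Everything in it is an assumption for illustration: the helper names `makeEntries` and `time`, the entry count of 500, and the iteration count. It runs outside CesiumJS, so the numbers only hint at relative cost and are no substitute for a real profile.

```javascript
// Build hypothetical [string, number] pairs standing in for texture ids
// and their reference counts.
function makeEntries(count) {
  const entries = [];
  for (let i = 0; i < count; i++) {
    entries.push([`texture-${i}`, 1 + (i % 3)]);
  }
  return entries;
}

// Run `fn` `iterations` times; return elapsed milliseconds and the last result.
function time(fn, iterations) {
  const start = performance.now();
  let last;
  for (let i = 0; i < iterations; i++) {
    last = fn();
  }
  return { ms: performance.now() - start, last };
}

const entries = makeEntries(500);
const asObject = Object.fromEntries(entries);
const asMap = new Map(entries);
const iterations = 2000;

const results = {
  // Plain object copied with the spread operator (the original approach).
  objectSpread: time(() => ({ ...asObject }), iterations),
  // Map copied entry by entry.
  newMap: time(() => new Map(asMap), iterations),
  // Map shared by reference (no copy at all).
  byReference: time(() => asMap, iterations),
};

for (const [name, { ms }] of Object.entries(results)) {
  console.log(`${name}: ${ms.toFixed(2)} ms for ${iterations} clones`);
}
```

Changing the argument to `makeEntries` shows how each strategy reacts to map size; the by-reference variant should stay flat because it never touches the entries.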
@javagl The map gets pretty big (hundreds of entries) of `[string, number]` pairs.
I checked the instances of the statistics cloning and saw the same as you reported. Either way, the clone is used more as a "Let's get a snapshot of what the statistics currently look like", wherein the map is not useful.
`_statisticsPerPass` is used in the specs, and also in `Cesium3DTilesInspectorViewModel.getStatistics`. Here all the internal properties are needed except the `Map` again, because these act as snapshots of what the statistics looked like while processing the pick/render/whatever pass, and aren't actively used for calculating anything.
Ideas:
- Make this a separate function (`.snapshot`, or open to other suggestions) that only passes around these values without copying the map. It would return an object that doesn't have the functions to increment/decrement the values, representing that it's a static snapshot object that shouldn't be worked on directly.
- Explicitly copy over only what is needed, so `_statisticsLast` would become an object containing only `numberOfPendingRequests` and `numberOfTilesProcessing`, and the passes would copy over everything except the map.

I think the snapshot model is cleaner.
Thoughts, @javagl @jjspace?
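A rough sketch of the first idea, under stated assumptions: `SnapshotStatisticsSketch` is a hypothetical, heavily reduced stand-in for the statistics class, and the `snapshot` method and its field list are illustrative only, not what was pushed to the PR.

```javascript
// Hypothetical, heavily reduced stand-in for the statistics class.
class SnapshotStatisticsSketch {
  constructor() {
    this.visited = 0;
    this.numberOfPendingRequests = 0;
    this.numberOfTilesProcessing = 0;
    this.texturesByteLength = 0;
    this.texturesReferenceCounterById = new Map();
  }

  incrementVisited() {
    this.visited++;
  }

  // Hypothetical `snapshot`: copies the plain counters, leaves the map out,
  // and returns a frozen container without increment/decrement methods.
  snapshot() {
    return Object.freeze({
      visited: this.visited,
      numberOfPendingRequests: this.numberOfPendingRequests,
      numberOfTilesProcessing: this.numberOfTilesProcessing,
      texturesByteLength: this.texturesByteLength,
    });
  }
}

const stats = new SnapshotStatisticsSketch();
stats.numberOfPendingRequests = 4;
stats.texturesReferenceCounterById.set("texture-a", 1);

const last = stats.snapshot();
stats.numberOfPendingRequests = 0; // later frames keep updating the live object

console.log(last.numberOfPendingRequests); // 4
console.log("texturesReferenceCounterById" in last); // false
console.log(typeof last.incrementVisited); // undefined
console.log(Object.isFrozen(last)); // true
```

Because the snapshot is a different shape from the live statistics object, code that tries to treat it as one fails visibly instead of silently sharing state.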
I pushed an example of what the snapshot would look like. Note that the passes and last statistics are no longer full statistics objects, so they don't contain functions to increment/decrement counts or any other logic; they are just container objects. They also don't contain the map.
I don't have a strong opinion, but a few random comments for now:
- When something is only used in tests, it's a hint that it may be removed
- I didn't have the `Cesium3DTilesInspectorViewModel` on the radar (usually only searching in `./packages/engine/Source`...)
- The whole structure and role of the statistics raises a bunch of questions
  - Top-level ones: Which of them are purely informative (to be shown in a UI), and which of them are crucially influencing the behavior of the whole application? I think that the `_statisticsPerPass` could/should carry a comment saying that it's only for the UI. And for me, the inlined "block comments", `// Rendering statistics`, `// Loading statistics`... etc. are screaming that there should be `RenderingStatistics` and `LoadingStatistics` sub-structures.

The 'snapshot' approach looks conceptually clean to me, but implies new structures (first and foremost that new type, `...Snapshot`) and several changes that may warrant some explaining (maybe even just `/** A snapshot is a shallow copy ... */` or so).
If this was only about the `raiseLoadProgressEvent`, I could imagine that storing both relevant fields explicitly, as `tileset._lastNumberOfPendingRequests` and `_lastNumberOfTilesProcessing`, could be an option as well. (Yes, this "litters" the tileset even more, but ... now there's that `statisticsLast` which carries mostly unused stuff, so both are not ideal anyhow...)
I agree about `statisticsLast` (although future logic may want to use the other attributes).
But yes, it's used both for specs (which in my opinion would preferably be hidden behind a debug flag) and for the inspector view to debug either the pick or render pass (from what I saw, nearly every member other than the map is used).
I would prefer not to stretch the structural changes in this PR much more than I have already, but let me know what you think is best: just commenting the areas for future reference, or something different.
Again, no strong opinion (I assume others will chime in here), but
- Change to `Map`: Good
- Shallow copy instead of clone? Confusing
- The changes from the last commit ("snapshot") go a bit too far and raise too many questions for me

A compromise:
Iff the performance benefit is really worth it, maybe just renaming the `clone` to `shallowCopy` could be reasonable? (It's an undocumented/internal function after all.. )
I'm fine with that; `shallowClone` plus a comment explaining that the clones should not be worked on directly sounds good to me.
I could also set the map to `undefined` so that if someone does try to use one of the clones in the future, it should error out.
Thoughts, @jjspace and/or others?
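A minimal sketch of that compromise, assuming a hypothetical reduced class `ShallowCloneStatisticsSketch` with a hypothetical `incrementTexture` method; the real class has more fields and different method names. It only illustrates how leaving the map `undefined` on the clone makes accidental use fail loudly.

```javascript
// Hypothetical reduced stand-in for the statistics class.
class ShallowCloneStatisticsSketch {
  constructor() {
    this.texturesByteLength = 0;
    this.texturesReferenceCounterById = new Map();
  }

  // Hypothetical mutator: counts references and adds the byte length once.
  incrementTexture(id, byteLength) {
    const count = this.texturesReferenceCounterById.get(id) ?? 0;
    if (count === 0) {
      this.texturesByteLength += byteLength;
    }
    this.texturesReferenceCounterById.set(id, count + 1);
  }

  // Copies the counters only. The result is a read-only view of the
  // statistics and must not be worked on directly, so the map is left
  // undefined on purpose.
  static shallowClone(statistics, result = new ShallowCloneStatisticsSketch()) {
    result.texturesByteLength = statistics.texturesByteLength;
    result.texturesReferenceCounterById = undefined;
    return result;
  }
}

const original = new ShallowCloneStatisticsSketch();
original.incrementTexture("texture-a", 1024);

const copy = ShallowCloneStatisticsSketch.shallowClone(original);

let threwTypeError = false;
try {
  copy.incrementTexture("texture-b", 2048); // misuse of the clone
} catch (error) {
  threwTypeError = error instanceof TypeError;
}

console.log(copy.texturesByteLength); // 1024
console.log(threwTypeError); // true
```

The trade-off is that the failure surfaces as a generic `TypeError` at the point of misuse rather than a descriptive message, so the explanatory comment on the function still matters.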
We definitely got a little off track. All good discussion but it can be moved into that issue you just opened. I agree with @ggetz that there's no reason to hold up this specific switch to |
Description
What I did: loaded the basic `Google Photorealistic` sandcastle and profiled myself just moving around slowly.
Performance Profile:
![]()
I ordered by Self Time descending, and saw that `Cesium3DTilesetStatistics.clone` takes about 8% of total frame time.
Why this change:
The performance profile showed that the object spread was taking up the bulk of this function's time.
I audited the code and saw that while the statistics object is copied frame-to-frame, this property specifically is only ever used by the new statistics object, and passing it by reference kept all the same behavior in prod. Changing it to a `Map` is also a minor performance improvement.
Additionally, I noticed that the actual number of calls per frame to `Cesium3DTilesetTraversal.updateVisibility` was higher than the reported number of visited tiles, because some tiles are updated as part of a check by their parents and are never added to the tileset traversal queue, so they are never marked as visited. This is rectified by explicitly visiting the tile in `updateVisibility`.
Current Reported Visited:
After this change:
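The gap between the two counts can be illustrated with a toy traversal. This is a simplified model under stated assumptions, not CesiumJS's traversal: `makeTile`, `traverse`, and the tree shape are hypothetical, and the only point is that a parent's check can update children that never enter the queue.

```javascript
// Hypothetical tile: `refine` decides whether its children get queued.
function makeTile(name, refine, children = []) {
  return { name, refine, children };
}

// Toy depth-first traversal that tracks both definitions of "visited".
function traverse(root) {
  const counts = { queued: 0, updateVisibilityCalls: 0 };

  // Stand-in for the per-tile visibility update.
  const updateVisibility = () => {
    counts.updateVisibilityCalls++;
  };

  updateVisibility(root);
  const stack = [root];
  while (stack.length > 0) {
    const tile = stack.pop();
    counts.queued++; // "visited" = the tile was taken from the traversal queue

    for (const child of tile.children) {
      // The parent checks every child, even if it will not refine to it.
      updateVisibility(child);
      if (tile.refine) {
        stack.push(child);
      }
    }
  }
  return counts;
}

const root = makeTile("root", true, [
  makeTile("a", false, [makeTile("a1", true), makeTile("a2", true)]),
  makeTile("b", true),
]);

const counts = traverse(root);
console.log(counts.queued); // 3
console.log(counts.updateVisibilityCalls); // 5
```

In this model `a1` and `a2` are updated by their parent but never queued, which is the kind of difference between the two reported numbers described above.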
Testing plan
Compare inspector results between the main sandcastle and this branch: texture sizes should be the same, and the visited tile count should be higher.
main
local
Author checklist
CONTRIBUTORS.md
`CHANGES.md` with a short summary of my change