Performance: Modify bounds detection to inheritance when clipping is disabled (#365)

This PR changes the CoreNode update behavior, including how bounds detection works. Instead of creating local strict and preload boundaries on every node, the boundaries are inherited from the parent unless clipping is enabled.

# Why?

Bounds detection added significant CPU load to the rendering loop of the L3 Renderer. Low-end devices struggled to keep up: the CPU was overloaded and could not provide new render instructions to the GPU in time.

Bounds detection is needed to ensure we only draw what is required on screen (the viewport) and do not render nodes that are outside of the viewport's bounds.

# What changed?

Previously every node would, on every `update()`, calculate what its `strictBound` and `preloadBound` were based on its world position. This is quite expensive and most of the time not needed unless clipping is enabled on a particular node.

This changes:
- `strictBound` and `preloadBound` are inherited from the parent; if the parent has no bounds, the viewport stage bounds are used.
- If a node enables clipping, only that node calculates its own `strictBound` and `preloadBound` for its own children
  - This is calculated once for all children
- Don't process anything that's not needed when out of bounds

Unrelated to bounds, but performance changes that I ran into:
- Only run through clipping if clipping is enabled; previously clipping was executed on every global transform (== expensive)
- Minor conditional checks and small performance gains

# Test results

Prior to the PR I ran two tests: a `CoreNode.update()` throughput test (using #364) and a stress benchmark with bounds for FPS measurements.
Tested on a Ryzen 7 6800H / 3070 Windows 11 machine using Chrome Version 128.0.6613.113 (Official Build) (64-bit) with **20x slowdown**.

## Baseline

**FPS**

- Average FPS: 30.33
- Median FPS: 31
- P01 FPS: 20
- P05 FPS: 25
- P25 FPS: 29
- Std Dev FPS: 2.8001964216818784
- Num samples: 100

**Throughput**

```
┌───┬───────────┬─────────┬───────────────────┬────────┬─────────┐
│   │ Task Name │ ops/sec │ Average Time (ns) │ Margin │ Samples │
├───┼───────────┼─────────┼───────────────────┼────────┼─────────┤
│ 0 │ update    │ 126     │ 7899612.499999996 │ ±2.44% │ 64      │
└───┴───────────┴─────────┴───────────────────┴────────┴─────────┘
```

## These changes

**FPS**

- Average FPS: 40.73
- Median FPS: 41
- P01 FPS: 33
- P05 FPS: 35
- P25 FPS: 39
- Std Dev FPS: 3.3700296734598636
- Num samples: 100

**Throughput**

```
┌───┬───────────┬───────────┬───────────────────┬────────┬─────────┐
│   │ Task Name │ ops/sec   │ Average Time (ns) │ Margin │ Samples │
├───┼───────────┼───────────┼───────────────────┼────────┼─────────┤
│ 0 │ update    │ 4,295,328 │ 232.8110762198926 │ ±1.28% │ 2147665 │
└───┴───────────┴───────────┴───────────────────┴────────┴─────────┘
```

That is a gain of about 10 FPS at 20x slowdown, with `update` throughput going from 126 to roughly 4.3 million ops/sec.
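
For readers who want a concrete picture of the inheritance rule described under "What changed?", here is a minimal TypeScript sketch. It is not the renderer's code: `SketchNode`, `Bound`, `expand`, `intersects`, the `preloadMargin` parameter, and the choice to use a clipping node's own rectangle as the strict bound for its children are all assumptions made for illustration.

```typescript
// Illustrative sketch only: every name here is hypothetical, not the renderer's API.
interface Bound { x1: number; y1: number; x2: number; y2: number }

type RenderState = "inBounds" | "inPreload" | "outOfBounds";

// Grow a bound by `margin` on every side (used for the preload area).
function expand(b: Bound, margin: number): Bound {
  return { x1: b.x1 - margin, y1: b.y1 - margin, x2: b.x2 + margin, y2: b.y2 + margin };
}

// Axis-aligned overlap test between two bounds.
function intersects(a: Bound, b: Bound): boolean {
  return a.x1 < b.x2 && a.x2 > b.x1 && a.y1 < b.y2 && a.y2 > b.y1;
}

class SketchNode {
  rect: Bound;            // world-space rectangle, assumed already resolved
  clipping: boolean;
  children: SketchNode[] = [];
  renderState: RenderState = "outOfBounds";
  boundsComputed = 0;     // counts how often this node built its own bounds

  constructor(rect: Bound, clipping = false) {
    this.rect = rect;
    this.clipping = clipping;
  }

  add(child: SketchNode): SketchNode {
    this.children.push(child);
    return child;
  }

  // `strict` and `preload` are inherited from the parent
  // (the stage bounds when called on the root).
  update(strict: Bound, preload: Bound, preloadMargin: number): void {
    if (intersects(this.rect, strict)) this.renderState = "inBounds";
    else if (intersects(this.rect, preload)) this.renderState = "inPreload";
    else this.renderState = "outOfBounds";

    // By default children simply reuse the inherited bounds: no per-node work.
    let childStrict = strict;
    let childPreload = preload;
    if (this.clipping) {
      // Assumption: a clipping node bounds its children by its own rectangle.
      // Computed once here, then shared by all children.
      childStrict = this.rect;
      childPreload = expand(this.rect, preloadMargin);
      this.boundsComputed++;
    }
    for (const child of this.children) {
      child.update(childStrict, childPreload, preloadMargin);
    }
  }
}

// Usage: a 1920x1080 stage with one clipping container and one plain node.
const stage: Bound = { x1: 0, y1: 0, x2: 1920, y2: 1080 };
const root = new SketchNode(stage);
const list = root.add(new SketchNode({ x1: 100, y1: 100, x2: 500, y2: 300 }, true));
const visible = list.add(new SketchNode({ x1: 120, y1: 120, x2: 220, y2: 220 }));
const nearby = list.add(new SketchNode({ x1: 520, y1: 120, x2: 620, y2: 220 }));
const far = list.add(new SketchNode({ x1: 1500, y1: 120, x2: 1600, y2: 220 }));
const plain = root.add(new SketchNode({ x1: 1500, y1: 120, x2: 1600, y2: 220 }));

root.update(stage, expand(stage, 100), 100);
```

In this sketch `far` and `plain` share the same rectangle, yet `far` is out of bounds because it inherits the bounds of its clipping parent `list`, while `plain` inherits the stage bounds and is in bounds. Only `list` ever builds bounds of its own, which is the saving the PR describes.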