Sorting is difficult / impossible to control for particles with transparency #43

Open
mbalfour-amzn opened this issue Apr 18, 2023 · 5 comments

@mbalfour-amzn

Short version:
In the MultiplayerSample project, we've run into multiple particle sorting issues, and there doesn't seem to be a way to work around or fix them. This is a request to provide, at a minimum, a way to "unbatch" specific particle emitters so that sorting issues can be fixed when they arise.

Long version:
Atom sorts transparent draw calls first by sort key, and then by depth. By default, everything uses a sort key of 0. PopcornFX also currently hard-codes a sort key of 0 in CPopcornFXFeatureProcessor::Render(). In the case of MultiplayerSample, we need depth sorting since the particles appear in a level that has translucent walls everywhere, so this is OK. It would also be nice if sort key controls were exposed instead of hard-coded, but that wouldn't help in this situation anyway.
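
To make the two-level ordering concrete, here is a minimal Python sketch of sorting by sort key first and depth second. The `DrawItem` type, its fields, the ascending key order, and the back-to-front depth convention are assumptions for illustration, not Atom's actual data structures or code.

```python
from dataclasses import dataclass

@dataclass
class DrawItem:
    name: str
    sort_key: int   # assumption: lower keys draw first
    depth: float    # distance from the camera, in meters

def sort_transparent(items):
    # Primary: sort key ascending. Secondary: depth descending (back to front).
    return sorted(items, key=lambda d: (d.sort_key, -d.depth))

items = [
    DrawItem("window", 0, 10.0),
    DrawItem("particles_batch", 0, 25.0),
    DrawItem("overlay", 1, 50.0),
]
order = [d.name for d in sort_transparent(items)]
print(order)  # ['particles_batch', 'window', 'overlay']
```

With every item on sort key 0, only the single depth value per draw call decides the order, which is why the batching behavior described below matters.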

The problems show up due to particle batching. The particle effects we're using have multiple layers to them, and each layer has different-sized particles. For simplicity, let's just say there are 2 layers, one with a 1 meter particle and one with a 5 meter particle. We also reuse the effects multiple times around the level, so the same effect can draw hundreds of meters apart. These two factors are affecting the sorting in negative ways:

  1. The 2 layers of the particle effect draw in inconsistent orders relative to each other. Each layer is batched separately and contains a bounding box affected by the size of the particles, so even though both layers start with the same emitter centers, the bounding boxes become different sizes due to the size of the particles, which means that the centers can have different depths to them. This will change which layer sorts in front of the other layer when drawing, because even if the emitter centers create a 100 m bounding box, the bounding box of the two layers will be 101 m and 105 m, so they will have slightly different center points that will sort differently depending on camera location. The following video demonstrates the problem where the light blue swirl briefly sorts in front of the dark blue center depending on the overall locations of multiple particle systems relative to the camera.
Editor_AOtsrn9TFR.mp4
  2. A batched set of particle effects only has one depth value that's used to sort it vs the world. Consequently, all of the effects need to appear on the same side of a second transparent item in the world or else they will sort incorrectly. The following images demonstrate the problem. There's no way to make the two different effects sort with different depth values so that one appears in front of the window and the other appears behind it.

image
image
image
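
The second problem can be reduced to a one-dimensional sketch along the camera forward axis. The numbers and the `bbox_center_depth` helper are hypothetical; the sketch only assumes that a batch is sorted by the depth of its bounding box center, as described above.

```python
def bbox_center_depth(depths, radius):
    # 1D bounding box along the camera forward axis, inflated by particle radius.
    near = min(depths) - radius
    far = max(depths) + radius
    return (near + far) / 2.0

# Two instances of the same effect, batched together (depths in meters).
effect_depths = [5.0, 45.0]
window_depth = 20.0  # a transparent window between the two instances

batch_depth = bbox_center_depth(effect_depths, radius=1.0)
print(batch_depth)  # 25.0

# Back-to-front drawing: anything deeper than the window draws before it.
batch_drawn_before_window = batch_depth > window_depth
print(batch_drawn_before_window)  # True
```

The whole batch draws before the window, so the instance at 5 m, which is actually in front of the window, ends up drawn underneath it.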

@HugoPKFX

Hi, thanks for the detailed explanation

  1. This is usually fixed using the Camera Sort Offset of renderer nodes. If the O3DE plugin doesn't support it, that's a bug and we'll look into it. It works by offsetting draw call bboxes along the camera forward axis (the offset is in world units).

  2. This is the main issue with PopcornFX batching sim units. Internally PopcornFX does not simulate these two emitters separately (simplifying it to one layer in that effect for this example). Particles are evaluated in "waves" of 512/1024/.. particles depending on the platform (wave size can be specified per layer). Because of that, the memory layout of particle storage (positions/sizes/..) needed for rendering isn't location dependent but spawn-order dependent (positions of emitter A and B are stored in the same positions array, depending on particle spawn order).

We built two systems on top of this:

  • Localized pages: a realtime algorithm that splits waves depending on particles' world positions, producing smaller and more localized (as in world location) waves. This helps LOD (built-in particle LOD that reduces the execution frequency of pages) for environmental effects or effects with several instances across a world. This algorithm uses a heuristic for splitting waves, but it's not a fully satisfying solution.
  • Draw call slicing: draw call data is built (this depends on various parameters, but most of the time that is the case) by direct copies from the source particle data (positions, sizes, ..) into the destination GPU buffers. As the source data isn't stored in a per-emitter fashion, this causes these big bounding boxes. Draw call slicing sets a maximum number of slices per draw call (let's say 10). A draw call is then sliced up to 10 times along the view forward axis. This produces smaller draw calls. However, the bounding box isn't rebuilt for these slices (that would require re-walking the index buffer), so we submit the global draw call bounding box to the engine, but the PopcornFX SDK provides these draw calls already sorted back to front. We are investigating this in O3DE as a possible solution to this problem, possibly by offsetting the draw call bounding boxes somehow.

Problem (2) is PopcornFX's main sorting issue. Some titles have shipped with slicing enabled, and in most cases it helped, but it definitely isn't a bulletproof solution. There will always be cases where it won't work properly and the slicing heuristic needs to be tweaked to the game's specifics. We thought about slicing draw calls by emitter ID, but this can quickly become very expensive performance-wise, as two emitters close to each other with overlapping particles could generate thousands of draw calls. Some other titles using PopcornFX completely removed CPU sorting and used OIT (order-independent transparency). That can be a solution too, but it also has its share of limitations.

The other option is to rewrite the billboarding code so that, instead of slicing draw calls downstream, GPU buffers are filled per instance directly. This obviously comes at a cost: instead of doing a memcpy (for the most part) of the source particle data, we'd have to iterate over each particle to determine its target GPU buffer. Also, one set of GPU buffers would be required per emitter in the level (or this could be a global buffer with different offsets).
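
The trade-off between the spawn-order layout and a per-instance fill can be sketched in a few lines of Python. The particle records, emitter IDs, and `fill_per_emitter` helper are hypothetical and reduced to one axis; this is not PopcornFX SDK code.

```python
# Hypothetical particle records in spawn order:
# (emitter_id, position along the view axis in meters).
particles = [("A", 5.0), ("B", 45.0), ("A", 6.0), ("B", 44.0), ("A", 4.5)]

# Batched path: positions are copied as one contiguous block (memcpy-like),
# so the draw call bounds span every emitter in the batch.
batched_positions = [pos for _, pos in particles]
batched_bounds = (min(batched_positions), max(batched_positions))

# Per-emitter path: every particle is inspected to pick its destination buffer.
def fill_per_emitter(particles):
    buffers = {}
    for emitter_id, pos in particles:
        buffers.setdefault(emitter_id, []).append(pos)
    return buffers

per_emitter = fill_per_emitter(particles)
per_emitter_bounds = {e: (min(p), max(p)) for e, p in per_emitter.items()}

print(batched_bounds)      # (4.5, 45.0)
print(per_emitter_bounds)  # {'A': (4.5, 6.0), 'B': (44.0, 45.0)}
```

The per-emitter bounds are tight enough to sort each instance on its own, at the price of a per-particle loop and one buffer (or buffer range) per emitter.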

Other solutions to mitigate sorting issues:

  • A custom simulation interface could be called in the nodegraph, cull-testing a particle against all views from the game. This would greatly reduce bounding boxes in a level.
  • Emitters outside the view can be "SetVisible(false)", also reducing draw call bounding boxes

We'll let you know our findings with slicing in the O3DE plugin

@HugoPKFX

HugoPKFX commented May 24, 2023

Hi,

After investigating this issue, this is what I found:

As you can see in the following image, and setting aside PKFX not sorting correctly with O3DE transparent objects, the smaller sphere, which is technically behind, is drawn on top:
image

This is due to this effect using a "sort by custom value" with a key that is invalid:
image

If the effect is reverted to use CameraDistance sort instead, I suspect the artefacts seen in the video for (1) will be resolved. It is also possible to mitigate further by giving some of that effect's renderers a different CameraSortOffset value.
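
As a rough illustration of how a per-renderer sort offset can settle the ordering of two layers, here is a minimal Python sketch. The `sort_depth` helper, the layer centers, and the sign convention (a positive offset pushes the sort depth away from the camera) are assumptions, not the PopcornFX implementation.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sort_depth(center, camera_pos, camera_forward, sort_offset=0.0):
    # Depth of the bbox center along camera forward, plus a world-unit offset.
    rel = tuple(c - p for c, p in zip(center, camera_pos))
    return dot(rel, camera_forward) + sort_offset

camera_pos = (0.0, 0.0, 0.0)
camera_forward = (0.0, 1.0, 0.0)  # unit vector

# Hypothetical bbox centers of the two layers; they differ slightly.
core_center = (0.0, 50.25, 0.0)
swirl_center = (0.0, 49.5, 0.0)

def draw_order(swirl_offset):
    layers = {
        "core": sort_depth(core_center, camera_pos, camera_forward),
        "swirl": sort_depth(swirl_center, camera_pos, camera_forward, swirl_offset),
    }
    return sorted(layers, key=layers.get, reverse=True)  # back to front

print(draw_order(0.0))  # ['core', 'swirl']: the swirl draws last, on top
print(draw_order(2.0))  # ['swirl', 'core']: the core now draws last
```

An offset larger than the expected drift between the two bbox centers keeps the layer order stable regardless of camera position.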

About (2), PR #54 re-enables experimental slicing in the plugin, here are some initial results:
https://github.com/PopcornFX/O3DEPopcornFXPlugin/assets/15339931/3f4abd49-1020-4622-b042-1a1e450e2bc0

There's definitely some flickering, and the sorting result will differ depending on the camera angle. Once the PR is merged, it would be interesting to test this in the level and see if slicing helps. The slicing algorithm can also be customized in the Gem, so it's possible to try various approaches. One of the main problems is in the slicing code itself, which currently ignores game engine geometry:

  • Produce slices per draw batch
  • Sort all slices from all draw batches by the view distance of the bbox's center
  • Walk all slices, merge all contiguous ones from the same render batch to reduce final draw call count
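
The three steps above can be sketched as follows. This is a simplified Python illustration, not the plugin's code: the `Slice` type, the equal-width slicing, and the use of each slice's midpoint as its sort depth are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Slice:
    batch: str    # hypothetical draw batch identifier
    depth: float  # view distance used for sorting

def make_slices(batch, near, far, max_slices):
    # Step 1: cut [near, far] into equal slices along the view forward axis.
    step = (far - near) / max_slices
    return [Slice(batch, near + (i + 0.5) * step) for i in range(max_slices)]

def sort_and_merge(slices):
    # Step 2: sort all slices from all batches back to front.
    ordered = sorted(slices, key=lambda s: s.depth, reverse=True)
    # Step 3: merge contiguous slices that belong to the same batch.
    merged = []
    for s in ordered:
        if merged and merged[-1][0] == s.batch:
            merged[-1][1] += 1
        else:
            merged.append([s.batch, 1])
    return [(batch, count) for batch, count in merged]

slices = make_slices("A", 0.0, 40.0, 4) + make_slices("B", 20.0, 30.0, 2)
draw_calls = sort_and_merge(slices)
print(draw_calls)  # [('A', 1), ('B', 1), ('A', 1), ('B', 1), ('A', 2)]
```

Six slices collapse into five draw calls here. Batches that interleave in depth merge poorly, which is the draw call cost the merge pass is meant to limit.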

For slicing to work properly with all O3DE transparent geometry, the merge pass would need to take into account both PKFX slices and O3DE transparent object draw calls somehow (ideally before the draw packets are built, to reduce the overhead of rejected/merged slices).
We could also disable the draw call merge pass, but that would mean that each draw batch could produce up to p_PopcornFXMaxSlices sliced draw calls.

Food for thought, let's see the results of #54 once merged (by default, slicing is disabled).

@mbalfour-amzn

Wow, that's some great progress! Slicing controls definitely seem like they'll go a long way towards giving the designers some ability to tune performance vs sorting correctness. One other possible idea for slice controls would be exposing some sort of "manual slice bins" on each emitter, where by default they would auto-group together into one or more auto-binned slices, but if there are a few problematic emitters on the outskirts of the level or something, they could be manually placed into their own slice.

Definitely looking forward to seeing the results of this change! Are you also going to submit a change to the effect itself to correctly use CameraDistance, or is that something that someone else (me?) needs to do?

@HugoPKFX

HugoPKFX commented May 24, 2023

> Wow, that's some great progress! Slicing controls definitely seem like they'll go a long way towards giving the designers some ability to tune performance vs sorting correctness. One other possible idea for slice controls would be exposing some sort of "manual slice bins" on each emitter, where by default they would auto-group together into one or more auto-binned slices, but if there are a few problematic emitters on the outskirts of the level or something, they could be manually placed into their own slice.

Yes, the issue is that particle positions/sizes/colors/.. aren't written into GPU buffers based on their world locations but based on their spawn order. So we're basically left with slicing the index buffer, but it will be good to experiment with the slicing algorithm and expose additional CVARs that can help fix problematic situations.

In this specific example, the dark/marine-blue orb could very well use alpha-tested particles: they seem to be alpha blended with an alpha set to 1.

Just saw another artefact in the video:
image

Do the gems have a transparent material too?

> Definitely looking forward to seeing the results of this change! Are you also going to submit a change to the effect itself to correctly use CameraDistance, or is that something that someone else (me?) needs to do?

I'll take a look ASAP and push a fix for this effect to the MPS repo. I also saw another effect (I don't remember the name) that doesn't play where it is spawned but at the world origin (one of the effects when firing). I'll do a small cleanup pass by the end of the week.

@HugoPKFX

HugoPKFX commented May 25, 2023

You can find two PRs here:
o3de/o3de-multiplayersample#421
o3de/o3de-multiplayersample#422

Once these are merged (and #54) we'll take a look at the result of slicing in the level
