SSR: Added documentation
MikhailGorobets committed Jan 22, 2024
1 parent 7147865 commit dd5095d
Showing 11 changed files with 148 additions and 0 deletions.
148 changes: 148 additions & 0 deletions PostProcess/ScreenSpaceReflection/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,148 @@
# Screen Space Reflection
![alt text](media/ssr-logo.jpg "A screenshot showing the final composite of the SSR reflections into a scene.")


## Table of contents
- [Introduction](#introduction)
  - [Motivation](#motivation)
  - [Supported platforms](#supported-platforms)
- [Integration guidelines](#integration-guidelines)
  - [Input resources](#input-resources)
  - [Normal buffer configurations](#normal-buffer-configurations)
  - [Depth buffer configurations](#depth-buffer-configurations)
  - [Providing motion vectors](#providing-motion-vectors)
  - [Host API](#host-api)
- [Implementation details](#implementation-details)
  - [Algorithm structure](#algorithm-structure)
  - [Preparation for ray tracing](#preparation-for-ray-tracing)
    - [Blue noise texture generation](#blue-noise-texture-generation)
    - [Hierarchical depth generation](#hierarchical-depth-generation)
    - [Stencil mask generation and roughness extraction](#stencil-mask-generation-and-roughness-extraction)
  - [Ray tracing](#ray-tracing)
  - [Denoising](#denoising)
    - [Spatial reconstruction](#spatial-reconstruction)
    - [Temporal accumulation](#temporal-accumulation)
    - [Cross-bilateral filtering](#cross-bilateral-filtering)
- [Possible improvements](#possible-improvements)
- [References](#references)

## Introduction
### Motivation
We needed to add screen space reflection to our project with the following requirements:
- Compatibility with WebGL
- Support for rough surfaces
- Execution time not exceeding 2 ms at Full HD resolution on GPUs equivalent to an RTX 2070

We used AMD's implementation of Screen Space Reflection as the basis for our implementation **[AMD-SSSR]**.
We strongly recommend reading the AMD documentation, as well as the more detailed review of the algorithm by Kostas Anagnostou **[Kostas Anagnostou, SSSR]**, to understand the text that follows.
Unfortunately, WebGL doesn't support compute shaders, so we had to make some compromises for compatibility. Please refer to the [implementation details](#implementation-details) section for further insights.

### Supported platforms
- Windows 10+
  * Vulkan® 1.x
  * DirectX® 11
  * DirectX® 12
  * OpenGL®
- Emscripten
  * WebGL®
- Android 14.x
  * Vulkan® 1.x

## Integration guidelines

## Implementation details

### Algorithm structure
The algorithm can be divided into three main stages:
1. [Resource Preparation](#preparation-for-ray-tracing): Preparation of resources for the ray tracing step.
   - [Blue noise texture generation](#blue-noise-texture-generation)
   - [Hierarchical depth generation](#hierarchical-depth-generation)
   - [Stencil mask generation and roughness extraction](#stencil-mask-generation-and-roughness-extraction)
2. [Ray Tracing](#ray-tracing): The actual ray tracing stage.
3. [Denoising](#denoising): Denoising the image obtained during the ray tracing stage.
   - [Spatial reconstruction](#spatial-reconstruction)
   - [Temporal accumulation](#temporal-accumulation)
   - [Cross-bilateral filtering](#cross-bilateral-filtering)

#### Blue noise texture generation

Blue noise sampling |
:----------------------------:|
![](media/ssr-blue-noise.png) |


[**ComputeBlueNoiseTexture.fx**](https://github.com/DiligentGraphics/DiligentFX/blob/master/Shaders/Common/private/ComputeBlueNoiseTexture.fx)

#### Hierarchical depth generation
The hierarchical depth buffer is a mip chain in which each pixel of a mip level is the minimum (maximum for reversed depth) of the depths in the corresponding 2×2 area of the previous level (mip 0 corresponds to the original, screen-sized depth buffer). It is used later to speed up ray marching, and can also be used in many other techniques, such as GPU occlusion culling.

Depth mip chain |
:-----------------------------------:|
![](media/ssr-hierachical-depth.jpg) |

We recommend reading the article **[Mike Turitzin, Hi-Z]**, because computing a hierarchical depth buffer for resolutions that are not divisible by 2 is not trivial. The original AMD algorithm uses SPD **[AMD-SPD]** to downsample the depth buffer, which allows the whole mip chain to be computed in a single **Dispatch** call. Since we can't use compute shaders, we use a more straightforward approach.
We calculate each mip level with a pixel shader [**ComputeHierarchicalDepthBuffer.fx**](https://github.com/DiligentGraphics/DiligentFX/blob/master/Shaders/PostProcess/ScreenSpaceReflection/private/ComputeHierarchicalDepthBuffer.fx), using the previous mip level as input.
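
As a CPU-side illustration of the per-level reduction described above (not the shader code itself), the following Python sketch builds the depth mip chain one level at a time, with each level reading only the previous one. The names `downsample_depth` and `build_depth_mip_chain` are hypothetical. The handling of odd dimensions (widening the footprint to three texels so that the last row or column is not dropped) follows the idea discussed in **[Mike Turitzin, Hi-Z]**; that the actual shader behaves the same way is an assumption.

```python
def downsample_depth(prev, reduce=min):
    """Compute the next mip level from `prev`, a list of rows of depth values.

    `reduce` is `min` for a regular depth buffer; pass `max` for reversed depth.
    """
    prev_h, prev_w = len(prev), len(prev[0])
    w, h = max(prev_w // 2, 1), max(prev_h // 2, 1)
    # For an odd source dimension, each output texel covers three source
    # texels along that axis so that no source texel is left out.
    extra_x = 1 if prev_w > 1 and prev_w % 2 == 1 else 0
    extra_y = 1 if prev_h > 1 and prev_h % 2 == 1 else 0

    level = []
    for y in range(h):
        row = []
        for x in range(w):
            x0, y0 = 2 * x, 2 * y
            x1 = min(x0 + 1 + extra_x, prev_w - 1)  # clamp to the texture edge
            y1 = min(y0 + 1 + extra_y, prev_h - 1)
            row.append(reduce(prev[yy][xx]
                              for yy in range(y0, y1 + 1)
                              for xx in range(x0, x1 + 1)))
        level.append(row)
    return level


def build_depth_mip_chain(depth, reduce=min):
    """Return [mip 0, mip 1, ...] down to a 1x1 level."""
    chain = [depth]
    while len(chain[-1]) > 1 or len(chain[-1][0]) > 1:
        # One "pass" per level: the input is always the previous mip level.
        chain.append(downsample_depth(chain[-1], reduce))
    return chain
```

Each loop iteration corresponds to one pixel shader pass in the approach above, which is why the number of passes grows with the number of mip levels instead of being a single dispatch.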

#### Stencil mask generation and roughness extraction
The original algorithm starts with a classification pass (**ClassifyTiles**). This step writes the pixels that will participate in the [ray tracing step](#ray-tracing) and the subsequent [denoising stages](#denoising) to a global buffer (our denoiser differs from the AMD implementation, but the underlying idea remains the same).
The decision of whether a pixel needs a ray is based on its roughness: very rough surfaces don't get any rays and instead rely on the prefiltered environment map as an approximation.
Once this is done, we are (almost) ready to ray march. The only problem is that the CPU doesn't know the size of the global array of pixels to trace, which it would need to launch a Dispatch. For that reason, the technique fills a buffer with indirect arguments from data already known to the GPU and uses `DispatchIndirect` instead. The indirect arguments buffer is populated during the **PrepareIndirectArgs** pass, which adds two entries to the indirect buffer: one for the pixels to trace and one for the tiles to denoise later.
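
To make the idea concrete, here is a rough CPU-side sketch in Python of what classification followed by indirect-argument preparation amounts to. It is not AMD's code: the function names, the tile size of 8 and the group size of 64 are hypothetical values chosen for illustration only.

```python
import math


def classify(roughness, threshold, tile=8):
    """Collect the pixels that need a ray and the tiles that need denoising.

    `roughness` is a list of rows; `tile` is a hypothetical tile size.
    """
    rays, tiles = [], set()
    for y, row in enumerate(roughness):
        for x, r in enumerate(row):
            if r < threshold:  # smooth enough to receive a ray
                rays.append((x, y))
                tiles.add((x // tile, y // tile))
    return rays, sorted(tiles)


def prepare_indirect_args(ray_count, tile_count, group_size=64):
    """Build the two (x, y, z) group counts an indirect dispatch would read.

    On the GPU these counts never travel back to the CPU, which is the
    reason for using an indirect dispatch in the first place.
    """
    trace_args = (math.ceil(ray_count / group_size), 1, 1)
    denoise_args = (tile_count, 1, 1)  # assumes one group per tile
    return [trace_args, denoise_args]
```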

Unfortunately, as mentioned earlier, we cannot use compute shaders or indirect dispatch, so we had to make compromises. We use a stencil mask to mark the pixels that should participate in subsequent calculations. To do this, we first clear all pixels in the stencil buffer to `0x00`, and then run [**ComputeStencilMaskAndExtractRoughness.fx**](https://github.com/DiligentGraphics/DiligentFX/blob/master/Shaders/PostProcess/ScreenSpaceReflection/private/ComputeStencilMaskAndExtractRoughness.fx) with the stencil test enabled for writing. If the roughness of the current pixel is less than `RoughnessThreshold`, we write the value `0xFF` to the stencil buffer; otherwise, the stencil buffer retains its previous value of `0x00`. In subsequent steps, we enable the stencil test for reading with the `COMPARISON_FUNC_EQUAL` function and the reference value `0xFF`. While writing the stencil mask, we also write the roughness to a separate render target. The separate texture simplifies the roughness sampling code in subsequent steps of the algorithm and, in theory, should slightly improve performance.

Stencil Mask for SSR         | Final rendered image with SSR
:---------------------------:|:-------------------------:
![](media/ssr-stencil-0.jpg) | ![](media/ssr-stencil-1.jpg)
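
The masking logic can be emulated on the CPU with a few lines of Python. This is only a sketch of the behaviour described above, not the shader: the function names are hypothetical, and writing the roughness for every pixel (rather than only for the pixels that pass the threshold) is an assumption.

```python
STENCIL_CLEAR = 0x00
STENCIL_MARKED = 0xFF


def compute_stencil_mask_and_roughness(material_roughness, roughness_threshold):
    """Return (stencil, roughness_target) for a 2D roughness input."""
    height, width = len(material_roughness), len(material_roughness[0])
    # Step 1: clear the stencil buffer.
    stencil = [[STENCIL_CLEAR] * width for _ in range(height)]
    roughness_target = [[0.0] * width for _ in range(height)]
    # Step 2: mark pixels that are smooth enough and copy out the roughness.
    for y in range(height):
        for x in range(width):
            r = material_roughness[y][x]
            roughness_target[y][x] = r  # assumption: written for every pixel
            if r < roughness_threshold:
                stencil[y][x] = STENCIL_MARKED
    return stencil, roughness_target


def run_masked_pass(stencil, shade):
    """Emulate a later pass with the stencil test set to EQUAL 0xFF.

    `shade(x, y)` runs only where the mask is set; other pixels are skipped
    (represented here as None).
    """
    return [[shade(x, y) if s == STENCIL_MARKED else None
             for x, s in enumerate(row)]
            for y, row in enumerate(stencil)]
```

Compared with a compacted pixel list, the mask keeps the full-screen draw but lets the hardware stencil test reject the pixels that do not need a ray.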


### Ray tracing

Specular part of the rendering equation   |
:----------------------------------------:|
![](media/ssr-rendering-equation.jpg) |


Split sum approximation |
:----------------------------------------:|
![](media/ssr-split-sum-approximation.jpg)|


Invalid ray |
:----------------------------------------:|
![](media/ssr-unoccluded.png) |


Hierarchical depth buffer traversal       |
:----------------------------------------:|
![](media/ssr-hierarchical-traversal.gif) |

### Denoising
#### Spatial reconstruction
#### Temporal accumulation
#### Cross-bilateral filtering
## Possible improvements
* Add support for reversed depth buffers
* Add support for compressed normal maps
* Use the previous frame as input radiance to the [ray tracing stage](#ray-tracing). This will add multiple-bounce reflections and also simplify the render architecture
* Add dynamic resolution for the ray tracing stage, which will increase performance on weaker GPUs
* The [spatial reconstruction step](#spatial-reconstruction) accumulates samples in screen space. Try to perform the accumulation in world coordinates, which should reduce bias
* We can also try calculating direct specular occlusion in the [ray tracing step](#ray-tracing)
* The bilateral filter is not separable, so it performs poorly with large kernel sizes. Consider replacing it with [Guided Image Filtering](https://kaiminghe.github.io/eccv10/index.html), since that algorithm has this property
* The current implementation of hierarchical ray marching has a [problem](https://youtu.be/MlTohmB4Gh4?t=762) that still needs to be fixed

## References

- **[AMD-SSSR]**: FidelityFX Stochastic Screen-Space Reflections 1.4 - https://gpuopen.com/manuals/fidelityfx_sdk/fidelityfx_sdk-page_techniques_stochastic-screen-space-reflections/
- **[AMD-SPD]**: FidelityFX Single Pass Downsampler - https://gpuopen.com/manuals/fidelityfx_sdk/fidelityfx_sdk-page_techniques_single-pass-downsampler/
- **[EA-SSRR]**: Frostbite presentations on Stochastic Screen Space Reflections - https://www.ea.com/frostbite/news/stochastic-screen-space-reflections
- **[EA-HYRTR]**: EA Seed presentation on Hybrid Real-Time Rendering - https://www.ea.com/seed/news/seed-dd18-presentation-slides-raytracing
- **[Eric Heitz, VNDF]**: Eric Heitz' paper on VNDF - http://jcgt.org/published/0007/04/01/
- **[Eric Heitz, Blue Noise]**: Eric Heitz' paper on Blue Noise sampling - https://eheitzresearch.wordpress.com/762-2/
- **[Kostas Anagnostou, SSSR]**: Notes on Screen-Space Reflections with FidelityFX SSSR - https://interplayoflight.wordpress.com/2022/09/28/notes-on-screenspace-reflections-with-fidelityfx-sssr/
- **[Thorsten Thormählen, IBL]**: Graphics Programming Image-based Lighting - https://www.mathematik.uni-marburg.de/~thormae/lectures/graphics1/graphics_10_2_eng_web.html#1
- **[Mike Turitzin, Hi-Z]**: Hierarchical Depth Buffers - https://miketuritzin.com/post/hierarchical-depth-buffers/