Commit dd5095d (1 parent: 7147865). Showing 11 changed files with 148 additions and 0 deletions.

# Screen Space Reflection

![](media/ssr-logo.jpg)

## Table of contents
- [Introduction](#introduction)
    - [Motivation](#motivation)
    - [Supported platforms](#supported-platforms)
- [Integration guidelines](#integration-guidelines)
    - [Input resources](#input-resources)
    - [Normal buffer configurations](#normal-buffer-configurations)
    - [Depth buffer configurations](#depth-buffer-configurations)
    - [Providing motion vectors](#providing-motion-vectors)
    - [Host API](#host-api)
- [Implementation details](#implementation-details)
    - [Algorithm structure](#algorithm-structure)
    - [Preparation for ray tracing](#preparation-for-ray-tracing)
        - [Blue noise texture generation](#blue-noise-texture-generation)
        - [Hierarchical depth generation](#hierarchical-depth-generation)
        - [Stencil mask generation and roughness extraction](#stencil-mask-generation-and-roughness-extraction)
    - [Ray tracing](#ray-tracing)
    - [Denoising](#denoising)
        - [Spatial reconstruction](#spatial-reconstruction)
        - [Temporal accumulation](#temporal-accumulation)
        - [Cross-bilateral filtering](#cross-bilateral-filtering)
- [Possible improvements](#possible-improvements)
- [References](#references)

## Introduction
### Motivation
We needed to add screen space reflections to our project with the following requirements:
- Compatibility with WebGL
- Support for rough surfaces
- Execution time of no more than 2 ms at Full HD resolution on GPUs equivalent to an RTX 2070

We used AMD's implementation of screen space reflections as the basis for our implementation **[AMD-SSSR]**.
We strongly recommend reading the AMD documentation, as well as the more detailed review of the algorithm by Kostas Anagnostou **[Kostas Anagnostou, SSSR]**, to make the rest of this text easier to follow.
Unfortunately, WebGL doesn't support compute shaders, so we had to make some compromises for compatibility. Please refer to the [implementation details](#implementation-details) section for further insights.

### Supported platforms
- Windows 10+
    * Vulkan® 1.x
    * DirectX® 11
    * DirectX® 12
    * OpenGL®
- Emscripten
    * WebGL®
- Android 14.x
    * Vulkan® 1.x

## Integration guidelines

## Implementation details

### Algorithm structure
The algorithm can be divided into three main stages:
1. [Resource preparation](#preparation-for-ray-tracing): preparation of resources for the ray tracing step.
    - [Blue noise texture generation](#blue-noise-texture-generation)
    - [Hierarchical depth generation](#hierarchical-depth-generation)
    - [Stencil mask generation and roughness extraction](#stencil-mask-generation-and-roughness-extraction)
2. [Ray tracing](#ray-tracing): the actual ray tracing stage.
3. [Denoising](#denoising): denoising of the image obtained during the ray tracing stage.
    - [Spatial reconstruction](#spatial-reconstruction)
    - [Temporal accumulation](#temporal-accumulation)
    - [Cross-bilateral filtering](#cross-bilateral-filtering)

#### Blue noise texture generation

Blue noise sampling |
:----------------------------:|
![](media/ssr-blue-noise-sampling.jpg) |

[**ComputeBlueNoiseTexture.fx**](https://github.com/DiligentGraphics/DiligentFX/blob/master/Shaders/Common/private/ComputeBlueNoiseTexture.fx)

#### Hierarchical depth generation
The hierarchical depth buffer is a mip chain in which each texel of a mip level stores the minimum (or the maximum, for a reversed depth buffer) of the depths in the corresponding 2×2 area of the previous level; mip 0 is the original, screen-sized depth buffer. It is used later to speed up ray marching, but it can also be used in many other techniques, such as GPU occlusion culling.

Depth mip chain |
:-----------------------------------:|
![](media/ssr-hiz-mipmap.jpg) |

We recommend reading the article **[Mike Turitzin, Hi-Z]**, because computing a hierarchical buffer for resolutions that are not divisible by 2 is not trivial. The original AMD algorithm uses SPD **[AMD-SPD]** to downsample the depth buffer. SPD computes the whole mip chain in a single **Dispatch** call; since we can't use compute shaders, we use a more straightforward approach.
We calculate each mip level with a pixel shader, [**ComputeHierarchicalDepthBuffer.fx**](https://github.com/DiligentGraphics/DiligentFX/blob/master/Shaders/PostProcess/ScreenSpaceReflection/private/ComputeHierarchicalDepthBuffer.fx), using the previous mip level as the input.

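The per-level reduction described above can be illustrated with a small CPU-side sketch. This is not the shader code: `DepthMip` and `DownsampleMinDepth` are hypothetical names introduced for illustration, and the handling of odd dimensions (widening the footprint to three texels along an odd axis) follows the general idea discussed in **[Mike Turitzin, Hi-Z]**, assuming a non-reversed depth buffer where the minimum is kept.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for one level of the depth mip chain (row-major).
struct DepthMip
{
    int                Width  = 0;
    int                Height = 0;
    std::vector<float> Data;

    float At(int x, int y) const { return Data[static_cast<std::size_t>(y) * Width + x]; }
};

// Builds the next (coarser) level by keeping the minimum depth of the
// source footprint covered by each destination texel.
DepthMip DownsampleMinDepth(const DepthMip& Src)
{
    assert(Src.Width > 0 && Src.Height > 0);
    assert(Src.Data.size() == static_cast<std::size_t>(Src.Width) * Src.Height);

    DepthMip Dst;
    Dst.Width  = std::max(Src.Width / 2, 1);
    Dst.Height = std::max(Src.Height / 2, 1);
    Dst.Data.resize(static_cast<std::size_t>(Dst.Width) * Dst.Height);

    // If a source dimension is odd, a plain 2x2 footprint would skip the last
    // column/row, so the footprint is widened to three texels along that axis.
    const int FootprintX = (Src.Width & 1) ? 3 : 2;
    const int FootprintY = (Src.Height & 1) ? 3 : 2;

    for (int y = 0; y < Dst.Height; ++y)
    {
        for (int x = 0; x < Dst.Width; ++x)
        {
            float MinDepth = Src.At(2 * x, 2 * y);
            for (int dy = 0; dy < FootprintY; ++dy)
            {
                for (int dx = 0; dx < FootprintX; ++dx)
                {
                    // Clamp to the source bounds (matters for 1-texel-wide levels).
                    const int sx = std::min(2 * x + dx, Src.Width - 1);
                    const int sy = std::min(2 * y + dy, Src.Height - 1);
                    MinDepth     = std::min(MinDepth, Src.At(sx, sy));
                }
            }
            Dst.Data[static_cast<std::size_t>(y) * Dst.Width + x] = MinDepth;
        }
    }
    return Dst;
}
```

Calling `DownsampleMinDepth` repeatedly until the result is 1×1 yields the full chain, which mirrors running one pixel shader pass per mip level. For a reversed depth buffer, `std::min` would be replaced by `std::max`.
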
#### Stencil mask generation and roughness extraction
The original algorithm starts with a classification pass (**ClassifyTiles**). This pass writes the pixels that will participate in the [ray tracing step](#ray-tracing) and the subsequent [denoising stages](#denoising) to a global buffer (our denoiser differs from the AMD implementation, but the underlying idea remains the same).
The decision of whether a pixel needs a ray is based on roughness: very rough surfaces don't get any rays and instead rely on the prefiltered environment map as an approximation.
Once this is done, we are (almost) ready to ray march. The only problem is that the CPU doesn't know the size of the global array of pixels to trace, which it would need in order to launch a Dispatch. For that reason, the technique fills a buffer with indirect arguments from data already known to the GPU and uses `DispatchIndirect` instead. The indirect arguments buffer is populated during the **PrepareIndirectArgs** pass. There is nothing particular to mention here, apart from the fact that it adds two entries to the indirect buffer: one for the pixels to trace and one for the tiles to denoise later.

Unfortunately, as mentioned earlier, we cannot use compute shaders and indirect dispatch, so we had to make compromises. We use a stencil mask to mark the pixels that should participate in subsequent calculations. To do this, we first clear all values in the stencil buffer to `0x00`, and then we run [**ComputeStencilMaskAndExtractRoughness.fx**](https://github.com/DiligentGraphics/DiligentFX/blob/master/Shaders/PostProcess/ScreenSpaceReflection/private/ComputeStencilMaskAndExtractRoughness.fx) with the stencil test enabled for writing. If the roughness of the current pixel is less than `RoughnessThreshold`, we write the value `0xFF` to the stencil buffer; otherwise, the stencil buffer retains its previous value of `0x00`. In subsequent steps, we enable the stencil test for reading with the `COMPARISON_FUNC_EQUAL` function and the reference value `0xFF`. Along with writing the stencil mask, we write the roughness to a separate render target. The separate texture simplifies the roughness sampling code in subsequent steps of the algorithm and, in theory, should slightly improve performance.

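As a rough CPU-side model of the masking logic described above (not the actual shader or pipeline state), the sketch below clears a stencil array to `0x00`, writes `0xFF` where roughness is below the threshold, and then emulates the EQUAL stencil test used by later passes. `MaskPassOutput`, `ComputeStencilMask`, and `PassesStencilTest` are hypothetical names, and the sketch assumes roughness is normalized to [0, 1] and copies it for every pixel for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical outputs of the mask pass, one entry per pixel.
struct MaskPassOutput
{
    std::vector<std::uint8_t> Stencil;   // 0xFF where rays will be traced, 0x00 elsewhere
    std::vector<float>        Roughness; // roughness copied to its own target
};

// CPU stand-in for the mask pass: the stencil starts cleared to 0x00 and is
// set to 0xFF only where the roughness is below the threshold.
MaskPassOutput ComputeStencilMask(const std::vector<float>& MaterialRoughness, float RoughnessThreshold)
{
    // Assumption of this sketch: roughness values are normalized to [0, 1].
    assert(RoughnessThreshold >= 0.0f && RoughnessThreshold <= 1.0f);

    MaskPassOutput Out;
    Out.Stencil.assign(MaterialRoughness.size(), 0x00); // "clear" step
    Out.Roughness = MaterialRoughness;                  // separate roughness target

    for (std::size_t i = 0; i < MaterialRoughness.size(); ++i)
    {
        if (MaterialRoughness[i] < RoughnessThreshold)
            Out.Stencil[i] = 0xFF;
    }
    return Out;
}

// Emulates a stencil test with an EQUAL comparison against reference 0xFF:
// later passes only process pixels for which this returns true.
bool PassesStencilTest(std::uint8_t StencilValue)
{
    return StencilValue == 0xFF;
}
```

The comparison is strict, so a pixel whose roughness equals the threshold keeps the cleared value and is skipped by the later passes.
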
Stencil mask for SSR | Final rendered image with SSR
:---------------------------:|:-------------------------:
![](media/ssr-stencil-mask.jpg) | ![](media/ssr-stencil-mask-final-image.jpg)

### Ray tracing

Specular part of the rendering equation |
:----------------------------------------:|
![](media/ssr-specular-part.jpg) |

Split sum approximation |
:----------------------------------------:|
![](media/ssr-split-sum-approximation.jpg) |

Invalid ray |
:----------------------------------------:|
![](media/ssr-invalid-ray.jpg) |

Hierarchical depth buffer traversal |
:----------------------------------------:|
![](media/ssr-hierarchical-traversal.gif) |

### Denoising
#### Spatial reconstruction
#### Temporal accumulation
#### Cross-bilateral filtering
## Possible improvements
* Add support for reversed depth buffers
* Add support for compressed normal maps
* Use the previous frame as the input radiance for the [ray tracing stage](#ray-tracing). This will add multi-bounce reflections and also simplify the render architecture
* Add dynamic resolution for the ray tracing stage, which will improve performance on weaker GPUs
* The [spatial reconstruction step](#spatial-reconstruction) accumulates samples in screen space. Try performing the accumulation in world space, which should reduce bias
* Try calculating direct specular occlusion in the [ray tracing step](#ray-tracing)
* The bilateral filter is not separable, so it performs poorly with large kernels. Consider replacing it with [Guided Image Filtering](https://kaiminghe.github.io/eccv10/index.html), since that algorithm has this property
* The current implementation of hierarchical ray marching has a [problem](https://youtu.be/MlTohmB4Gh4?t=762) that we need to fix

## References

- **[AMD-SSSR]**: FidelityFX Stochastic Screen-Space Reflections 1.4 - https://gpuopen.com/manuals/fidelityfx_sdk/fidelityfx_sdk-page_techniques_stochastic-screen-space-reflections/
- **[AMD-SPD]**: FidelityFX Single Pass Downsampler - https://gpuopen.com/manuals/fidelityfx_sdk/fidelityfx_sdk-page_techniques_single-pass-downsampler/
- **[EA-SSRR]**: Frostbite presentations on Stochastic Screen Space Reflections - https://www.ea.com/frostbite/news/stochastic-screen-space-reflections
- **[EA-HYRTR]**: EA Seed presentation on Hybrid Real-Time Rendering - https://www.ea.com/seed/news/seed-dd18-presentation-slides-raytracing
- **[Eric Heitz, VNDF]**: Eric Heitz' paper on VNDF - http://jcgt.org/published/0007/04/01/
- **[Eric Heitz, Blue Noise]**: Eric Heitz' paper on Blue Noise sampling - https://eheitzresearch.wordpress.com/762-2/
- **[Kostas Anagnostou, SSSR]**: Notes on Screen-Space Reflections with FidelityFX SSSR - https://interplayoflight.wordpress.com/2022/09/28/notes-on-screenspace-reflections-with-fidelityfx-sssr/
- **[Thorsten Thormählen, IBL]**: Graphics Programming Image-based Lighting - https://www.mathematik.uni-marburg.de/~thormae/lectures/graphics1/graphics_10_2_eng_web.html#1
- **[Mike Turitzin, Hi-Z]**: Hierarchical Depth Buffers - https://miketuritzin.com/post/hierarchical-depth-buffers/

Binary file added: PostProcess/ScreenSpaceReflection/media/ssr-hierarchical-traversal.gif (+501 KB)
Binary file added: PostProcess/ScreenSpaceReflection/media/ssr-split-sum-approximation.jpg (+47.8 KB)