Introduction
Build instructions
Rasterization
Features
Camera and Object transformations
Object transformations using 4x4 homogeneous matrices
Orthographic and Perspective camera
Phong and Blinn-Phong shading
Phong and Blinn-Phong shading with textures
Normal mapping
Affine and Perspective corrected mapping for textures
Simple optimizations
Depth buffering
Rendering paths
Shadow mapping
Gallery
Cool images
External dependencies
Future work
References
In order to understand how rendering works, I decided to implement a forward/deferred renderer on the CPU (based on my experience with OpenGL). The goal of this project is not to create a next-generation renderer or an efficient CPU renderer. This project aims to understand how the rendering algorithms transform a set of vertices that make up a 3D world into a 2D image of that world. I tried to keep my code clear to make it readable and easy to understand.
I've implemented some basic features that I consider relevant for any graphics programmer to understand:
- Camera and Object transformations using 4x4 homogeneous matrices
- Rotations using Euler angles and Quaternions
- Affine and Perspective corrected mapping for textures
- Orthographic and Perspective camera
- Phong and Blinn-Phong shading given material phong coefficients
- Phong and Blinn-Phong shading given material diffuse and specular textures
- Normal mapping
- Simple optimizations
- A depth-buffer to solve the visibility surface problem
- Two rendering paths: Forward and deferred
- Shadow mapping for directional lights with PCF
This project has been developed using VS 2017 but it should work on Windows, macOS and Linux. It can be easily built using CMake and different generators. The following commands can be used to generate the VS 2017 project files:
# VS 2017
cd <project-folder>
mkdir build
cd build/
cmake .. "Visual Studio 15 2017 Win64"
Then, from VS you can edit and run the project. Make sure that the main project is set as the startup project.
If you are using Linux, you need to change the generator (use the default) and run an extra command to actually build the executable:
# Linux
cd <project-folder>
mkdir build
cd build/
cmake ..
make
The CMake configuration file CMakeLists.txt contains the definition of multiple preprocessor directives:
- Renderer:
  - D_FORWARD: Enables forward rendering
  - D_DEFERRED: Enables deferred rendering
- Shading:
  - D_FLAT: Enables Flat Shading (using base colors). Note that the scene must be set to D_SCENE_FLAT.
  - D_PHONG: Enables Phong Shading
  - D_BLINN_PHONG: Enables Blinn-Phong Shading
- Shadows:
  - D_PCF_SHADOWS: Enables Percentage Closer Filtering on shadows
- Camera:
  - D_ORTHOGRAPHIC: Uses an Orthographic camera as the main camera
  - D_PERSPECTIVE: Uses a Perspective camera as the main camera
- Texture mapping:
  - D_AFFINE_TEXTURES: Uses affine texture mapping (fast but low quality)
  - D_PERSPECTIVE_TEXTURES: Uses perspective-corrected texture mapping
- Scene:
  - D_SCENE_FLAT: Renders a scene using flat objects (without textures, only base colors)
  - D_SCENE_TEXTURED: Renders a scene using textured objects (diffuse, specular and normal textures)
*Note: The CMake build also builds all the external dependencies listed in the External dependencies section.*
Check out my blog posts about rasterization to understand, step by step, how it works.
I've implemented the most basic transformations: rotations and translations. To do so, I've used 4x4 homogeneous matrices. Homogeneous matrices are very useful for affine transformations because they can represent a linear transformation (rotation, scale, skew...) and a translation in a single matrix.
To implement rotations and translations, I've given each object a Model matrix that stores its local transformations. Due to the multiplication properties of these matrices, multiple transformations can be concatenated.
The model transform is applied to each of the vertices when the rasterizer asks for the geometry. It is performed only once and is the first transformation applied, converting the vertices from Object Local Space to World Space.
Translation can be performed easily by modifying only the translation part of the 4x4 matrix:
blueCube->translate(Vector3D(250, 0, 0)); // Move 250 units in the positive X direction
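A minimal sketch of the idea behind such a `translate` call, using GLM (which this project already depends on); `translate_model` is a hypothetical helper, not this project's actual `Object3D` API:

```cpp
#include <glm/glm.hpp>

// Hypothetical helper: accumulate a translation by adding the offset to the
// fourth column of the model matrix; the linear part (rotation/scale) stays
// untouched because homogeneous coordinates keep the two separated.
glm::mat4 translate_model(glm::mat4 model, const glm::vec3& offset)
{
    model[3] += glm::vec4(offset, 0.0f); // GLM is column-major: [3] is the translation column
    return model;
}

int main()
{
    glm::mat4 model(1.0f);                                // identity: no transform yet
    model = translate_model(model, glm::vec3(250, 0, 0)); // move 250 units along +X

    glm::vec4 local_vertex(0.0f, 0.0f, 0.0f, 1.0f);       // w = 1 marks a point, not a direction
    glm::vec4 world_vertex = model * local_vertex;        // -> (250, 0, 0, 1)
}
```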
I've implemented rotations using Euler angles because they are the most intuitive representation of a rotation. In the following image, each cube is rotated 45º in roll, pitch or yaw, and translated a small amount along the X axis:
redCube->rotate_euler(45, 0, 0); // 45º in roll
redCube->translate(Vector3D(-250, 0, 0)); // Slightly to the left
greenCube->rotate_euler(0, 45, 0); // 45º in pitch
blueCube->rotate_euler(0, 0, 45); // 45º in yaw
blueCube->translate(Vector3D(250, 0, 0)); // Slightly to the right
On the other hand, in the next image, the cube is rotated in all three Euler angles (roll, pitch and yaw) and also translated:
float roll = 45;
float pitch = 15;
float yaw = 30;
object->rotate_euler(roll, pitch, yaw); // Rotate using Euler angles: 45º roll, 15º pitch, 30º yaw
object->translate(Vector3D(10, 50, 20)); // Translate 10 units in X direction, 50 in Y and 20 in Z
Nevertheless, when the angle of rotation gets close to ±90 degrees, Euler angles suffer from something called Gimbal Lock. To avoid this, I've implemented quaternion-based rotations.
Quaternions are less intuitive than Euler angles. To make it easier to work with them, I implemented a rotate method that takes an Axis-Angle representation of a rotation. This representation can easily be converted to quaternion form.
redCube->rotate_quaternion(45, Vector3D(0, 1, 0)); // Equivalent to 45º in pitch rotation
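For reference, the axis-angle to quaternion conversion itself is only a couple of lines. A minimal sketch using GLM quaternions (the project's own `rotate_quaternion` may be implemented differently):

```cpp
#include <glm/glm.hpp>
#include <glm/gtc/quaternion.hpp>

// Build a unit quaternion from an axis-angle rotation:
// q = (cos(angle/2), sin(angle/2) * axis), with a normalized axis.
glm::quat from_axis_angle(float angle_degrees, const glm::vec3& axis)
{
    return glm::angleAxis(glm::radians(angle_degrees), glm::normalize(axis));
}

int main()
{
    // Equivalent to the 45º pitch rotation used above.
    glm::quat q = from_axis_angle(45.0f, glm::vec3(0, 1, 0));

    glm::mat4 rotation = glm::mat4_cast(q);     // as a 4x4 matrix, ready for the model transform
    glm::vec3 rotated  = q * glm::vec3(1, 0, 0); // or rotate a vector directly
}
```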
I implemented the two most common camera modes:
- Orthographic camera: produces an orthographic projection using parallel projectors. This means that all projection lines are orthogonal to the projection plane and parallel lines remain parallel (there is no foreshortening). As we can see in the next image, all cubes have the same size (even though some of them are farther away than others) and parallel lines remain parallel:
- Perspective camera: produces a perspective projection using projectors that converge to a center of projection or focal point. There is also a scaling factor that produces the foreshortening: closer objects appear bigger and farther ones appear smaller. Parallel lines appear to converge.
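Under the hood, the two cameras differ only in the 4x4 projection matrix applied after the view transform. A minimal sketch using GLM (the parameter values below are placeholders, not the ones used in this project):

```cpp
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

int main()
{
    const float width = 1280.0f, height = 720.0f;
    const float near_plane = 0.1f, far_plane = 1000.0f;

    // Orthographic: parallel projectors, no foreshortening.
    glm::mat4 ortho = glm::ortho(-width / 2, width / 2,
                                 -height / 2, height / 2,
                                 near_plane, far_plane);

    // Perspective: projectors converge to a focal point, distant objects shrink.
    glm::mat4 persp = glm::perspective(glm::radians(60.0f), // vertical field of view
                                       width / height,      // aspect ratio
                                       near_plane, far_plane);
}
```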
I'm not going to talk about how Phong or Blinn-Phong work because you can find that in my blog.
In the next image:
- The Red cube is using Flat shading. This is just the plain base color of the object, without taking lights into account.
- The Green cube is using Phong shading. We can see a huge difference with flat shading because now we are able to see the edges, which gives the object a 3D feel.
- The Blue cube is using Blinn-Phong shading. The difference between Phong and Blinn-Phong is very subtle and is only really noticeable when the angle between the view direction and the reflected vector is greater than 90º.
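The only difference between the two models is the specular term: Phong reflects the light direction around the normal, while Blinn-Phong uses the halfway vector. A minimal sketch of both terms using GLM (the vector names follow the usual convention and are not this project's actual identifiers: `N` is the surface normal, `L` points towards the light, `V` points towards the camera, all normalized):

```cpp
#include <glm/glm.hpp>
#include <algorithm>
#include <cmath>

// Phong: reflect the incoming light around the normal and compare with the view direction.
float phong_specular(const glm::vec3& N, const glm::vec3& L, const glm::vec3& V, float shininess)
{
    glm::vec3 R = glm::reflect(-L, N); // reflection of the incoming light
    return std::pow(std::max(glm::dot(R, V), 0.0f), shininess);
}

// Blinn-Phong: use the halfway vector between the light and view directions instead.
float blinn_phong_specular(const glm::vec3& N, const glm::vec3& L, const glm::vec3& V, float shininess)
{
    glm::vec3 H = glm::normalize(L + V); // halfway vector
    return std::pow(std::max(glm::dot(N, H), 0.0f), shininess);
}
```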
Using textures is a simple modification to the program but it produces a huge quality improvement. The idea is to replace the base color that we used previously with a color retrieved from a texture using Texture Coordinates. Texture coordinates are specified per vertex (exactly as colors are) and then, to get the color of a specific point inside the triangle, an interpolation is performed across the vertices.
In my code I've used textures for the diffuse and specular shading in the Phong and Blinn-Phong shading models. In the next image, all cubes use Blinn-Phong but:
- The red cube uses a flat color.
- The Box in the middle uses a diffuse texture for the color calculation.
- The Box on the right uses a diffuse and a specular texture for the color calculation. That makes the metallic borders of the box shinier. This looks much more realistic than the previous one.
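Plugging the textures in mostly changes where the material colors come from. A minimal sketch, assuming a hypothetical `Texture::sample(u, v)` helper (not this project's actual texture class) and reusing the diffuse and specular lighting terms from the previous sketch:

```cpp
#include <glm/glm.hpp>

// Hypothetical texture interface: a real implementation would look up the texel at (u, v).
struct Texture
{
    glm::vec3 sample(float u, float v) const { return glm::vec3(1.0f); } // stub for illustration
};

// Blinn-Phong, but with the material colors fetched from textures at the interpolated
// texture coordinates instead of a single per-object color. `diffuse_term` and
// `specular_term` are the usual lighting factors (N·L and the specular term above).
glm::vec3 shade_textured(const Texture& diffuse_map, const Texture& specular_map,
                         float u, float v,
                         float diffuse_term, float specular_term,
                         const glm::vec3& light_color)
{
    glm::vec3 kd = diffuse_map.sample(u, v);  // base color of the surface
    glm::vec3 ks = specular_map.sample(u, v); // shininess mask (e.g. the metallic borders of the box)
    return light_color * (kd * diffuse_term + ks * specular_term);
}
```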
The idea behind normal mapping is very similar to texture mapping, but with a few differences. In texture mapping we used the texture to retrieve a color. In normal mapping, we use a texture to specify the direction of the normal vector instead. This provides fine-grained detail that produces realistic materials. The implementation is a little more involved than diffuse/specular mapping because operations must be performed in something called Tangent Space before getting the normal vector in World Space (a sketch of this step follows the comparison below).
In the following image we can see:
- The Box on the left uses Blinn-Phong shading with diffuse and specular textures (as before). Quite good, huh?
- The Box on the right uses Blinn-Phong shading with diffuse and specular textures BUT it also uses normal maps. It is very easy to see the difference: this box is more realistic, gives a sense of depth in the cracks and shows some roughness. It doesn't look like a completely flat texture.
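The key step is moving the normal read from the normal map out of tangent space using the TBN matrix built from the triangle's tangent, bitangent and vertex normal. A minimal sketch with GLM; how the tangent and bitangent are computed from the UVs (and how this project stores them) is omitted here:

```cpp
#include <glm/glm.hpp>

// Transform a tangent-space normal (as read from a normal map) into world space.
// T, B and N are the world-space tangent, bitangent and vertex normal.
glm::vec3 world_space_normal(const glm::vec3& sampled, // raw RGB from the normal map, in [0, 1]
                             const glm::vec3& T, const glm::vec3& B, const glm::vec3& N)
{
    glm::vec3 tangent_normal = sampled * 2.0f - 1.0f; // remap [0, 1] -> [-1, 1]
    glm::mat3 TBN(glm::normalize(T), glm::normalize(B), glm::normalize(N));
    return glm::normalize(TBN * tangent_normal);      // tangent space -> world space
}
```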
As I said before, when mapping textures we have to interpolate from the values at the vertices of the triangle. There are different ways to do this. I've implemented two:
- Affine mapping. This is the cheapest way to perform texture mapping. However, it can produce wrong results if there is perspective distortion (due to foreshortening). In the next image, the first texture is mapped correctly because it is flat: the distance to the camera is constant across the plane, so there is no perspective distortion. However, the second texture is mapped incorrectly:
- Perspective-corrected mapping. Solves the previous problem by taking the distance from the triangle to the camera into account in the mapping formula, producing better results:
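The difference between the two mappings comes down to how the per-vertex texture coordinates are combined with the barycentric weights. A minimal sketch of both variants (names are illustrative: `w0/w1/w2` are the barycentric weights and `z0/z1/z2` the view-space depths of the triangle's vertices):

```cpp
#include <glm/glm.hpp>

// Affine mapping: interpolate the UVs directly with the barycentric weights.
// Cheap, but wrong when the triangle has perspective distortion.
glm::vec2 affine_uv(const glm::vec2& uv0, const glm::vec2& uv1, const glm::vec2& uv2,
                    float w0, float w1, float w2)
{
    return uv0 * w0 + uv1 * w1 + uv2 * w2;
}

// Perspective-corrected mapping: interpolate uv/z and 1/z instead,
// then divide at the end to undo the perspective division.
glm::vec2 perspective_uv(const glm::vec2& uv0, const glm::vec2& uv1, const glm::vec2& uv2,
                         float z0, float z1, float z2,
                         float w0, float w1, float w2)
{
    glm::vec2 uv_over_z = uv0 / z0 * w0 + uv1 / z1 * w1 + uv2 / z2 * w2;
    float one_over_z    = w0 / z0 + w1 / z1 + w2 / z2;
    return uv_over_z / one_over_z;
}
```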
I've implemented some very common and simple optimizations to speed up the rendering process:
- Bounding boxes allow us to narrow down the number of viewport pixels we have to iterate over in order to color a triangle. It is as simple as calculating the bounding box of the triangle in raster space and iterating over that instead of the whole screen (see the sketch after this list). In the following image, in order to color the red triangle we only iterate over the pixels inside the grey bounding box:
- View frustum culling allows us to avoid rendering parts of the scene that we know for sure are not inside the view frustum, so we can discard them beforehand.
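A minimal sketch of the bounding-box computation mentioned above, clamped to the image so partially off-screen triangles are handled too (types and names are illustrative, not this project's actual ones):

```cpp
#include <glm/glm.hpp>
#include <algorithm>
#include <cmath>

struct BoundingBox { int min_x, min_y, max_x, max_y; };

// Compute the raster-space bounding box of a triangle so the rasterizer only
// iterates over those pixels instead of the whole viewport.
BoundingBox triangle_bounds(const glm::vec2& v0, const glm::vec2& v1, const glm::vec2& v2,
                            int image_width, int image_height)
{
    float min_x = std::min({v0.x, v1.x, v2.x});
    float min_y = std::min({v0.y, v1.y, v2.y});
    float max_x = std::max({v0.x, v1.x, v2.x});
    float max_y = std::max({v0.y, v1.y, v2.y});

    // Clamp to the viewport so off-screen parts of the triangle are skipped.
    return {
        std::max(0, static_cast<int>(std::floor(min_x))),
        std::max(0, static_cast<int>(std::floor(min_y))),
        std::min(image_width - 1,  static_cast<int>(std::ceil(max_x))),
        std::min(image_height - 1, static_cast<int>(std::ceil(max_y)))
    };
}
```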
When multiple objects in a scene overlap in a single pixel, in order to calculate the color of that pixel, we must find which object is closest to the camera. The best way to do this is using a depth buffer. Very briefly, a depth buffer is just a buffer that has the same size as the image and stores, for each pixel, the distance between the camera and the closest object rendered so far. This way, every time we render, we can compare the distance of the current object with the stored one and only render if the new distance is smaller.
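A minimal sketch of the test performed for every candidate pixel, assuming one float per pixel and an illustrative row-major layout (not necessarily how this project stores its buffer):

```cpp
#include <vector>
#include <limits>

// Depth buffer: one float per pixel, initialized to "infinitely far away".
std::vector<float> make_depth_buffer(int width, int height)
{
    return std::vector<float>(width * height, std::numeric_limits<float>::max());
}

// Returns true if the fragment at (x, y) with the given depth is the closest one
// seen so far; in that case the stored depth is updated and the pixel may be shaded.
bool depth_test(std::vector<float>& depth_buffer, int width, int x, int y, float depth)
{
    float& stored = depth_buffer[y * width + x];
    if (depth >= stored)
        return false; // something closer was already rendered here
    stored = depth;
    return true;
}
```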
The next image is a render without depth-buffering. As you can see, there is an area where the two triangles overlap. Here, the decision of which triangle is shown is arbitrary. In this case, the blue triangle is farther away but it is displayed on top.
On the other hand, the next image uses depth-buffering and thus the red triangle is displayed in front because it is closer to the camera.
If we take the depth buffer and export it as a texture, it looks like the next image. Black means that the object is close to the camera and white means that it is far away. As we can see, the depth value of the red triangle is darker than that of the blue triangle; it is therefore closer and should be displayed in front.
To render an image we can use ray tracing or rasterization. In the latter, depending on the order in which operations are performed, we talk about Forward Rendering or Deferred Rendering. The result is always the same; the only thing that changes is the order of the operations.
To read more about this, check out the references; there is a very good article about forward and deferred rendering.
This is the most common way of rendering, and when we simply talk about rendering people usually mean forward rendering. The idea is that you set up the data in the GPU, project it onto the screen, split it into fragments or pixels, and then calculate the final color of each pixel. As you may guess, this process is called forward rendering because it is linear and always moves forward.
This is a complete scene rendered using forward rendering:
And this is the resulting depth buffer:
As opposed to forward rendering, there is deferred rendering. First of all, let's analyze the problem with forward rendering: a pixel that has already been shaded can be shaded again if an object closer to the camera is found later. Shading is usually an expensive process because it may take into account multiple lights, indirect lighting, direct lighting, shadows and so on. This is a waste of resources because we may shade each pixel many times, since the order in which the geometry is evaluated is arbitrary.
Deferred rendering solves this problem. The idea is that shading/lighting calculations are deferred until the end. First, the visible surface is determined, and only when this process has finished are the lighting calculations performed. This is very smart because shading is performed only when we know the visible geometry, which allows us to shade each pixel exactly once. This produces a notable speed-up in execution time, but it also increases the memory needed to render a scene. This happens because, along with the visible surface, we also have to store the other properties used for lighting calculations: colors, specularity, normals and so on. Storing all this may be quite expensive, but it is usually worth it.
To generate the same scene as before but using a deferred renderer, we have to use additional buffers along with the depth buffer:
- Depth buffer: Same as before
- Diffuse buffer: Stores the diffuse colors of the geometry
- Specular buffer: Stores the specular colors of the geometry
- Normal buffer: Stores the normal vectors of the geometry
Then, to calculate shading we just have to combine all these different buffers and perform the calculation.
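A minimal sketch of that final combine step for a single pixel, reading the material properties back from the buffers above (the G-buffer layout and names here are illustrative):

```cpp
#include <glm/glm.hpp>
#include <algorithm>
#include <cmath>

// Per-pixel data written by the geometry pass (the "G-buffer").
struct GBufferTexel
{
    float     depth;    // from the depth buffer
    glm::vec3 diffuse;  // from the diffuse buffer
    glm::vec3 specular; // from the specular buffer
    glm::vec3 normal;   // from the normal buffer (world space, normalized)
};

// Lighting pass: runs exactly once per pixel, after visibility has been fully
// resolved, so no shading work is spent on occluded geometry.
glm::vec3 shade_deferred(const GBufferTexel& texel,
                         const glm::vec3& light_dir,  // surface -> light, normalized
                         const glm::vec3& view_dir,   // surface -> camera, normalized
                         const glm::vec3& light_color,
                         float shininess)
{
    float diff = std::max(glm::dot(texel.normal, light_dir), 0.0f);
    glm::vec3 half_vec = glm::normalize(light_dir + view_dir);
    float spec = std::pow(std::max(glm::dot(texel.normal, half_vec), 0.0f), shininess);
    return light_color * (texel.diffuse * diff + texel.specular * spec);
}
```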
As a side note, there is another interesting rendering technique, usually used in hand-held devices, called TBDR (Tile-Based Deferred Rendering). It uses deferred rendering but, in order to reduce the memory needed, it splits the screen into smaller tiles, keeping the benefits of deferred rendering without the excessive memory usage.
In order to add shadows to my scene I've implemented a technique called Shadow Mapping. It consists of three parts:
- Render the depth of the scene from the light's point of view. We store in each pixel the distance from the light to the closest object it sees. This is called a Shadow Map.
- We render the image from the camera as usual, but now we have to check whether each pixel is in shadow. To do so, we convert the pixel into World Space and then into Light Space.
- Finally, because the pixel is now in Light Space, we can retrieve its depth from the light's point of view (the value stored in the Shadow Map). Then, we compare this stored depth with the distance from that point (in world space) to the light. If the stored depth is smaller, it means that there is some object in between and thus the pixel is in shadow.
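A minimal sketch of the per-pixel test described in the last step, assuming the Shadow Map stores one depth per texel and that the pixel has already been projected into the shadow map (names and the bias value are illustrative):

```cpp
#include <vector>

// Returns true if the point is in shadow: the light "sees" something closer
// along this direction than the point being shaded.
bool in_shadow(const std::vector<float>& shadow_map, int map_width,
               int light_space_x, int light_space_y, // texel position in the shadow map
               float distance_to_light,              // distance from the point to the light
               float bias = 0.005f)                  // small offset to fight shadow acne
{
    float closest = shadow_map[light_space_y * map_width + light_space_x];
    return closest < distance_to_light - bias;
}
```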
The following image is the Shadow Map of the previous scene:
Testing the depth of this shadow map and the distance from each point to the light sources we can create shadows:
And moving the camera back a bit to get a different point of view:
If you want to know more, this is by far the best tutorial I've found about shadow mapping.
PCF (Percentage Closer Filtering) helps us create soft shadows, making them less "blocky". The idea is that, instead of taking only one sample from the shadow map, we take several and average them. In my case I took nine samples: the center point and its neighbors. The result has much better quality:
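A minimal sketch of this nine-sample filter, averaging the shadow test over the 3x3 neighborhood (names and the bias value are illustrative, as before):

```cpp
#include <algorithm>
#include <vector>

// Percentage Closer Filtering: average the shadow test over the 3x3 neighborhood
// around the sample, giving a value in [0, 1] instead of a hard in/out decision.
float pcf_shadow(const std::vector<float>& shadow_map, int map_width, int map_height,
                 int x, int y, float distance_to_light, float bias = 0.005f)
{
    float shadow = 0.0f;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx)
        {
            int sx = std::min(std::max(x + dx, 0), map_width - 1);  // clamp to the map edges
            int sy = std::min(std::max(y + dy, 0), map_height - 1);
            float closest = shadow_map[sy * map_width + sx];        // depth seen by the light
            shadow += (closest < distance_to_light - bias) ? 1.0f : 0.0f;
        }
    return shadow / 9.0f; // 0 = fully lit, 1 = fully in shadow
}
```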
These are just some samples.
It all began here with the first render:
And it ended up here:
Moving the camera back and the light slightly upwards:
And here using textures:
Triangle render with RGB color interpolation using barycentric coordinates
The same principle as before but applied to a Cube to create a color palette
Check out the rest of the images in the repository at docs/images/gallery.
I've found it interesting to take pictures when things go terribly wrong, for two reasons: first, because after you solve the problem, you can understand why the image looked like that; second, because some of the images are pretty cool. These are just some of the terribly wrong but cool images.
This is my favorite one: I was trying to draw a square (two triangles) with a black-and-white checkerboard texture. However, the indices of the triangles were not correct and I ended up with something like this:
Check out the rest of the cool images at docs/images/gallery/cool_but_wrong.
- Solve Peter Panning effect due to shadow bias in shadow mapping
- Point lights with omnidirectional shadows (using cubemaps) and attenuation
- Alpha blending
- Sky boxes
- Model loading using .OBJ files
- Render to a window using an external library (e.g. SFML) to achieve real-time rendering
This project uses only two external dependencies:
- EasyBMP to work with images (read/write diffuse, specular, normal textures...)
- GLM to easily work with matrices
All these libraries are included under the lib/ directory and are built with CMake.
Fletcher Dunn, Ian Parberry: “3D Math Primer for Graphics and Game Development”
Scratchapixel, Rasterization: Practical implementation
Brent Owens, Forward Rendering vs. Deferred Rendering