Project5: Zhaojin Sun #17

Open
wants to merge 4 commits into
base: main
562 changes: 562 additions & 0 deletions .gitignore

Large diffs are not rendered by default.

44 changes: 39 additions & 5 deletions README.md
@@ -3,10 +3,44 @@ Vulkan Grass Rendering

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 5**

* (TODO) YOUR NAME HERE
* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
* Zhaojin Sun
* www.linkedin.com/in/zjsun
* Tested on: Windows 11, i9-13900HX @ 2.2GHz 64GB, RTX 4090 Laptop 16GB

### (TODO: Your README)
### Demo GIF
The GIF below shows the grass rendered with all three culling techniques enabled. To make the motion more noticeable, I increased the forces acting on the grass blades. The normal of each blade is computed with the method described in the paper, but because the shading model is relatively simple, the resulting lighting looks somewhat unnatural.
![demo.gif](img%2Fdemo.gif)

*DO NOT* leave the README to the last minute! It is a crucial part of the
project, and we will not be able to grade you without a good README.
### 1. Project Overview
This project uses Vulkan to render grass, with the algorithm based primarily on this paper: [Responsive Real-Time Grass Rendering for General 3D Scenes](https://www.cg.tuwien.ac.at/research/publications/2017/JAHRMANN-2017-RRTG/JAHRMANN-2017-RRTG-draft.pdf). This is my first time working with Vulkan. Because Vulkan is an explicit, low-level API, it exposes many operations directly to the user, which let me learn new concepts such as tessellation while programming. Overall, the project was challenging but also very rewarding.

**Features implemented**
- Vulkan Grass Rendering Pipeline
- Grass Blades from Bezier Curve
- Grass Blades Force Simulation
- Grass Blades Culling
- Grass Blades Tessellation


### 2. Features and Performance Analysis
#### (i) Tessellation and Rendering
The following image shows the shape of the grass blades with no external forces applied. Both the inner and outer tessellation levels are set to 10, so the blades appear relatively smooth.
![blades.png](img%2Fblades.png)
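As a rough illustration of how a blade's shape follows from its Bezier control points, the sketch below evaluates a quadratic Bezier curve with De Casteljau's algorithm on the CPU. It assumes the three control points `v0`, `v1`, `v2` used in the paper; `Vec3`, `lerp`, and `evalBladeCurve` are hypothetical names for this example only, and in the project this evaluation would happen in the tessellation evaluation shader rather than in C++.

```cpp
// Minimal 3D vector; a stand-in for glm::vec3 so the sketch has no dependencies.
struct Vec3 {
    float x, y, z;
};

// Linear interpolation between a and b at parameter t in [0, 1].
inline Vec3 lerp(const Vec3& a, const Vec3& b, float t) {
    return { a.x + (b.x - a.x) * t,
             a.y + (b.y - a.y) * t,
             a.z + (b.z - a.z) * t };
}

// Evaluate the quadratic Bezier curve defined by control points v0, v1, v2
// at parameter t using De Casteljau's algorithm (two rounds of lerp).
inline Vec3 evalBladeCurve(const Vec3& v0, const Vec3& v1, const Vec3& v2, float t) {
    const Vec3 a = lerp(v0, v1, t);
    const Vec3 b = lerp(v1, v2, t);
    return lerp(a, b, t);
}
```

With a tessellation level of 10, the curve would be sampled at roughly `t = 0, 0.1, ..., 1` along the blade's height, which is why the silhouette looks smooth.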

#### (ii) Force Simulation
The following GIF shows the result with forces applied but without any culling. For the wind, I replaced the fixed wind direction used in the original paper with a direction that changes slowly over a long period, which makes the grass motion look more realistic.
![no_culling.gif](img%2Fno_culling.gif)
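One way to get a slowly changing wind direction is to rotate a unit vector in the ground plane as a function of the total elapsed time. The sketch below is only an assumption about how this could be done; the function name `windDirection`, the `periodSeconds` parameter, and the choice of a uniform rotation in the XZ plane are all hypothetical and not taken from the project's shader.

```cpp
#include <cmath>

// Minimal 3D vector; a stand-in for glm::vec3 so the sketch has no dependencies.
struct Vec3 {
    float x, y, z;
};

// Hypothetical helper: returns a unit wind direction in the XZ (ground) plane
// that completes one full turn every periodSeconds. A large period gives a
// slowly drifting wind instead of a fixed one.
inline Vec3 windDirection(float totalTime, float periodSeconds) {
    const float twoPi = 6.28318530718f;
    const float angle = twoPi * totalTime / periodSeconds;
    return { std::cos(angle), 0.0f, std::sin(angle) };
}
```

For example, `windDirection(time.totalTime, 60.0f)` would sweep the wind through a full circle once per minute.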

#### (iii) Culling Tests
The first GIF below shows the effect of orientation culling, and the second shows the effect of distance culling. Because view-frustum culling only removes grass blades that are outside the field of view, its effect cannot be shown on screen. For orientation culling, I initially tried using the third column of the view matrix as the camera direction, but the result was incorrect: the view matrix maps world space into camera space, so its third column is not the camera's viewing direction in world coordinates. After numerous attempts, I found that using the direction from the camera origin to v0 as the camera direction produced the correct effect.
![ori_culling.gif](img%2Fori_culling.gif)
![dist_culling.gif](img%2Fdist_culling.gif)
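The two tests shown in the GIFs can be sketched on the CPU as follows. This is a simplified illustration, not the project's compute shader: the view direction is taken from the camera position to `v0` as described above, the orientation threshold is left as a parameter, and the distance test is reduced to a hard cutoff rather than the bucketed scheme from the paper. All names (`Vec3`, `orientationCulled`, `distanceCulled`, `bladeWidthDir`) are hypothetical.

```cpp
#include <cmath>

// Minimal 3D vector; a stand-in for glm::vec3 so the sketch has no dependencies.
struct Vec3 {
    float x, y, z;
};

inline Vec3 sub(const Vec3& a, const Vec3& b) {
    return { a.x - b.x, a.y - b.y, a.z - b.z };
}

inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Returns v scaled to unit length, or v unchanged if it has zero length.
inline Vec3 normalize(const Vec3& v) {
    const float len = std::sqrt(dot(v, v));
    if (len == 0.0f) {
        return v;
    }
    return { v.x / len, v.y / len, v.z / len };
}

// Orientation test: a blade seen almost edge-on (view direction nearly parallel
// to the unit vector along the blade's width) contributes little and is culled.
inline bool orientationCulled(const Vec3& cameraPos, const Vec3& v0,
                              const Vec3& bladeWidthDir, float threshold) {
    const Vec3 viewDir = normalize(sub(v0, cameraPos));
    return std::fabs(dot(viewDir, bladeWidthDir)) > threshold;
}

// Distance test, simplified to a hard cutoff: cull blades whose base is
// farther than maxDistance from the camera.
inline bool distanceCulled(const Vec3& cameraPos, const Vec3& v0, float maxDistance) {
    const Vec3 d = sub(v0, cameraPos);
    return std::sqrt(dot(d, d)) > maxDistance;
}
```

Because the view direction is built from world-space positions, it stays in the same coordinate system as the blade data, which avoids the mismatch that occurs when reading a column of the view matrix directly.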


#### (iv) Performance Analysis
Performance analysis for grass rendering is challenging because the effect of culling depends heavily on the current camera position. Observing a given culling technique requires moving the camera, and it is difficult to reproduce exactly the same angle and distance on every run. For this reason, the test of how the number of grass blades affects performance was run with all culling disabled, as shown in the image below.
![blade_number.png](img%2Fblade_number.png)
As the chart shows, from 2^13 blades onward the GPU appears to have reached thread saturation, and the curve becomes approximately linear in the number of grass blades. When the number of blades is very low, the FPS estimate is not very accurate, but with a single blade the FPS peaks at around 10000. This indicates that, before the GPU is saturated, it can render sparse grass extremely quickly.

The following image shows the performance improvement from culling. Interestingly, although the results generally follow the linear relationship shown in the previous graph, the combined gain from all three culling techniques is larger than the sum of their individual gains (1+1+1 > 3). This may be due to the compounded reduction in workload when multiple culling methods work together. Because a tolerance was set for the view frustum, view-frustum culling only takes effect when the camera is very close to the grass, but in that case its effect is quite significant.
![culling.png](img%2Fculling.png)
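The role of the tolerance in view-frustum culling can be sketched as below. The code assumes the points have already been multiplied by the view-projection matrix, and it follows the symmetric `[-h, h]` bounds test with `h = w + tolerance` described in the paper, applied to `v0`, `v2`, and a midpoint of the curve; a Vulkan-style `[0, w]` depth range would need a different z test. `Vec4`, `inFrustum`, and `bladeFrustumCulled` are hypothetical names for this example.

```cpp
// Homogeneous clip-space position (the result of viewProj * vec4(p, 1)).
struct Vec4 {
    float x, y, z, w;
};

// True when the clip-space point lies inside the view frustum after the
// frustum has been widened by `tolerance` on every side.
inline bool inFrustum(const Vec4& clip, float tolerance) {
    const float h = clip.w + tolerance;
    auto inBounds = [h](float v) { return v >= -h && v <= h; };
    return inBounds(clip.x) && inBounds(clip.y) && inBounds(clip.z);
}

// A blade is culled only when all three sample points (base, midpoint, tip)
// fall outside the widened frustum.
inline bool bladeFrustumCulled(const Vec4& base, const Vec4& mid, const Vec4& tip,
                               float tolerance) {
    return !inFrustum(base, tolerance) &&
           !inFrustum(mid, tolerance) &&
           !inFrustum(tip, tolerance);
}
```

A larger tolerance keeps blades that are just outside the screen edges, which is consistent with the culling only removing many blades once the camera is close enough that most of the field lies well outside the view.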
Binary file modified bin/Release/vulkan_grass_rendering.exe
Binary file not shown.
Binary file added img/blade_number.png
Binary file added img/blades.png
Binary file added img/culling.png
Binary file added img/demo.gif
Binary file added img/dist_culling.gif
Binary file added img/no_culling.gif
Binary file added img/ori_culling.gif
Binary file added performace.xlsx
Binary file not shown.
3 changes: 1 addition & 2 deletions src/Blades.cpp
@@ -12,7 +12,6 @@ Blades::Blades(Device* device, VkCommandPool commandPool, float planeDim) : Mode

for (int i = 0; i < NUM_BLADES; i++) {
Blade currentBlade = Blade();

glm::vec3 bladeUp(0.0f, 1.0f, 0.0f);

// Generate positions and direction (v0)
@@ -45,7 +44,7 @@ Blades::Blades(Device* device, VkCommandPool commandPool, float planeDim) : Mode
indirectDraw.firstInstance = 0;

BufferUtils::CreateBufferFromData(device, commandPool, blades.data(), NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, bladesBuffer, bladesBufferMemory);
BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory);
BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_VERTEX_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory);
BufferUtils::CreateBufferFromData(device, commandPool, &indirectDraw, sizeof(BladeDrawIndirect), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT, numBladesBuffer, numBladesBufferMemory);
}

133 changes: 129 additions & 4 deletions src/Renderer.cpp
@@ -198,6 +198,38 @@ void Renderer::CreateComputeDescriptorSetLayout() {
// TODO: Create the descriptor set layout for the compute pipeline
// Remember this is like a class definition stating why types of information
// will be stored at each binding

VkDescriptorSetLayoutBinding bladeDataBinding = {};
bladeDataBinding.binding = 0;
bladeDataBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
bladeDataBinding.descriptorCount = 1;
bladeDataBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
bladeDataBinding.pImmutableSamplers = nullptr;

VkDescriptorSetLayoutBinding culledBladeDataBinding = {};
culledBladeDataBinding.binding = 1;
culledBladeDataBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
culledBladeDataBinding.descriptorCount = 1;
culledBladeDataBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
culledBladeDataBinding.pImmutableSamplers = nullptr;

VkDescriptorSetLayoutBinding numBladesBinding = {};
numBladesBinding.binding = 2;
numBladesBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
numBladesBinding.descriptorCount = 1;
numBladesBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
numBladesBinding.pImmutableSamplers = nullptr;

std::vector<VkDescriptorSetLayoutBinding> bindings = { bladeDataBinding, culledBladeDataBinding, numBladesBinding };

VkDescriptorSetLayoutCreateInfo layoutInfo = {};
layoutInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
layoutInfo.bindingCount = static_cast<uint32_t>(bindings.size());
layoutInfo.pBindings = bindings.data();

if (vkCreateDescriptorSetLayout(logicalDevice, &layoutInfo, nullptr, &computeDescriptorSetLayout) != VK_SUCCESS) {
throw std::runtime_error("Failed to create compute descriptor set layout");
}
}

void Renderer::CreateDescriptorPool() {
@@ -216,6 +248,7 @@ void Renderer::CreateDescriptorPool() {
{ VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , 1 },

// TODO: Add any additional types and counts of descriptors you will need to allocate
{ VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, (uint32_t)(3 * scene->GetBlades().size())}
};

VkDescriptorPoolCreateInfo poolInfo = {};
@@ -320,6 +353,37 @@ void Renderer::CreateModelDescriptorSets() {
void Renderer::CreateGrassDescriptorSets() {
// TODO: Create Descriptor sets for the grass.
// This should involve creating descriptor sets which point to the model matrix of each group of grass blades
grassDescriptorSets.resize(scene->GetBlades().size());

// pSetLayouts must contain one layout per descriptor set being allocated
std::vector<VkDescriptorSetLayout> layouts(grassDescriptorSets.size(), modelDescriptorSetLayout);
VkDescriptorSetAllocateInfo allocInfo = {};
allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
allocInfo.descriptorPool = descriptorPool;
allocInfo.descriptorSetCount = static_cast<uint32_t>(grassDescriptorSets.size());
allocInfo.pSetLayouts = layouts.data();

if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, grassDescriptorSets.data()) != VK_SUCCESS) {
throw std::runtime_error("Failed to allocate grass descriptor sets");
}

// Keep the buffer infos alive until vkUpdateDescriptorSets is called;
// a loop-local VkDescriptorBufferInfo would leave pBufferInfo dangling.
std::vector<VkDescriptorBufferInfo> modelBufferInfos(grassDescriptorSets.size());
std::vector<VkWriteDescriptorSet> descriptorWrites(grassDescriptorSets.size());

for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) {
VkDescriptorBufferInfo& modelBufferInfo = modelBufferInfos[i];
modelBufferInfo.buffer = scene->GetBlades()[i]->GetModelBuffer();
modelBufferInfo.offset = 0;
modelBufferInfo.range = sizeof(ModelBufferObject);

descriptorWrites[i].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[i].dstSet = grassDescriptorSets[i];
descriptorWrites[i].dstBinding = 0;
descriptorWrites[i].dstArrayElement = 0;
descriptorWrites[i].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
descriptorWrites[i].descriptorCount = 1;
descriptorWrites[i].pBufferInfo = &modelBufferInfo;
}

vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr);
}

void Renderer::CreateTimeDescriptorSet() {
@@ -360,6 +424,60 @@ void Renderer::CreateTimeDescriptorSet() {
void Renderer::CreateComputeDescriptorSets() {
// TODO: Create Descriptor sets for the compute pipeline
// The descriptors should point to Storage buffers which will hold the grass blades, the culled grass blades, and the output number of grass blades
computeDescriptorSets.resize(scene->GetBlades().size());

// pSetLayouts must contain one layout per descriptor set being allocated
std::vector<VkDescriptorSetLayout> layouts(computeDescriptorSets.size(), computeDescriptorSetLayout);
VkDescriptorSetAllocateInfo allocInfo = {};
allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
allocInfo.descriptorPool = descriptorPool;
allocInfo.descriptorSetCount = static_cast<uint32_t>(computeDescriptorSets.size());
allocInfo.pSetLayouts = layouts.data();

if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, computeDescriptorSets.data()) != VK_SUCCESS) {
throw std::runtime_error("Failed to allocate compute descriptor sets");
}

// Keep the buffer infos alive until vkUpdateDescriptorSets is called;
// loop-local VkDescriptorBufferInfo objects would leave pBufferInfo dangling.
std::vector<VkDescriptorBufferInfo> bufferInfos(3 * computeDescriptorSets.size());
std::vector<VkWriteDescriptorSet> descriptorWrites(3 * computeDescriptorSets.size());

for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) {
// Binding 0: all grass blades
VkDescriptorBufferInfo& bladeBufferInfo = bufferInfos[3 * i];
bladeBufferInfo.buffer = scene->GetBlades()[i]->GetBladesBuffer();
bladeBufferInfo.offset = 0;
bladeBufferInfo.range = VK_WHOLE_SIZE;

// Binding 1: blades that survive culling
VkDescriptorBufferInfo& culledBufferInfo = bufferInfos[3 * i + 1];
culledBufferInfo.buffer = scene->GetBlades()[i]->GetCulledBladesBuffer();
culledBufferInfo.offset = 0;
culledBufferInfo.range = VK_WHOLE_SIZE;

// Binding 2: indirect draw arguments holding the number of remaining blades
VkDescriptorBufferInfo& numBladesBufferInfo = bufferInfos[3 * i + 2];
numBladesBufferInfo.buffer = scene->GetBlades()[i]->GetNumBladesBuffer();
numBladesBufferInfo.offset = 0;
numBladesBufferInfo.range = sizeof(BladeDrawIndirect);

descriptorWrites[3 * i].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[3 * i].dstSet = computeDescriptorSets[i];
descriptorWrites[3 * i].dstBinding = 0;
descriptorWrites[3 * i].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
descriptorWrites[3 * i].descriptorCount = 1;
descriptorWrites[3 * i].pBufferInfo = &bladeBufferInfo;

descriptorWrites[3 * i + 1].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[3 * i + 1].dstSet = computeDescriptorSets[i];
descriptorWrites[3 * i + 1].dstBinding = 1;
descriptorWrites[3 * i + 1].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
descriptorWrites[3 * i + 1].descriptorCount = 1;
descriptorWrites[3 * i + 1].pBufferInfo = &culledBufferInfo;

descriptorWrites[3 * i + 2].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[3 * i + 2].dstSet = computeDescriptorSets[i];
descriptorWrites[3 * i + 2].dstBinding = 2;
descriptorWrites[3 * i + 2].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
descriptorWrites[3 * i + 2].descriptorCount = 1;
descriptorWrites[3 * i + 2].pBufferInfo = &numBladesBufferInfo;
}

vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr);
}

void Renderer::CreateGraphicsPipeline() {
@@ -480,7 +598,7 @@ void Renderer::CreateGraphicsPipeline() {
colorBlending.blendConstants[2] = 0.0f;
colorBlending.blendConstants[3] = 0.0f;

std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, modelDescriptorSetLayout };
std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, modelDescriptorSetLayout, computeDescriptorSetLayout };

// Pipeline layout: used to specify uniform values
VkPipelineLayoutCreateInfo pipelineLayoutInfo = {};
@@ -717,7 +835,7 @@ void Renderer::CreateComputePipeline() {
computeShaderStageInfo.pName = "main";

// TODO: Add the compute dsecriptor set layout you create to this list
std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout };
std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout, computeDescriptorSetLayout };

// Create pipeline layout
VkPipelineLayoutCreateInfo pipelineLayoutInfo = {};
@@ -884,6 +1002,11 @@ void Renderer::RecordComputeCommandBuffer() {
vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 1, 1, &timeDescriptorSet, 0, nullptr);

// TODO: For each group of blades bind its descriptor set and dispatch
for (uint32_t i = 0; i < computeDescriptorSets.size(); ++i) {
vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 2, 1, &computeDescriptorSets[i], 0, nullptr);
vkCmdDispatch(computeCommandBuffer, (NUM_BLADES + WORKGROUP_SIZE - 1) / WORKGROUP_SIZE, 1, 1);
}


// ~ End recording ~
if (vkEndCommandBuffer(computeCommandBuffer) != VK_SUCCESS) {
@@ -976,13 +1099,14 @@ void Renderer::RecordCommandBuffers() {
VkBuffer vertexBuffers[] = { scene->GetBlades()[j]->GetCulledBladesBuffer() };
VkDeviceSize offsets[] = { 0 };
// TODO: Uncomment this when the buffers are populated
// vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets);
vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets);

// TODO: Bind the descriptor set for each grass blades model
vkCmdBindDescriptorSets(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, grassPipelineLayout, 1, 1, &grassDescriptorSets[j], 0, nullptr);

// Draw
// TODO: Uncomment this when the buffers are populated
// vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect));
vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect));
}

// End render pass
@@ -1057,6 +1181,7 @@ Renderer::~Renderer() {
vkDestroyDescriptorSetLayout(logicalDevice, cameraDescriptorSetLayout, nullptr);
vkDestroyDescriptorSetLayout(logicalDevice, modelDescriptorSetLayout, nullptr);
vkDestroyDescriptorSetLayout(logicalDevice, timeDescriptorSetLayout, nullptr);
vkDestroyDescriptorSetLayout(logicalDevice, computeDescriptorSetLayout, nullptr);

vkDestroyDescriptorPool(logicalDevice, descriptorPool, nullptr);

3 changes: 3 additions & 0 deletions src/Renderer.h
@@ -56,11 +56,14 @@ class Renderer {
VkDescriptorSetLayout cameraDescriptorSetLayout;
VkDescriptorSetLayout modelDescriptorSetLayout;
VkDescriptorSetLayout timeDescriptorSetLayout;
VkDescriptorSetLayout computeDescriptorSetLayout;

VkDescriptorPool descriptorPool;

VkDescriptorSet cameraDescriptorSet;
std::vector<VkDescriptorSet> modelDescriptorSets;
std::vector<VkDescriptorSet> grassDescriptorSets;
std::vector<VkDescriptorSet> computeDescriptorSets;
VkDescriptorSet timeDescriptorSet;

VkPipelineLayout graphicsPipelineLayout;
16 changes: 16 additions & 0 deletions src/Scene.cpp
@@ -1,10 +1,16 @@
#include "Scene.h"
#include "BufferUtils.h"
#include <iostream>

Scene::Scene(Device* device) : device(device) {
BufferUtils::CreateBuffer(device, sizeof(Time), VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, timeBuffer, timeBufferMemory);
vkMapMemory(device->GetVkDevice(), timeBufferMemory, 0, sizeof(Time), 0, &mappedData);
memcpy(mappedData, &time, sizeof(Time));

// FPS count
frameCount = 0;
fps = 0.0f;
lastFpsTime = std::chrono::high_resolution_clock::now();
}

const std::vector<Model*>& Scene::GetModels() const {
@@ -32,6 +38,16 @@ void Scene::UpdateTime() {
time.totalTime += time.deltaTime;

memcpy(mappedData, &time, sizeof(Time));

frameCount++;
duration<float> elapsed = duration_cast<duration<float>>(currentTime - lastFpsTime);
if (elapsed.count() >= 1.0f) {
fps = frameCount / elapsed.count();
std::cout << "FPS: " << fps << std::endl;

frameCount = 0;
lastFpsTime = currentTime;
}
}

VkBuffer Scene::GetTimeBuffer() const {
5 changes: 5 additions & 0 deletions src/Scene.h
@@ -42,4 +42,9 @@ high_resolution_clock::time_point startTime = high_resolution_clock::now();
VkBuffer GetTimeBuffer() const;

void UpdateTime();

// FPS count
long frameCount;
float fps = 0.0f;
high_resolution_clock::time_point lastFpsTime;
};