Minimum Spanning Tree - CUDA

This repository contains an implementation of the Minimum Spanning Tree (MST) algorithm, parallelized using CUDA.

Running the Program

To run the program, you need to set up CUDA first; you can see the details here. After that, you need to compile the program into an executable: move to the src folder, then type in the command prompt:

nvcc -o <program_name> MST_CUDA.cu && <program_name>.exe
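The benchmarks below were run on Google Colab, which is a Linux environment; there the executable has no .exe suffix. A concrete (illustrative) invocation, using mst as an arbitrary program name, might look like this:

```sh
cd src
nvcc -o mst MST_CUDA.cu   # compile with the NVIDIA CUDA compiler
./mst                     # run the resulting executable
```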

Approach to Getting the Minimum Spanning Tree with Parallelization

To get the minimum spanning tree of a given graph, we use Kruskal's algorithm: build an edge list with weights, sort the edge list in ascending order of weight, then scan the sorted list from left to right and add every edge that does not create a cycle (cycle detection is handled with a Disjoint Set Union). The sorting step is the part we parallelize with CUDA, using bitonic sort (see the sketch below).
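As a sketch of the serial part of this approach (the actual code lives in MST_CUDA.cu and may use different names), the Kruskal step over an already sorted edge list looks roughly like this:

```c
/* Minimal sketch of Kruskal's algorithm with a Disjoint Set Union (DSU).
 * The names here (Edge, find_set, kruskal) are illustrative and may differ
 * from the identifiers used in MST_CUDA.cu. */
#define MAX_NODES 100005   /* illustrative upper bound on the node count */

typedef struct { int u, v, weight; } Edge;

int parent[MAX_NODES];

/* Find the representative of x's set, with path compression. */
int find_set(int x) {
    return parent[x] == x ? x : (parent[x] = find_set(parent[x]));
}

/* Edges are assumed to be sorted ascending by weight
 * (in this project, by the CUDA bitonic sort described below). */
long long kruskal(Edge *edges, int num_edges, int num_nodes) {
    long long mst_weight = 0;
    for (int i = 0; i < num_nodes; i++) parent[i] = i;
    for (int i = 0; i < num_edges; i++) {
        int ru = find_set(edges[i].u);
        int rv = find_set(edges[i].v);
        if (ru != rv) {              /* this edge connects two components */
            parent[ru] = rv;         /* merge them: no cycle is formed    */
            mst_weight += edges[i].weight;
        }
    }
    return mst_weight;
}
```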

Bitonic Sort

Bitonic sort is a comparison-based sorting algorithm that can be run in parallel. It works by converting an arbitrary sequence of numbers into a bitonic sequence, one that first monotonically increases and then monotonically decreases; any rotation of a bitonic sequence is also bitonic. You can read more about this sorting algorithm here.

We will focus on how we use CUDA to parallelize bitonic sort. First, we pad the array up to the nearest power of $2$, filling the gap with large dummy values (INT_MAX); this is needed because our bitonic sort implementation is non-recursive and therefore requires a length that is a power of $2$. We then use cudaMalloc and cudaMemcpy to copy the data to the GPU. The number of threads per block (num_threads) is set to min(length, 512), where length is the length of the data array after padding, and the number of blocks (num_blocks) is set to length / num_threads. The device function bitonic_sort_kernel compares and swaps out-of-order elements according to the bitonic rules (alternating increasing and decreasing runs). Finally, we copy the sorted array back from the device to the host.
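A rough sketch of this host/device split is shown below. It assumes the same illustrative Edge struct as above and a kernel of the usual non-recursive bitonic form; the exact signatures in MST_CUDA.cu may differ:

```cuda
#include <cuda_runtime.h>

typedef struct { int u, v, weight; } Edge;   /* illustrative, as above */

/* One compare-and-exchange step of the non-recursive bitonic sort:
 * k is the current bitonic sequence size, j is the compare distance.
 * Whether a pair is ordered ascending or descending depends on (i & k). */
__global__ void bitonic_sort_kernel(Edge *dev_edges, int j, int k) {
    unsigned int i = threadIdx.x + blockDim.x * blockIdx.x;
    unsigned int ixj = i ^ j;
    if (ixj > i) {
        int ascending = ((i & k) == 0);
        if (( ascending && dev_edges[i].weight > dev_edges[ixj].weight) ||
            (!ascending && dev_edges[i].weight < dev_edges[ixj].weight)) {
            Edge tmp = dev_edges[i];
            dev_edges[i] = dev_edges[ixj];
            dev_edges[ixj] = tmp;
        }
    }
}

/* Host driver: length must already be padded to a power of two, with the
 * padding entries given weight INT_MAX so they sort to the end of the array. */
void bitonic_sort(Edge *edges, int length) {
    Edge *dev_edges;
    size_t size = (size_t)length * sizeof(Edge);
    cudaMalloc((void **)&dev_edges, size);
    cudaMemcpy(dev_edges, edges, size, cudaMemcpyHostToDevice);

    int num_threads = (length < 512) ? length : 512;
    int num_blocks = length / num_threads;

    for (int k = 2; k <= length; k <<= 1)          /* bitonic sequence size */
        for (int j = k >> 1; j > 0; j >>= 1)       /* compare distance      */
            bitonic_sort_kernel<<<num_blocks, num_threads>>>(dev_edges, j, k);

    cudaDeviceSynchronize();
    cudaMemcpy(edges, dev_edges, size, cudaMemcpyDeviceToHost);
    cudaFree(dev_edges);
}
```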

To see more detail, you can look at the implementation code MST_CUDA.cu.

Speedup Analysis

In this section we compare the speed of the serial and parallel programs; we use Google Colab as the environment to run them. The speedup is the serial time divided by the parallel time. Here are the results:

| Nodes | Serial (ms)  | Parallel (ms) | Speedup |
|------:|-------------:|--------------:|--------:|
|   100 |     0.009601 |      0.009177 |   1.046 |
|   500 |     0.270724 |      0.235964 |   1.147 |
|  1000 |     1.247940 |      1.060560 |   1.176 |
|  3000 |    22.257306 |     10.049784 |   2.215 |

From the table above, we can see that the parallel program is generally faster than the serial one, and the speedup grows with the dataset size. This is because the parallel program pays a fixed overhead to create, manage, and delete threads (and to move data to and from the GPU); for larger datasets this overhead becomes negligible compared to the work that is parallelized.

Serial Program Pastebin

Because the serial program is not required to be included in the repository, we provide it via Pastebin here.

Authors
