Homework assignments and final project for the Scalable High Performance Computing (M1522.006700) course, Autumn 2023.
The assignments gradually improve the performance of matrix-multiplication code using AVX-512, OpenMP, MPI, OpenCL, and CUDA.
All code was tested on a server cluster provided by the school, consisting of 12 nodes with four RTX TITAN GPUs each.
- HW1: Digging into Compilation Process, Tutorial on Slurm
- HW2: Calculating Theoretical Peak Performance, Matmul Using Pthread
- HW3: Investigating Cache Specifications, Matmul Using OpenMP
- HW4: Computing $\pi$ with MPI, Matmul with MPI
- HW5: Investigating GPU Specifications, Matmul with OpenCL
- HW6: Matmul Challenge with CUDA
- Final Project: Accelerating Text Classifier Model
- etc: OpenMP Trials