Advanced GPU Computing course

This folder contains the material for an advanced GPU computing course taught in 2023 at the Swiss National Supercomputing Centre (CSCS), ETH Zurich.

Part 1

The first part of the course, taught by Tim Besard (JuliaHub), focuses on advanced usage of CUDA.jl and on analyzing and optimizing GPU applications written in Julia. It covers:

- Advanced usage of CUDA.jl
  - library integrations and wrappers (CUDA driver API, cuBLAS, etc.)
  - programming models (array abstractions, kernels)
  - memory management
  - task-based concurrent GPU computing
- Performance deep-dive
  - application analysis and optimization (using Nsight Systems)
  - kernel analysis and optimization (using Nsight Compute)

A YouTube recording is available, with the following key timestamps:

00:00: Introduction to the course
03:23: Introduction to part 1
04:59: Presentation of notebook 1-0: Introduction
24:19: Presentation of notebook 1-1: Array programming
43:18: Presentation of notebook 1-2: Application analysis and optimization
1:33:22: Presentation of notebook 1-3: Kernel programming
2:25:23: Presentation of notebook 1-4: Kernel analysis and optimization
3:19:16: Presentation of notebook 2-1: CUDA libraries
3:41:08: Presentation of notebook 2-2: Memory management
4:03:44: Presentation of notebook 2-3: Concurrent computing

Part 2

The second part of the course, taught by Samuel Omlin (CSCS), deals with more concrete examples that matter to the HPC community. A YouTube recording is also available, with the following key timestamps:

00:51: High-speed introduction/thoughts on GPU supercomputing
08:38: Overview of the course notebooks of part 1
11:08: Presentation of notebook 1: Memory copy and performance evaluation
43:59: Walkthrough of the solutions to notebook 2: Application performance evaluation and optimization
58:29: Presentation on sustainable HPC building block development in Julia
1:27:56: Walkthrough of the solutions to notebook 3: Using shared memory
1:37:35: Walkthrough of the solutions to notebook 4: Steering registers and using warp-level functions
1:57:02: Walkthrough of the solutions to notebook 5: Distributed parallelization
