Home
rshipley160 edited this page Dec 5, 2021
Welcome to the knowledge base. All you need to get started is a CUDA-enabled device that you can run the example programs on, a basic understanding of C/C++ (some of the more complex elements like memory management and pointers will be reviewed as needed), and the desire to learn.
- What is Parallel Computing?
- What is a GPU?
- Basic CUDA Syntax
- Memory Management on the GPU
a. CUDA Memory Types
b. Using CUDA Memory
- Performance Experiment: On-GPU vs Off-GPU Bandwidth
- Thread and Block Scheduling
- Common Parallel Applications
a. Reduction
b. Matrix Multiplication
- Intro to Asynchronous Computing
- CUDA Streams
- Asynchronous Memory Transfers
- Performance Experiment: Multi-stream Parallelism
- Basic Synchronization Methods
- Events and Dependencies
- Performance Experiment: Event-Based Synchronization vs Explicit Synchronization
- CUDA Graphs
a. Creating graphs with stream capture
b. Creating graphs explicitly
c. Using host functions with graphs & streams
d. Graph node glossary
- Performance Experiment: Graphs vs Streams vs Synchronous Kernels
- Performance Experiment: Increasing the Amount of Graph Nodes