Parallel programming requires different approaches and techniques from single-core programming. In this course, you will learn programming models, parallel architectures, and optimization techniques for multi-core CPUs and GPUs. The course aims to give you hands-on experience in parallelizing and optimizing applications such as matrix operations, reduction, and deep neural networks (DNNs) using std::thread, OpenMP, and CUDA.