Online class
Instructor: Duksu Kim (bluekds at koreatech.ac.kr / #435, 2nd Eng. Building)
Office hours: Mon. 14:00~16:00
(Required) C Programming
(Strongly Recommended) Multi-core Programming (Undergraduate level)
(Recommended) System Programming, Data Structures
(Required) PC or Laptop with a multi-core CPU / (Recommended) PC or Laptop with an NVIDIA GPU
We will lend you a development kit (e.g., a Jetson kit) for CUDA if you need one.
However, you must provide your own monitor, keyboard, and mouse to use it.
Windows Dev. environments [Kor]
Linux(Ubuntu) Dev. environments on Jetson Kit [Kor]
Troubleshooting
Q. My laptop has an NVIDIA GPU, but CUDA does not work properly.
A. Check whether your laptop uses a hybrid GPU system (e.g., Intel HD Graphics + NVIDIA GPU).
If it does, disabling the Intel GPU in your OS's Device Manager may fix the problem.
What Is Parallel Computing, and Why?
Parallel Program Performance
Parallel Program Design
Parallel Processing Hardware
Heterogeneous Computing
OpenMP introduction
Parallel construct
Work-sharing construct
Scope of Variables
Synchronization construct & Locks
Nested parallelism
Introduction to GPGPU
Hello CUDA
Basic Workflow of CUDA
CUDA Thread Hierarchy
Organizing Threads
CUDA Execution Model
CUDA Memory Model & Performance
Using Shared Memory
Maximizing Memory Throughput
Synchronization
CUDA Stream & Concurrent Execution
CUDA Event
Multi-GPUs and Heterogeneous Computing
Lecture slides
Fast Filtering of LiDAR Point Cloud in Urban Areas Based on Scan Line Segmentation and GPU Acceleration [paper]
Energy-efficient execution of data-parallel application on heterogeneous mobile platforms [paper]
Safety view management for augmented reality based on MapReduce strategy on multi-core processors [paper]
Real-Time Face Detection and Tracking Utilising OpenMP and ROS [paper]
M-DTM: Migration-based dynamic thermal management for heterogeneous mobile multi-core processors [paper]
Parallel Processing for Data Deduplication [paper]
P. Sobe et al., PARS-Mitteilungen, 2015
Presented by In-Chul Hwang (Slides) (Video)
Robust Dynamic Resource Allocation via Probabilistic Task Pruning in Heterogeneous Computing Systems [paper]
Parallel K Nearest Neighbor Matching for 3D Reconstruction [paper]
Parallel Scheduled Sampling [paper]
Duckworth et al., arXiv:1906.04331, Jun 2019
Presented by Jin-Hwan Kim (Slides) (Video)
Neural Network Implementation using CUDA and OpenMP [paper]
Jang et al., Digital Image Computing: Techniques and Applications, 2008
Presented by Jae-Min Sa (Slides) (Video)
SandTrap: Tracking information flows on demand with parallel permissions [paper]
Razeen et al., MobiSys, 2018
Presented by Euihyeok Lee (Slides) (Video)
CPU and GPU Parallel Processing for Mobile Augmented Reality [paper]
Baek et al., International Congress on Image and Signal Processing (CISP), 2013
Presented by Juhwan Lee (Slides) (Video)
Light Field Depth Estimation on Off-the-Shelf Mobile GPU [paper]
Ivan et al., CVPR Workshops, 2018
Presented by Ye-Chan Choi (Slides) (Video)
DeepSense: A GPU-based Deep Convolutional Neural Network Framework on Commodity Mobile Devices [paper]
Huynh et al., Workshop on Wearable Systems and Applications, 2016
Presented by Sang-Won Hwang (Slides) (Video)
Performance and Scalability of GPU-based Convolutional Neural Networks [paper]
Strigl et al., IEEE Euromicro Conference on Parallel, Distributed and Network-based Processing, 2010
Presented by Joon-Ho Park (Slides) (Video)
Flexible, High Performance Convolutional Neural Networks for Image Classification [paper]
Ciresan et al., Twenty-Second International Joint Conference on Artificial Intelligence, 2011
Presented by Joon-Ho Park (Slides) (Video)