ECE 8823: GPU Architectures – Spring 2018

Prerequisite: ECE 4100/6100 or CS 4290/6290

Course Description: The last 8 years have seen the emergence of general-purpose graphics processing units (GPUs) as vehicles for accelerating general-purpose scientific, enterprise, and embedded applications. This emergence has coincided with the explosive growth of data-parallel applications and the ascendance of energy efficiency as a driver of performance scalability. The research community has developed a body of compiler and microarchitecture knowledge to address important bottlenecks to harnessing the enormous throughput and memory bandwidth of modern GPUs. This course introduces the basic organizational principles of the major components of a general-purpose GPU architecture. The course begins with coverage of a commodity language (CUDA) that implements the Single Instruction Multiple Thread (SIMT) programming model and introduces basic programming abstractions and idioms. It then provides in-depth coverage of important microarchitecture concepts and performance optimizations for the efficient implementation of the SIMT model, elaborating state-of-the-art techniques for performance optimization through coverage of the latest papers in leading international journals and conferences, augmented with key patents and class notes. A series of programming assignments and a class project reinforce these concepts.

Course Texts:
D. Kirk and W. Hwu, “Programming Massively Parallel Processors: A Hands On Approach,” Morgan Kaufmann (pubs), Second Ed., 2012, ISBN 978-0-12-415992-
Journal & Conference papers, patents, class notes (distributed in Tsquare under Resources)

Publisher Website: Supplemental Material
Course Syllabus: Syllabus
Class Resources: Resource Page
Planned Lecture and Assignment Schedule: Schedule
(note this can change +/- a few days depending on progress)

Instructor: Sudhakar Yalamanchili
Contact Information: KACB 2316, Email:, Tel: 404-894-2940
Office Hours:  T 2:30 – 4, WF 3:00 – 4:30, Other times by appointment (see schedule at Sudhakar Yalamanchili)

Class TAs:
Blaise Tine,
Office hours: M 3-4:30 pm, Th 3 – 4 pm
Location: KACB 2305

Sana Damani,
Office Hours: M 10:30-11:30 am, Th 11 – 12 pm
Location: KACB 2305

Exam Schedule:
Midterm: Monday, March 5th, 2018
Final Exam: Wednesday, May 2nd, 2018, 11:30 am – 2:20 pm

Look to TSquare for distribution and submission of assignments and associated materials such as conference and journal papers.

Attendance: Students are responsible for all material covered in class, including changes in exam schedules announced in class. Make-up exams will be considered only if the student informs the instructor of the absence prior to the exam date, or, when prior information was not possible, immediately following the exam. Make-up exams will not be the same as the exam given in class.

Academic Honesty: Although students are strongly encouraged to work together to learn the course material, all students are expected to complete assignments and exams individually, following all instructions stated in conjunction with the assignments and exams. All conduct in this course will be governed by the Georgia Tech honor code. Additionally, it is expected that students will respect their peers and the instructor such that no one takes unfair advantage of anyone else associated with the course. Any suspected cases of academic dishonesty will be reported to the Dean of Students for further action. Additional details of the Georgia Tech honor code can be found here.

Module 0
Lecture: Overview (pptx, pdf)

Module 1
Lecture: Introduction (pptx, pdf)
Reading: Chapter 1, Section 2.2, Section 2.3

Module 2
Last Updated: 1/12/2018
Lecture: Introduction to CUDA C (ppt, pdf)
Reading: Chapter 3

Module 3
Last Updated: 1/26/2018 (II), 2/1/2018 (OpenCL)
Lectures:
Data Parallel Execution – I (pptx, pdf)
Data Parallel Execution – II (pptx, pdf)
Data Parallel Execution – III (pptx, pdf)
Introduction to OpenCL 2.0 (pptx, pdf)
Reading: Chapter 4
Notes/Additional References: Occupancy calculator

Module 4
Last Updated: 1/18/2018
Lectures:
CUDA Memory Model (pptx, pdf)
Shared Memory (pptx, pdf)
Reading: Chapter 5

Module 5
Last Updated: 2/2/2018 (I), 1/27/2018 (II)
Lectures:
Program Mapping and Execution – I (pptx, pdf)
Program Mapping and Execution – II (ptx, pdf)
Reading: Chapter 6, Chapter 8.4

Module 6
Last Updated: 2/13/2018 (II)
Lectures:
Microarchitecture – I: Kernel Execution (pptx, pdf)
Microarchitecture – II: SM Microarchitecture (pptx, pdf)
Microarchitecture – III: Register File (pptx, pdf)
Reading: See posted papers and patents in Tsquare
Notes/Additional References: Harmonica GPU

Module 7
Last Updated: 4/3/2018 (II), 3/14/2018 (CCC), 3/27/2018 (HARP & A-4)
Lectures:
Control Divergence – I (pptx, pdf)
Control Divergence – II (pptx, pdf)
Reading: See posted conference and journal papers in Tsquare
Notes/Additional References:
CCC Slides (pptx, pdf)
HARP Overview (pptx, pdf)
HARP Assignment 4 (pptx, pdf)

Module 8
Lecture: Memory Divergence & Optimizations

Module 9
Last Updated: 4/18/2018
Lecture: Scheduling (pptx, pdf)
Reading: See posted conference and journal papers.

Module 10
Last Updated: 4/18/2018
Lecture: Power & Energy Optimizations (pptx, pdf)
Reading: See posted conference and journal papers.
Notes/Additional References:
Dataflow (ppt, pdf)
Systolic Computation (ppt, pdf)
See posted conference and journal papers.

Module 11
Lecture: Advanced Topics

Module 12
Lecture: Webinars