This online course is sponsored by the Extreme Science and Engineering Discovery Environment (XSEDE), which provides resources, data, tools, and expert support to researchers and educators.
Applications of Parallel Computers
Jim Demmel

The set of lectures below is an online rendition of Applications of Parallel Computers taught at U.C. Berkeley in Spring 2012. The course was taught by Professor Jim Demmel with assistance from Teaching Assistants Nick Knight and Brian Van Straalen.

This course teaches graduate and advanced undergraduate students from diverse departments how to use parallel computers both efficiently and productively, i.e. how to write programs that run fast while minimizing programming effort. This is increasingly important since essentially all computers are (becoming) parallel, from supercomputers to laptops. The course is taught online and includes lecture materials, quizzes, and programming exercises. As a prerequisite, students should ideally have some programming experience in C or a similar language.

This online course was produced by CAC with materials shared by Jim Demmel.

Note: The materials occasionally mention lab machines that were used in an earlier course offering. Details for the assignments, including which XSEDE machines to use, will be provided separately in the assignment links and pages.


Note: Work through the lectures in the order shown below. When referring to a lecture number, please use the number in the list below or between the red lines on that lecture's web page. Do not refer to the lecture number shown in the video, as the lectures were originally given in a different order.

  1. Introduction
  2. Single Processor Machines: Memory Hierarchies and Processor Features
  3. Introduction to Parallel Machines and Programming Models
  4. Sources of Parallelism and Locality in Simulation - Part 1
  5. Sources of Parallelism and Locality in Simulation - Part 2
  6. Shared Memory Programming: Threads and OpenMP, and Tricks with Trees
  7. Distributed Memory Machines and Programming
  8. Partitioned Global Address Space Programming with Unified Parallel C
  9. Introduction to GPUs by Bryan Catanzaro
  10. Dense Linear Algebra - Part 1
  11. Dense Linear Algebra - Part 2
  12. Graph Partitioning
  13. Automatic Performance Tuning and Sparse-Matrix-Vector-Multiplication (SpMV)
  14. Hierarchical Methods for the N-Body Problem
  15. Structured Grids
  16. Cloud Computing with MapReduce and Hadoop by Matei Zaharia
  17. Architecting Parallel Software with Patterns by Kurt Keutzer
  18. Parallel Fast Fourier Transform (FFT)
  19. Dynamic Load Balancing

Guest lectures: