Each node of the Stampede machine contains 16 cores with shared access to 32 GB of memory. This module explains how to write programs that use these cores effectively for concurrent computation with OpenMP.
At the node level, you have several options for utilizing the 16 cores. You could run 16 separate tasks on the node that communicate by passing messages through MPI (the Message Passing Interface). You could run a single task that uses 16 threads for concurrent computation through OpenMP. Or you could use some combination of the two. Programming with MPI is introduced in the MPI module.
Programs that require more than 16 cores will use several nodes of the Stampede machine. Because memory is not shared between nodes, tasks on separate nodes must communicate through MPI. Within each node, however, these programs can still use OpenMP to exploit the multiple cores, a style called hybrid programming.
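To make the single-task, multi-threaded option concrete, here is a minimal sketch in C of one task sharing a loop among OpenMP threads. The function names `parallel_sum` and `available_threads` are hypothetical, chosen only for illustration. The `#ifdef _OPENMP` guards are an assumption about how you might want the code to behave when it is compiled without OpenMP support, in which case it simply runs serially.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>   /* OpenMP runtime routines such as omp_get_max_threads() */
#endif

/* Hypothetical example: sum the integers 1..n within a single task,
 * dividing the loop iterations among the threads of that task. */
long long parallel_sum(long long n)
{
    long long total = 0;

    /* Each thread accumulates a private partial sum; the reduction
     * clause combines the partial sums into total when the loop ends. */
    #pragma omp parallel for reduction(+:total)
    for (long long i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

/* Hypothetical helper: the number of threads a parallel region would
 * use by default, or 1 if the code was compiled without OpenMP. */
int available_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

Called from a `main` function, `parallel_sum(16)` returns 136 regardless of how many threads share the work. With GCC, OpenMP is enabled by compiling with `-fopenmp`, and the number of threads can be set at run time through the `OMP_NUM_THREADS` environment variable (for example, 16 to match the cores on one node). Other compilers use different flags, so check the documentation for the compiler you use.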