GPUs in Red Cloud

From CAC Documentation wiki
Revision as of 09:51, 24 August 2020 by Pzv2 (talk | contribs) (Added the Availability section)

(This page is under development.)

Red Cloud supports GPU computing on virtual machines (VMs) through two GPU instance flavors. To use a GPU in Red Cloud, select one of these flavors when you launch an instance. Red Cloud instances currently offer Nvidia Tesla T4 and Nvidia Tesla V100 GPU accelerators in the following flavors:

Flavor        CPUs   GPUs                  RAM
c4.t1.m20     4      1 Nvidia Tesla T4     20 GB
c14.g1.m60    14     1 Nvidia Tesla V100   60 GB

Availability

As of this writing, 20 T4s and 4 V100s are available for use in Red Cloud VMs. You can obtain up-to-date GPU usage here to help you determine whether resources are available before you start an instance. If you are new to Red Cloud, you should review how to read this documentation before launching an instance, especially the section on accounting.

Red Cloud does not have hyperthreading enabled, and GPUs are not oversubscribed. This means that when you create a GPU instance with a certain number of GPUs, you reserve that physical hardware for the life of the instance unless you shelve the instance to free the resources. If the resources are not available when you attempt to start an instance, because someone else has reserved them, you may receive an error saying that the instance cannot be created. It is therefore good practice to check availability before starting an instance and to shelve instances when they are not in use.
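Shelving can also be scripted. The following is a minimal sketch, not an official Red Cloud procedure. It assumes that Red Cloud is reachable through the OpenStack API and that the `openstacksdk` Python package is installed; neither assumption comes from this page. The function name `shelve_server_by_name` is hypothetical. The `conn` argument is expected to be an `openstacksdk` connection, for example the result of `openstack.connect(cloud="redcloud")`, where "redcloud" is a hypothetical entry in your `clouds.yaml`.

```python
# Sketch only: assumes an OpenStack-style API reached through openstacksdk.
# Server states from which a shelve request is normally accepted.
SHELVABLE_STATES = {"ACTIVE", "SHUTOFF", "PAUSED", "SUSPENDED"}


def shelve_server_by_name(conn, server_name):
    """Ask the cloud to shelve one server; return a short status string.

    `conn` is an openstacksdk connection (assumption, see lead-in).
    `server_name` is the name you gave the instance at launch.
    """
    server = conn.compute.find_server(server_name)
    if server is None:
        return "not found"
    if server.status not in SHELVABLE_STATES:
        # Already shelved, still building, or in an error state: do nothing.
        return "skipped (status {})".format(server.status)
    # The request is asynchronous; the GPU is released once shelving completes.
    conn.compute.shelve_server(server)
    return "shelve requested"
```

Note that a stopped instance is treated as shelvable here, on the reasoning that only shelving is described above as freeing the reserved hardware.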

Launching A GPU Instance

When launching an instance, you can either use a base Linux or Windows image and install your own GPU libraries, or select a CUDA source image such as (...). In either case, select a GPU-enabled flavor and configure the instance as you would any other instance.
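For readers who prefer to script the launch, the steps above might look roughly like the sketch below. It assumes that Red Cloud is reachable through the OpenStack API and that `conn` is an `openstacksdk` connection; these are assumptions, not statements from this page. The function name `launch_gpu_instance` is hypothetical, and the image, network, and key pair names are left as parameters because this page does not specify them. The default flavor is the T4 flavor listed above.

```python
# Sketch only: assumes an OpenStack-style API reached through openstacksdk.
def launch_gpu_instance(conn, name, image_name, network_name, key_name,
                        flavor_name="c4.t1.m20"):
    """Create a server with a GPU flavor and wait until it is ready.

    All names are looked up first so that a typo fails early with a clear
    message instead of a partially created instance.
    """
    image = conn.image.find_image(image_name)
    flavor = conn.compute.find_flavor(flavor_name)
    network = conn.network.find_network(network_name)

    lookups = (("image", image), ("flavor", flavor), ("network", network))
    missing = [label for label, found in lookups if found is None]
    if missing:
        raise LookupError("not found: " + ", ".join(missing))

    server = conn.compute.create_server(
        name=name,
        image_id=image.id,
        flavor_id=flavor.id,
        networks=[{"uuid": network.id}],
        key_name=key_name,
    )
    # Blocks until the server is usable; openstacksdk raises an exception if
    # the build fails, for example when no GPU of that type is free.
    return conn.compute.wait_for_server(server)
```

Passing `flavor_name="c14.g1.m60"` would request the V100 flavor instead.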

Once your instance is launched, you will have access to the GPU within the VM and can install software such as PyTorch or TensorFlow that will use the GPU.
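Before installing a framework, it can help to confirm that the GPU is actually visible inside the VM. The sketch below is one possible check, assuming a Linux or Windows VM on which the NVIDIA driver, and therefore the `nvidia-smi` utility, is already installed. The helper names `parse_gpu_lines` and `list_gpus` are hypothetical.

```python
import shutil
import subprocess

# Ask nvidia-smi for one CSV line per GPU: "<name>, <total memory>".
QUERY = ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"]


def parse_gpu_lines(text):
    """Turn nvidia-smi CSV output into a list of (name, memory) tuples."""
    gpus = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, memory = line.partition(",")
        gpus.append((name.strip(), memory.strip()))
    return gpus


def list_gpus():
    """Return the GPUs visible to this machine, or [] if none can be queried."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver utilities not installed, or not on PATH
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return []  # nvidia-smi is present but could not talk to a GPU
    return parse_gpu_lines(result.stdout)


if __name__ == "__main__":
    gpus = list_gpus()
    if not gpus:
        print("No GPU visible; check the flavor and the NVIDIA driver.")
    for name, memory in gpus:
        print("Found {} with {}".format(name, memory))
```

An empty result on a GPU flavor usually points to a missing driver rather than missing hardware, which is why the check looks for `nvidia-smi` first.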

For more information on GPU and CUDA computing, see the Cornell Virtual Workshop "Introduction to GPGPU and CUDA Programming: Overview".