Difference between revisions of "GPUs in Red Cloud"

From CAC Documentation wiki
 
(11 intermediate revisions by 3 users not shown)
Line 1: Line 1:
''(This page under development)''
+
[[Red Cloud]] supports GPU computing featuring '''[https://www.nvidia.com/en-us/data-center/tesla-t4/ Nvidia Tesla T4]''' and '''[https://www.nvidia.com/en-us/data-center/tesla-v100/ Nvidia Tesla V100]''' GPUs. To use a GPU, launch an instance with one of the following 2 flavors (instance types):
 
 
[[Red Cloud]] supports GPU computing.  To use a GPU in Red Cloud, you must select a flavor of machine with a GPU. Red Cloud features '''[https://www.nvidia.com/en-us/data-center/tesla-t4/ Nvidia Tesla T4]''' and '''[https://www.nvidia.com/en-us/data-center/tesla-v100/ Nvidia Tesla V100]''' GPUs in these flavors:
 
  
 
{| border="1" cellspacing="0" cellpadding="10" align="center" style="text-align:center;"
 
{| border="1" cellspacing="0" cellpadding="10" align="center" style="text-align:center;"
Line 15: Line 13:
  
 
== Availability ==
 
== Availability ==
Red Cloud has 20 T4s and 4 V100s available.  You can see how many are in use at any given time [https://gpus.redcloud.cac.cornell.edu/usage here]. Always check to see if GPU resources are available to start your instance. If you are new to Red Cloud you should review [[Red_Cloud#How_To_Read_This_Documentation|how to read this documentation]] before launching an instance, especially the section on [[Red_Cloud#Accounting:_Don.27t_Use_Up_Your_Subscription_by_Accident.21|accounting]]. GPUs are not oversubscribed.  This means when you create a GPU instance with a certain number of GPUs, you are reserving the physical hardware for the duration of the life of your instance unless it is '''''[[OpenStack#Instance_States|shelved]]''''' to free the resources. If the resources are not available when you attempt to start an instance - because someone else has reserved them - then you will receive an error that they cannot be created.  Therefore, it is good to check availability before starting an instance, and also shelving instances when not in use.
+
Red Cloud has 20 T4 GPUs and 4 V100 GPUs.  You can see how many are available for use [https://gpus.redcloud.cac.cornell.edu/usage here]. If no GPU is available, you will receive an error when launching a GPU instance.  
 +
 
 +
Red Cloud resources (CPU cores, RAM, GPUs) are not oversubscribed.  When you create a GPU instance, you are reserving the physical hardware for the duration of the life of your instance (and your subscription will be charged accordingly) until the instance is deleted or '''''[[OpenStack#Instance_States|shelved]]''''' to free the resources.  
 +
 
 +
'''If you are new to Red Cloud''' please review [[Red_Cloud#How_To_Read_This_Documentation|how to read this documentation]] before launching an instance, especially the section on [[Red_Cloud#Accounting:_Don.27t_Use_Up_Your_Subscription_by_Accident.21|accounting]].
  
 
== Launching A GPU Instance ==
 
== Launching A GPU Instance ==
When '''[[OpenStack#Launch_an_Instance|launching an instance]]''', you can use either the base [[Red_Cloud_Linux_Instances|Linux]] or [[Red_Cloud_Windows_Instances|Windows]] instances and install your own GPU libraries, or select CUDA source images such as (...).  Next, select a GPU-enabled flavor and configure the instance as you would any other instance.  Once your instance is launched, you will have access to the GPU within the VM and can install software (e.g., pytorch, tensorflow) that will use the GPU.
+
When '''[[OpenStack#Launch_an_Instance|launching a GPU instance]]''', you can use the base [[Red_Cloud_Linux_Instances|Linux]] or [[Red_Cloud_Windows_Instances|Windows]] image and install your own software or libraries that utilizes the GPU. To speed up time to science, CAC also provides 2 Linux GPU images with GPU software installed.
  
 
For more information on GPU and CUDA computing, see the Cornell Virtual Workshop "'''[https://cvw.cac.cornell.edu/gpu/ Introduction to GPGPU and CUDA Programming: Overview]'''"
 
For more information on GPU and CUDA computing, see the Cornell Virtual Workshop "'''[https://cvw.cac.cornell.edu/gpu/ Introduction to GPGPU and CUDA Programming: Overview]'''"
Line 25: Line 27:
  
 
* [https://redcloud.cac.cornell.edu/dashboard/ngdetails/OS::Glance::Image/e096c762-473c-440b-9516-19211c255ad2 gpu-accelerated-ubuntu-2020-08] (based on Ubuntu 18.04 LTS)
 
* [https://redcloud.cac.cornell.edu/dashboard/ngdetails/OS::Glance::Image/e096c762-473c-440b-9516-19211c255ad2 gpu-accelerated-ubuntu-2020-08] (based on Ubuntu 18.04 LTS)
* gpu-accelerated-centos-2020-08 (based on CentOS 7.8)
+
* [https://redcloud.cac.cornell.edu/dashboard/ngdetails/OS::Glance::Image/516f21bc-07a8-4546-b052-982028a3d04e gpu-accelerated-centos-2020-08] (based on CentOS 7.8)
  
 
These images include the following software:
 
These images include the following software:
 
# CUDA 10.1
 
# CUDA 10.1
# Anaconda python with these packages
+
# Anaconda Python 3 with these packages
## Tenserflow
+
## TensorFlow
 
## PyTorch
 
## PyTorch
 
## Keras
 
## Keras
Line 36: Line 38:
 
# Matlab R2019a.
 
# Matlab R2019a.
  
See [[Red Cloud GPU Image Usage Documentation]] for more details and sample code.
+
See '''[[Red Cloud GPU Image Usage]]''' page for more details and sample code.

Latest revision as of 13:27, 5 November 2020

Red Cloud supports GPU computing featuring Nvidia Tesla T4 and Nvidia Tesla V100 GPUs. To use a GPU, launch an instance with one of the following 2 flavors (instance types):

Flavor       CPUs   GPUs                  RAM
c4.t1.m20    4      1 Nvidia Tesla T4     20 GB
c14.g1.m60   14     1 Nvidia Tesla V100   60 GB
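
If you pick flavors from a script, the table above can be kept as a small lookup. The following is a minimal sketch: the flavor names and sizes are copied from the table, while the `FLAVORS` dictionary and the `flavors_for` helper are illustrative names, not part of any Red Cloud tool.

```python
# Flavor data copied from the table above; the structure and helper are illustrative.
FLAVORS = {
    "c4.t1.m20": {"cpus": 4, "gpus": 1, "gpu_model": "Nvidia Tesla T4", "ram_gb": 20},
    "c14.g1.m60": {"cpus": 14, "gpus": 1, "gpu_model": "Nvidia Tesla V100", "ram_gb": 60},
}


def flavors_for(gpu_keyword, min_ram_gb=0):
    """Return sorted flavor names whose GPU model contains gpu_keyword
    (case-insensitive) and whose RAM is at least min_ram_gb."""
    keyword = gpu_keyword.lower()
    return sorted(
        name
        for name, spec in FLAVORS.items()
        if keyword in spec["gpu_model"].lower() and spec["ram_gb"] >= min_ram_gb
    )
```

For example, `flavors_for("V100")` selects the V100 flavor only, and `flavors_for("tesla", min_ram_gb=30)` filters out the 20 GB flavor.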

Availability

Red Cloud has 20 T4 GPUs and 4 V100 GPUs. You can see how many are available for use at https://gpus.redcloud.cac.cornell.edu/usage. If no GPU is available, you will receive an error when launching a GPU instance.
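
The availability arithmetic can be sketched as follows. This assumes you have in-use counts per GPU model (for example, read by hand from the usage page); the format of that page is not described here, so the sketch does not try to fetch or parse it. The totals come from this page, and the function names are illustrative.

```python
# Totals are taken from this page; in-use counts must be supplied by the caller.
TOTAL_GPUS = {"T4": 20, "V100": 4}


def gpus_free(in_use):
    """Return the number of free GPUs per model, given in-use counts per model.

    Models missing from in_use are treated as having 0 GPUs in use.
    """
    free = {}
    for model, total in TOTAL_GPUS.items():
        used = in_use.get(model, 0)
        if not 0 <= used <= total:
            raise ValueError(f"{model}: in-use count {used} is outside 0..{total}")
        free[model] = total - used
    return free


def can_launch(model, in_use):
    """True if at least one GPU of this model is free (each GPU flavor uses 1 GPU)."""
    return gpus_free(in_use)[model] >= 1
```

If the usage page already reports free GPUs directly, only the final comparison in `can_launch` is needed.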

Red Cloud resources (CPU cores, RAM, GPUs) are not oversubscribed. When you create a GPU instance, you reserve the physical hardware for the life of the instance, and your subscription is charged accordingly, until the instance is deleted or shelved to free the resources.

If you are new to Red Cloud, please review how to read this documentation before launching an instance, especially the section on accounting.

Launching A GPU Instance

When launching a GPU instance, you can use the base Linux or Windows image and install your own software or libraries that utilize the GPU. To speed up time to science, CAC also provides 2 Linux GPU images with GPU software installed.
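
For scripted launches, the same steps can be sketched with the openstacksdk Python library. This is an illustration under assumptions, not CAC's documented procedure: it assumes openstacksdk is installed and that you already have a working connection to Red Cloud. The function name, the cloud entry, and the network and key pair names in the usage comment are hypothetical.

```python
def launch_gpu_instance(conn, name, image_name, flavor_name, network_name, key_name):
    """Create a server on a GPU flavor and wait for it to become ACTIVE.

    conn is expected to be an openstack.connection.Connection (openstacksdk).
    """
    # The find_* calls return None when nothing matches the given name.
    image = conn.image.find_image(image_name)
    flavor = conn.compute.find_flavor(flavor_name)
    network = conn.network.find_network(network_name)
    missing = [
        label
        for label, obj in (("image", image), ("flavor", flavor), ("network", network))
        if obj is None
    ]
    if missing:
        raise LookupError("not found: " + ", ".join(missing))

    server = conn.compute.create_server(
        name=name,
        image_id=image.id,
        flavor_id=flavor.id,
        networks=[{"uuid": network.id}],
        key_name=key_name,
    )
    # wait_for_server is expected to raise if the server ends up in ERROR,
    # which is one way a launch with no free GPU may surface.
    return conn.compute.wait_for_server(server)


# Usage (all names below except the flavor and image are hypothetical):
#   import openstack
#   conn = openstack.connect(cloud="redcloud")
#   launch_gpu_instance(conn, "my-gpu-vm", "gpu-accelerated-ubuntu-2020-08",
#                       "c4.t1.m20", "my-network", "my-key")
```

Remember that the GPU stays reserved, and your subscription is charged, until the instance is deleted or shelved.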

For more information on GPU and CUDA computing, see the Cornell Virtual Workshop "Introduction to GPGPU and CUDA Programming: Overview".

GPU images

  * gpu-accelerated-ubuntu-2020-08 (based on Ubuntu 18.04 LTS)
  * gpu-accelerated-centos-2020-08 (based on CentOS 7.8)

These images include the following software:

  1. CUDA 10.1
  2. Anaconda Python 3 with these packages
    1. TensorFlow
    2. PyTorch
    3. Keras
  3. Docker-containerized Jupyter Notebook servers, and
  4. Matlab R2019a.
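
Once an instance is running, a quick check that the GPU is visible can be sketched as below. It assumes the Nvidia driver's `nvidia-smi` tool is on the PATH and that the installed TensorFlow is recent enough to provide `tf.config.list_physical_devices` (older releases may need a different call). The function names are illustrative, and frameworks that are not installed are reported as `None`.

```python
import shutil
import subprocess


def nvidia_smi_gpus():
    """Return GPU names reported by nvidia-smi, or [] if the tool is absent or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def framework_report():
    """Report whether PyTorch and TensorFlow see a GPU; None means not installed."""
    report = {"torch": None, "tensorflow": None}
    try:
        import torch

        report["torch"] = torch.cuda.is_available()
    except ImportError:
        pass
    try:
        import tensorflow as tf

        report["tensorflow"] = len(tf.config.list_physical_devices("GPU")) > 0
    except ImportError:
        pass
    return report


if __name__ == "__main__":
    print("nvidia-smi:", nvidia_smi_gpus())
    print("frameworks:", framework_report())
```

An empty list from `nvidia_smi_gpus()` on a GPU flavor usually points to a missing driver or a non-GPU flavor rather than a framework problem.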

See the Red Cloud GPU Image Usage page for more details and sample code.