Red Cloud with MATLAB/Queues


MATLAB job execution queues on Red Cloud

When MATLAB sends a job to Red Cloud for parallel or distributed execution, that job enters an execution queue, where it waits until resources become available. Different queues may have their own job priority policies, execution limits, hardware resources, or other capabilities. The CAC client for Red Cloud allows you to select the desired queue from the MATLAB console via:

>> ClusterInfo.setQueueName('QUEUE_NAME')

This setting is persisted locally across MATLAB sessions until it is changed. If no value has ever been explicitly set, "Default" is assumed. To see the current setting, type

>> ClusterInfo.getQueueName

A list of the available queues, along with the number of processing cores in each, can be obtained by running

>> getRedCloudStatus()

Default

The Default queue is the standard queue for most CPU-intensive jobs. Typically, this queue is configured to have the most CPU cores available. It can accommodate long-running distributed or parallel (including pool) jobs.

  • In the current configuration, 45 CPU cores are available in this queue.
    • Accordingly, 45 is the maximum number of workers you can request for a parallel or pool job; submission will fail if you request more (see the sketch after this list).
    • For a distributed job, any number of tasks can be requested, but no more than 45 will run at the same time.
  • Every core in this queue has access to 2GB of RAM.
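
For example, a pool job that stays within the 45-worker limit can be set up roughly as follows. This is only a sketch: it assumes the cacscheduler configuration used elsewhere on this page, myPoolFunction.m stands in for a function file of your own, and the worker count of 8 is arbitrary.

>> ClusterInfo.setQueueName('Default');
>> sched = findResource('scheduler','configuration','cacscheduler');
>> pj = createMatlabPoolJob(sched);             % one worker runs the task, the rest form the pool
>> set(pj,'MinimumNumberOfWorkers',8);          % 8 is arbitrary; do not request more than 45
>> set(pj,'MaximumNumberOfWorkers',8);
>> createTask(pj,@myPoolFunction,1,{});         % myPoolFunction.m is a placeholder for your own code
>> pj.FileDependencies = {'myPoolFunction.m'};
>> submit(pj); wait(pj);
>> pj.getAllOutputArguments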

Quick

The Quick queue provides fast turnaround for high-throughput jobs requiring 10 minutes or less. Jobs submitted to this queue must be explicitly configured to terminate in 10 minutes or less via ClusterInfo.setWallTime(10); otherwise they will fail due to a policy violation.

  • The Quick queue is currently configured with 4 cores.
    • Accordingly, 4 is the maximum number of workers you can request for a parallel or pool job; submission will fail if you request more.
  • Every core in this queue also has access to 2GB of RAM.

In order to select the Quick queue and limit subsequent jobs to 10 minutes or less, type in MATLAB:

>> ClusterInfo.setQueueName('Quick');
>> ClusterInfo.setWallTime(10);
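
Once the queue and wall time are set, a Quick-queue job is submitted like any other job. The following is a minimal sketch; shortTask.m is a placeholder for a function of your own, and each task must finish well within the 10-minute limit.

>> sched = findResource('scheduler','configuration','cacscheduler');
>> j = createJob(sched);
>> createTask(j,@shortTask,1,{1});              % one task per set of input arguments
>> createTask(j,@shortTask,1,{2});
>> j.FileDependencies = {'shortTask.m'};
>> submit(j); wait(j);
>> j.getAllOutputArguments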

GPU

The GPU queue executes jobs on hardware containing a GPU which is accessible to MATLAB. In this queue, MATLAB GPU computing functions are available for general use.

  • In the current configuration, there are 7 CPU cores; every CPU core has its own attached GPU.
    • For a distributed job, each core/GPU pair is assigned an independent task.
    • For a parallel or pool job, multiple core/GPU pairs can work in concert, as in the sketch after this list. (Note, however, that in a pool job the GPU of the "master" process often remains idle.)
  • Every CPU core has access to up to 10GB of RAM, while its attached GPU has 6GB of its own memory.
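
One way to see the core/GPU pairing in action is a small parallel job in which every worker reports which GPU it was assigned. The following is only an illustrative sketch: whichGPU.m is a hypothetical helper, and the worker count of 2 is arbitrary (up to 7 core/GPU pairs are available).

function out = whichGPU()
% Report this worker's index and the index of the GPU it sees.
g = gpuDevice;
out = [labindex g.Index];
end

>> ClusterInfo.setQueueName('GPU');
>> sched = findResource('scheduler','configuration','cacscheduler');
>> pj = createParallelJob(sched);
>> set(pj,'MinimumNumberOfWorkers',2);
>> set(pj,'MaximumNumberOfWorkers',2);
>> createTask(pj,@whichGPU,1,{});               % the task is replicated on every worker
>> pj.FileDependencies = {'whichGPU.m'};
>> submit(pj); wait(pj);
>> pj.getAllOutputArguments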

Any of the currently supported MATLAB releases can run jobs on the GPU. However, we recommend that you always use the latest software, because GPU functionality has been steadily improving with every release.

Example

Create a function that utilizes the GPU. For example, here is a simple function that compares the time to compute the FFT of a vector of 100 million random numbers on the CPU and on the GPU.

function [nongpu,gpu] = gputest()
% Compare the time to compute an FFT of 100 million random numbers on the CPU and on the GPU.
t = rand(100000000,1);
tic; fft(t); nongpu = toc;                            % FFT on the CPU
tic; gt = gpuArray(t); gather(fft(gt)); gpu = toc;    % copy to GPU, FFT, copy the result back
end

Specify the GPU queue, then create and run the job. The result shows that the GPU FFT calculation is at least twice as fast as the CPU version when performed on a vector of 100 million numbers.

>> ClusterInfo.setQueueName('GPU');
>> sched = findResource('scheduler','configuration','cacscheduler');
>> j = createJob(sched);
>> createTask(j,@gputest,2,{});
>> j.FileDependencies = {'gputest.m'};
>> submit(j);
>> wait(j);
Downloading completed job: Job1.
>> j.getAllOutputArguments

ans =

    [7.4863] [2.5106]

Hardware Details

The information below pertains to the system as configured at its launch in Oct. 2011. Since that time, one node with one 8-core processor and one GPU has been removed from service.

  • The CPU cores are Intel Xeon E5620 "Westmere-EP" cores running at 2.4 GHz.
    • The eight E5620 processors have 8 cores each, yielding a total of 64 cores for the resource.
    • The CPUs are maintained in Dell C6100 rack servers.
  • The 8 GPUs are Tesla M2070s, each capable of over 1 Tflop/s in single precision.
    • All GPUs are housed in a Dell C410x expansion chassis.