THECUBE Cluster
Revision as of 12:08, 21 September 2015
This is a private cluster.
Hardware
- Head node: thecube.cac.cornell.edu.
- access modes: ssh
- Rocks 6.1 with CentOS 6.3
- 32 compute nodes with Dual 8-core E5-2680 CPUs @ 2.7 GHz, 128 GB of RAM
- THECUBE Cluster Status: Ganglia.
- Submit HELP requests: help OR by sending email to: help@cac.cornell.edu, please include THECUBE in the subject area.
File Systems
Home Directories
- Path: ~
User home directories are located on an NFS export from the head node. Use your home directory (~) to archive the data you wish to keep. Do NOT use this file system for computation: bandwidth to the compute nodes is very limited and will quickly be overwhelmed by file I/O from large jobs.
Unless special arrangements are made, data in users' home directories are NOT backed up.
Scratch File System
A Lustre file system provided by Terascala and Dell.
- Path: /scratch/<user name>
The scratch file system is a fast parallel file system. Use it as scratch space for your jobs, then copy the results you want to keep back to your home directory for safekeeping.
Scheduler/Queues
- Maui/Torque scheduler
- Queues:
Name     Description  Time Limit
default  all nodes    no limit
Software
Set up the working environment for each package using the module command. The module command will activate dependent modules if there are any.
To show currently loaded modules:
-sh-4.1$ module list
Currently Loaded Modulefiles:
  1) openmpi-1.6.5-intel-x86_64
To show all available modules (as of Sept 30, 2013):
-sh-4.1$ module avail
----------------------------------- /usr/share/Modules/modulefiles -----------------------------------
dot           module-info   null            rocks-openmpi_ib
module-cvs    modules       rocks-openmpi   use.own
------------------------------------------ /etc/modulefiles ------------------------------------------
boost-1.54.0      mathematica-9.0             sas-9.3
cmake-2.8.11.2    matlab-r2013a               valgrind-3.8.1
eclipse-4.3       netcdf-4.3.0                visit-2.5.2
hdf5-1.8.11       openmpi-1.6.5-intel-x86_64  zlib-1.2.8
To load a module and verify:
-sh-4.1$ module load mathematica-9.0
-sh-4.1$ module list
Currently Loaded Modulefiles:
  1) openmpi-1.6.5-intel-x86_64   2) mathematica-9.0
To unload a module and verify:
-sh-4.1$ module unload mathematica-9.0
-sh-4.1$ module list
Currently Loaded Modulefiles:
  1) openmpi-1.6.5-intel-x86_64
SOFTWARE LIST
Software                         Path              Notes
Intel Compilers (including MKL)  /opt/intel        Included in user's default path.
Openmpi 1.6.5                    /opt/openmpi      Included in user's default path.
Mathematica                      /opt/Mathematica  module load mathematica-9.0
Matlab                           /opt/MATLAB       module load matlab-r2013a
SAS                              /opt/SAS          module load sas-9.3
Boost                            /opt/boost        module load boost-1.54.0
cmake                            /opt/cmake        module load cmake-2.8.11.2
eclipse                          /opt/eclipse      module load eclipse-4.3
hdf5                             /opt/hdf5         module load hdf5-1.8.11
netcdf                           /opt/netcdf       module load netcdf-4.3.0
valgrind                         /opt/valgrind     module load valgrind-3.8.1
visit                            /opt/visit        module load visit-2.5.2
zlib                             /opt/zlib         module load zlib-1.2.8
acml                             /opt/acml         AMD Core Math Library; no module file; not in default path
R, ffmpeg                        /usr/bin          in default path
BLAS, LAPACK libraries                             in default path
Thrust                                             Coming soon
Quick Tutorial
The batch system treats each core of a node as a "virtual processor." That means the nodes keyword in batch scripts refers to the number of cores that are scheduled.
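To make the core accounting concrete, here is a minimal shell sketch of the arithmetic for this cluster (32 nodes, 16 cores per node, per the Hardware section):

```shell
#!/bin/sh
# Each core is a "virtual processor" to the scheduler, so a
# whole-cluster job spans nodes x cores_per_node of them.
nodes=32
cores_per_node=16
virtual_processors=$((nodes * cores_per_node))
echo "$virtual_processors"   # 512
```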
Running an MPI Job on the Whole Cluster
- We assume /opt/openmpi/ is the default MPI, which it is on the THECUBE cluster. The mpiexec options may change depending on your selected MPI.
- First use showq to see how many cores are available. It may be less than 512 if a node is down.
-sh-4.1$ showq
ACTIVE JOBS--------------------
JOBNAME            USERNAME      STATE  PROC   REMAINING            STARTTIME

0 Active Jobs       0 of 512 Processors Active (0.00%)
                    0 of  32 Nodes Active      (0.00%)

IDLE JOBS----------------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME

0 Idle Jobs

BLOCKED JOBS----------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME

Total Jobs: 0   Active Jobs: 0   Idle Jobs: 0   Blocked Jobs: 0
- Next, create a script (using your favorite editor, e.g. vim) named runmyfile.sh containing the following lines:
#!/bin/sh
#PBS -l nodes=32:ppn=16 (note: the option is -l, a lowercase L)
#PBS -N test
#PBS -j oe
#PBS -S /bin/bash
set -x
cd "$PBS_O_WORKDIR"
mpiexec --hostfile $PBS_NODEFILE <executable> (replace <executable> with the program you wish to run)
- Submit the job to the cluster
-sh-4.1$ qsub runmyfile.sh
- Look for the job's output in a file named test.o<job id> (from the -N test and -j oe options above).
Running an MPI Job using 16 Tasks Per Node
Because the nodes have 16 physical cores, you may want to limit jobs to 16 tasks per node. The node file lists each node once, so make a copy with each node listed 16 times and hand that copy to MPI.
#!/bin/sh
#PBS -l nodes=64
#PBS -N test
#PBS -j oe
#PBS -S /bin/bash
set -x
cd "$PBS_O_WORKDIR"
# Construct a copy of the hostfile with only 16 entries per node.
# MPI can use this to run 16 tasks on each node.
uniq "$PBS_NODEFILE" | awk '{for(i=0;i<16;i+=1) print}' > nodefile.16way
# To run 16-way on 4 nodes, we request 64 cores (4 nodes x 16 cores)
mpiexec --hostfile nodefile.16way ring -v
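The uniq | awk pipeline above can be tried on a hand-made sample nodefile; the node names below are made up for illustration:

```shell
#!/bin/sh
# $PBS_NODEFILE normally lists one line per scheduled core, so each
# node appears many times. Simulate that with two made-up node names.
printf 'compute-1-1\ncompute-1-1\ncompute-1-2\ncompute-1-2\n' > nodefile.sample

# Same pipeline as in the job script: collapse to unique node names,
# then print each name 16 times.
uniq nodefile.sample | awk '{for(i=0;i<16;i+=1) print}' > nodefile.16way

# 2 distinct nodes x 16 entries each = 32 lines
wc -l < nodefile.16way
```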
Running Many Copies of a Serial Job
In order to run 30 separate instances of the same program, use the scheduler's task array feature, through the "-t" option. The "nodes" parameter here refers to a core.
#!/bin/sh
#PBS -l nodes=1 (note: the option is -l, a lowercase L)
#PBS -t 1-30
#PBS -N test
#PBS -j oe
#PBS -S /bin/bash
set -x
cd "$PBS_O_WORKDIR"
echo Run my job.
When you start jobs this way, separate jobs will pile one-per-core onto nodes like a box of hamsters.
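Within each task of the array, Torque sets $PBS_ARRAYID to that task's index, which is the usual way to give each instance its own input. A minimal sketch (the input/output file names are hypothetical, and PBS_ARRAYID is given a fallback so the snippet also runs outside the scheduler):

```shell
#!/bin/sh
# Under Torque each array task gets its own PBS_ARRAYID; fall back to 1
# here so the sketch runs outside the scheduler too.
PBS_ARRAYID=${PBS_ARRAYID:-1}

# Hypothetical naming scheme: task N reads input.N and writes output.N.
echo "task $PBS_ARRAYID: input.$PBS_ARRAYID -> output.$PBS_ARRAYID"
```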
Running on a specific node
To run on a specific node, use the host= resource option:
#!/bin/sh
#PBS -l host=compute-1-16 (note: the option is -l, a lowercase L)
#PBS -N test
#PBS -j oe
#PBS -S /bin/bash
set -x
cd "$PBS_O_WORKDIR"
echo Run my job.
Running an interactive job
From the command line:
qsub -l nodes=1 -I