ATLAS2 Cluster
Getting Started
General Information
- ATLAS2 is a private cluster with restricted access to the bs54_0001 group.
- Head node: atlas2.cac.cornell.edu (access via ssh)
- 55 compute nodes c00[01-16, 31-48,50-70]
- Current Cluster Status: Ganglia.
- Please send any questions and report problems to: cac-help@cornell.edu
How To Login
- To get started, login to the head node atlas2.cac.cornell.edu via ssh (a sample command is shown below).
- If you are unfamiliar with Linux and ssh, we suggest reading the Linux Tutorial and looking into how to Connect to Linux before proceeding.
- You will be prompted for your CAC account password
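For example, from a terminal on your local machine (replace <your_CAC_username> with your own CAC account name; the placeholder is illustrative):
ssh <your_CAC_username>@atlas2.cac.cornell.edu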
Hardware
All nodes have hyperthreading turned on and are Xeon generations that support the SSE4.2 vector extensions.
Node Names       | Memory per node | Model name                            | Processor count per node | Core(s) per socket | Sockets | Thread(s) per core
c00[01-12]       | 94GB            | Intel(R) Xeon(R) CPU X5690 @ 3.47GHz  | 24                       | 6                  | 2       | 2
c00[13-16]       | 47GB            | Intel(R) Xeon(R) CPU X5670 @ 2.93GHz  | 24                       | 6                  | 2       | 2
c00[31-48,50-58] | 47GB            | Intel(R) Xeon(R) CPU X5670 @ 2.93GHz  | 24                       | 6                  | 2       | 2
c00[59-70]       | 47GB            | Intel(R) Xeon(R) CPU X5690 @ 3.47GHz  | 24                       | 6                  | 2       | 2
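To check the configuration Slurm reports for a particular node, you can query it from the head node; a small example using the first node from the table above:
scontrol show node c0001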
Networking
- All nodes have a 1 Gb Ethernet connection for eth0 on a private network served out from the atlas2 head node.
- All nodes have an Infiniband connection:
- InfiniPath_QLE7340n (QDR speed, 8Gbits/sec)
Running Jobs with Slurm
For detailed information and a quick-start guide, see the Slurm page.
ATLAS2 Queues/Partitions
("Partition" is the term used by Slurm)
- hyperthreading is turned on for ALL nodes
- all partitions have a default time limit of 1 hour
- ATLAS2 has 5 separate queues:
Queue/Partition     | Number of nodes | Node Names              | Limits
short (default)     | 31              | c00[13-16,31-48,50-58]  | walltime limit: 4 hours
long                | 22              | c00[13-16,31-48]        | walltime limit: 504 hours
inter (interactive) | 12              | c00[59-70]              | walltime limit: 168 hours
bigmem              | 12              | c00[01-12]              | maximum of 12 nodes; walltime limit: 168 hours
normal              | 55              | c00[01-16,31-48,50-70]  | walltime limit: 4 hours
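To submit to a partition other than the default, pass it with -p. For example, a sketch of starting an interactive shell in the inter partition (the default time limit of 1 hour applies unless you request more with --time):
srun -p inter --time=02:00:00 --pty /bin/bash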
Example in Short Partition/Queue
Example sbatch file to run a job in the short partition/queue; save as example.sh:
#!/bin/bash
## -J sets the name of the job
#SBATCH -J TestJob
## -p sets the partition (queue)
#SBATCH -p short
## 10 min walltime
#SBATCH --time=00:10:00
## sets the tasks per core (default=2; keep the default if you want to take advantage of hyperthreading)
## 1 gives each task a whole core; the default of 2 places two tasks per core via hyperthreading
#SBATCH --ntasks-per-core=1
## request 4GB per CPU (may limit # of tasks, depending on total memory)
#SBATCH --mem-per-cpu=4GB
## define the job's stdout file
#SBATCH -o testshort-%j.out
## define the job's stderr file
#SBATCH -e testshort-%j.err

echo "starting at `date` on `hostname`"

# Print the SLURM job ID.
echo "SLURM_JOBID=$SLURM_JOBID"

echo "hello world `hostname`"

echo "ended at `date` on `hostname`"
exit 0
Submit/Run your job:
sbatch example.sh
View your job:
scontrol show job 9
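You can also list all of your own pending and running jobs (a quick way to find job IDs) with:
squeue -u $USER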
Software
The cluster is managed with OpenHPC, which uses yum to install available software from the installed repositories.
- To view all options of yum, type:
man yum
- To view installed repositories, type:
yum repolist
- To view if your requested software package is in one of the installed repositories, use:
yum search <package>
- e.g. to search whether variations of tau are available, you would type:
yum search tau
Installed Software
Package and Version     | Location                                  | Module available | Notes
cplex studio 128        | /opt/ohpc/pub/ibm/ILOG/CPLEX_Studio128/   | cplex/12.8       |
cuda toolkit 9.0        | /opt/ohpc/pub/cuda-9.0                    |                  | cudnn 9.0 in targets/x86_64-linux/lib/
cuda toolkit 9.1        | /opt/ohpc/pub/cuda-9.1                    |                  | cudnn 9.1 in targets/x86_64-linux/lib/
cuda toolkit 9.2        | /opt/ohpc/pub/cuda-9.2                    |                  | cudnn 9.2 in targets/x86_64-linux/lib/
cuda toolkit 10.0       | /opt/ohpc/pub/cuda-10.0                   |                  | cudnn 7.4.1 for cuda10 in targets/x86_64-linux/lib/
gcc 7.2.0               | /opt/ohpc/pub/compiler/gcc/7.2.0/bin/gcc  | gnu7/7.2.0       |
gcc 4.8.5 (default)     | /usr/bin/gcc                              |                  |
gdal 2.2.3              | /opt/ohpc/pub/gdal2.2.3                   | gdal/2.2.3       |
java openjdk 1.8.0      | /usr/bin/java                             |                  |
Python 2.7.5 (default)  | /usr/bin/python                           |                  | The system-wide installation of packages is no longer supported. See below for Anaconda/miniconda install information.
R 3.5.1                 | /usr/bin/R                                |                  | The system-wide installation of packages is no longer supported.
Subversion (svn) 1.7    | /usr/bin/svn                              |                  |
- It is usually possible to install software in your home directory.
- List installed software via rpm: rpm -qa. Use grep to search for specific software: rpm -qa | grep <sw_name> (e.g. rpm -qa | grep perl).
Modules
Since this cluster is managed with OpenHPC, the Lmod Module System is implemented. You can see detailed information and instructions at the linked page.
Example:
To be sure you are using the environment setup for cplex, you would type:
$ module avail
$ module load cplex
When done, either logout and log back in, or type: module unload cplex
You can also create your own modules and place them in your $HOME. For instructions, see the Modules (Lmod) page.
Once created, type: module use $HOME/path/to/personal/modulefiles
This will prepend the path to $MODULEPATH. Type echo $MODULEPATH to confirm.
Build software from source into your home directory ($HOME)
* Download and extract your source.
* cd to your extracted source directory.
* Run ./configure --prefix=$HOME/appdir
  [You need to refer to your source documentation to get the full list of options you can provide 'configure' with.]
* make
* make install
  The binary would then be located in ~/appdir/bin.
* Add the following to your $HOME/.bashrc: export PATH="$HOME/appdir/bin:$PATH"
* Reload the .bashrc file with source ~/.bashrc (or logout and log back in).
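As a concrete sketch of the same steps for a hypothetical source tarball myapp-1.0.tar.gz (the package name and prefix directory are illustrative; check your package's own documentation for the configure options it accepts):
tar -xzf myapp-1.0.tar.gz
cd myapp-1.0
./configure --prefix=$HOME/appdir
make
make install
# add the install location to your PATH (or put this line in ~/.bashrc)
export PATH="$HOME/appdir/bin:$PATH"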
How to Install R packages in your home directory
Reference: http://cran.r-project.org/doc/manuals/R-admin.html#Managing-libraries
************************************************************************************
NOTE: Steps 1) through 4) need to be done only once. After your Rlibs directory
has been created and your R_LIBS environment variable is set, you can install
additional packages using step 5).
************************************************************************************
Know your R library search path:
Start R and run .libPaths(). Sample output is shown below:
> .libPaths()
[1] "/usr/lib64/R/library"
Now we will create a local Rlibs directory and add this to the library search path.
NOTE: Make sure R is NOT running before you proceed.
1) Create a directory in your home directory where you would like to install the R packages, e.g. Rlibs
mkdir ~/Rlibs
2) Create a .profile file in your home directory (or modify the existing one) using your favorite editor (emacs, vim, nano, etc.)
Add the following to your .profile
#!/bin/sh
if [ -n "$R_LIBS" ]; then
export R_LIBS=~/Rlibs:$R_LIBS
else
export R_LIBS=~/Rlibs
fi
3) To set the R_LIBS path in your current session, run: source ~/.profile (or logout and log back in)
4) Confirm the change is in your library path:
start R
> .libPaths()
[1] "$HOME/Rlibs"
[2] "/usr/lib64/R/library"
5) Install the package in your local directory
>install.packages("packagename","~/Rlibs","https://cran.r-project.org/")
e.g. to install the package snow:
>install.packages("snow","~/Rlibs","https://cran.r-project.org/")
6) For more help with install.packages() use
>?install.packages()
7) To see which libraries are available in your R library path, run library()
The output will show your local packages and the system-wide packages.
>library()
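If you prefer to install non-interactively (for example from a batch job), the same install as in step 5) can be done from the shell with Rscript; a sketch, assuming R_LIBS is set as in step 2):
Rscript -e 'install.packages("snow", lib="~/Rlibs", repos="https://cran.r-project.org/")'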
How to Install Python Anaconda (miniconda) in your home directory
- Anaconda can be used to maintain custom environments for R (as well as other software).
- Reference to help decide if miniconda is enough: https://conda.io/docs/user-guide/install/download.html
- NOTE: Consider starting with miniconda if you do not need a multitude of packages; it is smaller and faster to install and update.
- Reference for Anaconda R Essentials: https://conda.io/docs/user-guide/tasks/use-r-with-conda.html
- Reference for linux install: https://conda.io/docs/user-guide/install/linux.html
- Please work through the tutorials to help you manage conda packages:
https://conda.io/docs/user-guide/tutorials/index.html
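A minimal sketch of installing Miniconda into your home directory and creating an R environment with it (the installer filename follows the naming on the download page linked above; the environment name my_r_env and package choices are only examples):
# download and run the Miniconda3 installer into $HOME/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p $HOME/miniconda3
# activate the base environment for this shell session
source $HOME/miniconda3/bin/activate
# create and activate an environment containing R (see the R-with-conda reference above)
conda create -n my_r_env r-essentials r-base
conda activate my_r_env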
MPI
- To use MPI over InfiniBand, use openmpi3 (a sample batch script is sketched below).
- The mpich default transport is TCP/IP (not InfiniBand).
- NOTE: mvapich2/2.2 will NOT work on this cluster.
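A minimal sketch of a batch script that builds and runs an MPI program over InfiniBand with openmpi3 (module names are assumed from the OpenHPC stack described above, and hello_mpi.c is a hypothetical source file):
#!/bin/bash
#SBATCH -J mpi_test
#SBATCH -p short
#SBATCH -N 2
#SBATCH --ntasks-per-node=24
#SBATCH --time=00:10:00
# load the GNU 7 compiler and Open MPI 3 modules
module load gnu7 openmpi3
# compile the (hypothetical) MPI source and launch it across the Slurm allocation
mpicc hello_mpi.c -o hello_mpi
mpirun ./hello_mpi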