Difference between revisions of "Hopper Cluster"

From CAC Documentation wiki

Latest revision as of 11:35, 30 August 2022

This is a private cluster.

Hardware

  • Head node: hopper.cac.cornell.edu.
  • Access modes: ssh
  • OpenHPC 2.3 with Rocky Linux 8.4
  • 22 compute nodes (c0001-c0022) with dual 20-core Intel Xeon Gold 5218R CPUs @ 2.1 GHz, 192 GB of RAM
  • Hyperthreading is enabled on all nodes, i.e., each physical core is considered to consist of two logical CPUs
  • Interconnect: InfiniBand EDR from each node to an HDR switch; each HDR switch port is split into two EDR ports
  • Submit help requests at help or by sending an email to CAC support. Please include "Hopper" in the subject line.
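
As a quick check on the numbers above, the following sketch works out how many physical cores and logical CPUs the cluster exposes. The variable names are illustrative; the figures come from the hardware list.

```shell
#!/bin/bash
# Figures taken from the hardware list above.
NODES=22
SOCKETS_PER_NODE=2      # dual-socket nodes
CORES_PER_SOCKET=20     # 20-core Xeon Gold 5218R
THREADS_PER_CORE=2      # hyperthreading is enabled

PHYSICAL_PER_NODE=$((SOCKETS_PER_NODE * CORES_PER_SOCKET))
LOGICAL_PER_NODE=$((PHYSICAL_PER_NODE * THREADS_PER_CORE))
PHYSICAL_TOTAL=$((NODES * PHYSICAL_PER_NODE))
LOGICAL_TOTAL=$((NODES * LOGICAL_PER_NODE))

echo "Per node: $PHYSICAL_PER_NODE physical cores, $LOGICAL_PER_NODE logical CPUs"
echo "Cluster:  $PHYSICAL_TOTAL physical cores, $LOGICAL_TOTAL logical CPUs"
```

Because Slurm counts logical CPUs, a request sized in "CPUs" likely needs to be doubled relative to the number of physical cores you intend to occupy.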

File Systems

Home Directories

  • Path: ~

User home directories are located on an NFS export from the head node. Use your home directory (~) for archiving the data you wish to keep. Data in users' home directories are NOT backed up.

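Access via Globus

  • Log into the Globus Web GUI (https://globus.org) using your Cornell NetID
  • Globus Collection: Hopper Cluster Hosted at CAC (https://app.globus.org/file-manager?origin_id=34efa512-0991-11ec-ae2a-6b098c7117de)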

Scheduler/Queues

  • The cluster scheduler is Slurm. All nodes are in the "normal" partition, which has no time limit. See the Slurm documentation page for details.
  • Remember that hyperthreading is enabled on the cluster, so Slurm considers each physical core to consist of two logical CPUs.
  • Partitions (queues):

Name    Description  Time Limit
normal  all nodes    no limit
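
A minimal sketch of a batch script for the "normal" partition might look like the one written out below. The job name, node count, output pattern, and the program name my_mpi_program are hypothetical, and requesting 40 tasks per node with --hint=nomultithread (one task per physical core) is an assumption about how you want to use the hyperthreaded nodes, not a site requirement.

```shell
#!/bin/bash
# Write an example batch script to the current directory.
cat > hopper_example.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=hopper_example
#SBATCH --partition=normal
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=40
#SBATCH --hint=nomultithread
#SBATCH --output=%x-%j.out

# my_mpi_program is a placeholder for your own MPI executable.
echo "Running on: $SLURM_JOB_NODELIST"
mpirun ./my_mpi_program
EOF

# Syntax-check the script locally; submit it on the head node with:
#   sbatch hopper_example.sh
bash -n hopper_example.sh && echo "hopper_example.sh written"
```

Since the partition has no time limit, no --time option is set here; adding one can still help the scheduler and protects against runaway jobs.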

Software

Work with Environment Modules

Set up the working environment for each package using the module command. The module command will activate dependent modules if there are any.

To show the currently loaded modules (these are loaded by the default system configuration):

-bash-4.2$ module list

Currently Loaded Modules:
  1) autotools   3) gnu9/9.3.0   5) libfabric/1.12.1   7) ohpc
  2) prun/2.1    4) ucx/1.9.0    6) openmpi4/4.0.5

To show all available modules (as of August 5, 2021):

-bash-4.2$ module avail

-------------------- /opt/ohpc/pub/moduledeps/gnu9-openmpi4 --------------------
   adios/1.13.1        netcdf-fortran/4.5.2    py3-mpi4py/3.0.3
   boost/1.75.0        netcdf/4.7.3            py3-scipy/1.5.1
   fftw/3.3.8          opencoarrays/2.9.2      scalapack/2.1.0
   hypre/2.18.1        petsc/3.14.4            slepc/3.14.2
   mfem/4.2            phdf5/1.10.6            superlu_dist/6.1.1
   mumps/5.2.1         pnetcdf/1.12.1          trilinos/13.0.0
   netcdf-cxx/4.3.1    ptscotch/6.0.6

------------------------ /opt/ohpc/pub/moduledeps/gnu9 -------------------------
   R/4.1.0        impi/2021.3.0          mvapich2/2.3.4          superlu/5.2.1
   gdal/3.3.1     impi/2021.3.1   (D)    openblas/0.3.7
   gsl/2.6        metis/5.1.0            openmpi4/4.0.5   (L)
   hdf5/1.10.6    mpich/3.3.2-ofi        py3-numpy/1.19.0

-------------------------- /opt/ohpc/pub/modulefiles ---------------------------
   GMAT/R2020a                julia/1.6.2             proj/8.1.0
   autotools           (L)    libfabric/1.12.1 (L)    prun/2.1        (L)
   cmake/3.19.4               octave/6.3.0            ucx/1.9.0       (L)
   gnu9/9.3.0          (L)    ohpc             (L)    valgrind/3.16.1
   intel/2021.3.0.3350        os                      visit/3.2.1

  Where:
   D:  Default Module
   L:  Module is loaded

Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".

To load a module and verify:

-bash-4.2$ module load R/4.1.0 
-bash-4.2$ module list

Currently Loaded Modules:
  1) autotools    4) ucx/1.9.0          7) ohpc
  2) prun/2.1     5) libfabric/1.12.1   8) openblas/0.3.7
  3) gnu9/9.3.0   6) openmpi4/4.0.5     9) R/4.1.0

To unload a module and verify:

-bash-4.2$ module unload R
-bash-4.2$ module list

Currently Loaded Modules:
  1) autotools   3) gnu9/9.3.0   5) libfabric/1.12.1   7) ohpc
  2) prun/2.1    4) ucx/1.9.0    6) openmpi4/4.0.5
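
The same load, verify, and unload steps can be scripted, for example at the top of a batch job. This is a sketch only: the guard around the module command is an addition so the snippet also runs (and reports) on machines without environment modules, and MODULE_STATUS is an illustrative variable name.

```shell
#!/bin/bash
# Load R, confirm it appears in the module list, then unload it again.
if type module >/dev/null 2>&1; then
    module load R/4.1.0
    # "module list" may print to stderr, so merge the streams before searching.
    if module list 2>&1 | grep -q 'R/4.1.0'; then
        MODULE_STATUS="R/4.1.0 loaded"
    else
        MODULE_STATUS="R/4.1.0 not loaded"
    fi
    module unload R
else
    MODULE_STATUS="module command not available"
fi
echo "$MODULE_STATUS"
```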


Install R Packages in Home Directory

If you need a new R package not installed on the system, you can install R packages in your home directory using these instructions.
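
One common way to do this, shown here as a sketch rather than as the site's official procedure, is to point R at a personal library directory through the R_LIBS_USER environment variable. The directory $HOME/R/library and the package name jsonlite are illustrative choices.

```shell
#!/bin/bash
# Assumption: keep personal R packages under $HOME/R/library (illustrative path).
export R_LIBS_USER="${R_LIBS_USER:-$HOME/R/library}"
mkdir -p "$R_LIBS_USER"

if command -v Rscript >/dev/null 2>&1; then
    # Print the library search path; the personal library should appear in it.
    Rscript -e 'print(.libPaths())'
    # To install a package into it (needs network access), something like:
    #   Rscript -e 'install.packages("jsonlite", lib = Sys.getenv("R_LIBS_USER"), repos = "https://cloud.r-project.org")'
else
    echo "Rscript not found; load R first, e.g. module load R/4.1.0"
fi
```

Setting R_LIBS_USER in your shell startup file would make the personal library visible in later sessions and batch jobs as well.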

Manage Modules in Your Python Virtual Environment

Python 3 (version 3.6) is installed as python3. Users can manage their own Python environments (including installing needed modules) using virtual environments. Please see the documentation on virtual environments on python.org for details.

Create Virtual Environment

You can create as many virtual environments as you need, each in its own directory.

  • python3: python3 -m venv <your virtual environment directory>

Activate Virtual Environment

You need to activate a virtual environment before using it:

source <your virtual environment directory>/bin/activate
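
Put together, creating, activating, and leaving an environment could look like the sketch below. The directory name demo-env and the VENV_STATUS variable are illustrative, and the guard simply reports when the venv module is not usable on the machine where the snippet runs.

```shell
#!/bin/bash
# Illustrative location; pick any directory you own.
VENV_DIR="${VENV_DIR:-./demo-env}"

if python3 -m venv "$VENV_DIR" 2>/dev/null; then
    # "." is the portable spelling of "source".
    . "$VENV_DIR/bin/activate"
    echo "Active environment: $VIRTUAL_ENV"
    # Inside the environment, "python" resolves to the environment's interpreter.
    python -c 'import sys; print(sys.prefix)'
    deactivate
    VENV_STATUS=created
else
    echo "Could not create a virtual environment in $VENV_DIR"
    VENV_STATUS=unavailable
fi
```

The deactivate command is defined by the activate script and restores the previous PATH.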

Install Python Modules Using pip

After activating your virtual environment, you can install Python modules into it:

  • It's always a good idea to update pip first:
pip install --upgrade pip
  • Install the module:
pip install <module name>
  • List installed Python modules in the environment:
pip list
  • Example: install TensorFlow and Keras like this:
-bash-4.2$ python3 -m venv tensorflow
-bash-4.2$ source tensorflow/bin/activate
(tensorflow) -bash-4.2$ pip install --upgrade pip
Collecting pip
  Using cached https://files.pythonhosted.org/packages/30/db/9e38760b32e3e7f40cce46dd5fb107b8c73840df38f0046d8e6514e675a1/pip-19.2.3-py2.py3-none-any.whl
Installing collected packages: pip
  Found existing installation: pip 18.1
    Uninstalling pip-18.1:
      Successfully uninstalled pip-18.1
Successfully installed pip-19.2.3
(tensorflow) -bash-4.2$ pip install tensorflow keras
Collecting tensorflow
  Using cached https://files.pythonhosted.org/packages/de/f0/96fb2e0412ae9692dbf400e5b04432885f677ad6241c088ccc5fe7724d69/tensorflow-1.14.0-cp36-cp36m-manylinux1_x86_64.whl
:
:
:
Successfully installed absl-py-0.8.0 astor-0.8.0 gast-0.2.2 google-pasta-0.1.7 grpcio-1.23.0 h5py-2.9.0 keras-2.2.5 keras-applications-1.0.8  [...]
(tensorflow) -bash-4.2$ pip list
Package              Version
-------------------- -------
absl-py              0.8.0  
astor                0.8.0  
gast                 0.2.2  
google-pasta         0.1.7  
grpcio               1.23.0 
h5py                 2.9.0  
Keras                2.2.5  
Keras-Applications   1.0.8  
Keras-Preprocessing  1.1.0  
Markdown             3.1.1  
numpy                1.17.1 
pip                  19.2.3 
protobuf             3.9.1  
PyYAML               5.1.2  
scipy                1.3.1  
setuptools           40.6.2 
six                  1.12.0 
tensorboard          1.14.0 
tensorflow           1.14.0 
tensorflow-estimator 1.14.0 
termcolor            1.1.0  
Werkzeug             0.15.5 
wheel                0.33.6 
wrapt                1.11.2 

Run MPI-Enabled Python in a Singularity Container

The following Dockerfile should create an Ubuntu image that is able to run Python applications parallelized with mpi4py. Note that simply doing apt-get install -y openmpi in Ubuntu 18.04 will not generally install an Open MPI version that is recent enough to be compatible with the host's Open MPI version.

### start with ubuntu base image
FROM ubuntu:18.04

### install basics, python3, and modules needed for application
RUN apt-get update && apt-get upgrade -y && apt-get install -y build-essential wget zlib1g-dev libjpeg-dev python3-pip openssh-server
RUN pip3 install Pillow numpy pandas matplotlib cython

### install Open MPI version 4.0.5, consistent with Hopper & TheCube
RUN wget 'https://www.open-mpi.org/software/ompi/v4.0/downloads/openmpi-4.0.5.tar.gz' -O openmpi-4.0.5.tar.gz
RUN tar -xzf openmpi-4.0.5.tar.gz && cd openmpi-4.0.5 && ./configure --prefix=/usr/local && make all install
RUN ldconfig

### install mpi4py now that openmpi is installed
RUN pip3 install mpi4py

### add all code from current directory into “code” directory within container, and set as working directory
ADD .  /code
WORKDIR /code
ENV PATH "/code:$PATH"

### compile cython for this particular application
RUN python3 setup.py build_ext --inplace

### set python file as executable so it can be run by docker/singularity
RUN chmod +rx /code/run_reservoir_sim.py

### change username from root
RUN useradd -u 8877 <my_username>
USER <my_username>

The resulting image can then be run in a Singularity container by putting commands like these into your Slurm batch file:

module load singularity
mpirun singularity run cython_reservoir_0.1.sif run_reservoir_sim.py
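
For context, a complete batch file wrapping those two commands might look like the sketch below. The job name, node count, and tasks per node are assumptions chosen for illustration; only the module load and mpirun lines come from the text above.

```shell
#!/bin/bash
# Write an example batch script for the containerized MPI run.
cat > reservoir_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=reservoir_sim
#SBATCH --partition=normal
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=40

# The host's Open MPI (loaded by default) launches one container per task.
module load singularity
mpirun singularity run cython_reservoir_0.1.sif run_reservoir_sim.py
EOF

# Syntax-check locally; submit on the head node with:
#   sbatch reservoir_job.sh
bash -n reservoir_job.sh && echo "reservoir_job.sh written"
```

The .sif file is expected to be in the directory from which the job is submitted; adjust the path if you keep images elsewhere.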

Software List

Software           Path                                           Notes
Intel oneAPI       /opt/intel/oneapi/                             module swap gnu9 intel; module swap openmpi4 impi
                                                                  includes the HPC Toolkit with Intel MPI and Intel classic compilers (icc, ifort)
GCC 9.3            /opt/ohpc/pub/compiler/gcc/9.3.0/              module load gnu9/9.3.0 (Loaded by default)
Open MPI 4.0.5     /opt/ohpc/pub/mpi/openmpi4-gnu9/4.0.5          module load openmpi4/4.0.5 (Loaded by default)
Boost 1.75.0       /opt/ohpc/pub/libs/gnu9/openmpi4/boost/1.75.0  module load boost/1.75.0
CMake 3.19.4       /opt/ohpc/pub/utils/cmake/3.19.4               module load cmake/3.19.4
HDF5 1.10.6        /opt/ohpc/pub/libs/gnu9/hdf5/1.10.6            module load hdf5/1.10.6
Octave 6.3.0       /opt/ohpc/pub/apps/octave/6.3.0                module load octave/6.3.0
NetCDF 4.7.3       /opt/ohpc/pub/libs/gnu9/openmpi3/netcdf/4.7.3  module load netcdf/4.7.3
FFTW 3.3.8         /opt/ohpc/pub/libs/gnu9/openmpi3/fftw/3.3.8    module load fftw/3.3.8
Valgrind 3.16.1    /opt/ohpc/pub/utils/valgrind/3.16.1            module load valgrind/3.16.1
VisIt 3.2.1        /opt/ohpc/pub/apps/visit/3.2.1                 module load visit/3.2.1
R 4.1.0            /opt/ohpc/pub/libs/gnu9/R/4.1.0                module load R/4.1.0
OpenBLAS 0.3.7     /opt/ohpc/pub/libs/gnu9/openblas/0.3.7         module load openblas/0.3.7
Julia 1.6.2        /opt/ohpc/pub/compiler/julia/1.6.2             module load julia/1.6.2
GMAT R2020a        /opt/ohpc/pub/apps/GMAT/R2020a                 module load GMAT/R2020a
Matlab R2021a      /opt/ohpc/pub/apps/matlab/R2021a               module load matlab/R2021a
gurobi 9.5.1       /opt/ohpc/pub/apps/gurobi/9.5.1                module load gurobi/9.5.1
cdo 2.0.0          /opt/ohpc/pub/apps/cdo/2.0.0                   module load cdo/2.0.0
singularity 3.7.1  /opt/ohpc/pub/libs/singularity/3.7.1           module load singularity/3.7.1
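
As an illustration of the Intel oneAPI entry, the sketch below swaps the default GNU/Open MPI toolchain for the Intel one and then checks whether the classic compilers are on the PATH. The guard and the TOOLCHAIN variable are additions for illustration, so the snippet also runs on machines without environment modules.

```shell
#!/bin/bash
# Swap the default GNU + Open MPI toolchain for Intel compilers + Intel MPI.
if type module >/dev/null 2>&1; then
    module swap gnu9 intel
    module swap openmpi4 impi
    module list
    TOOLCHAIN=intel
else
    echo "module command not available; toolchain left unchanged"
    TOOLCHAIN=unchanged
fi

# Report whether the Intel classic compilers can be found.
for compiler in icc ifort; do
    if command -v "$compiler" >/dev/null 2>&1; then
        echo "$compiler found at $(command -v "$compiler")"
    else
        echo "$compiler not on PATH"
    fi
done
```

Libraries built against gnu9 and openmpi4 will generally not be usable after the swap, so load any library modules you need only after switching toolchains.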

Help

  • Submit questions or requests at help or by sending an email to help@cac.cornell.edu. Please include "Hopper" in the subject line.