Difference between revisions of "WALLE Cluster"
Latest revision as of 13:57, 4 November 2021
General Information
- Walle is a private virtual cluster in Red Cloud. Access is restricted to the following group: ek436_0001. See the Virtual Cluster in Red Cloud page for more usage information.
- Head node: walle.cac.cornell.edu (access via ssh)
  - OpenHPC deployment running CentOS 8
  - Cluster scheduler: Slurm (see the CAC Slurm documentation for more info)
- Compute nodes: created on demand via Slurm
- Data on the Walle cluster storage is NOT backed up.
- Please send questions and problem reports to: cac-help@cornell.edu
Compute Nodes
Slurm will create the required Red Cloud instance to run the jobs submitted to the partition. If no partition is specified in the sbatch or srun command, the c2 partition is the default.
Slurm Partition | Red Cloud Instance Flavor | CPU | RAM (GB) | GPU |
---|---|---|---|---|
c2 | c2.m16 | 2 | 16 | None |
c20 | c20.m160 | 20 | 160 | None |
c28 | c28.m224 | 28 | 224 | None |
t1 | c4.t1.m20 | 4 | 20 | Nvidia T4 |
g1 | c14.g1.m60 | 14 | 60 | Nvidia V100 |
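As a rough sketch of how a partition from the table might be selected, the batch script below requests c20 rather than the default c2. The job name, time limit, and output file name are hypothetical, and the script assumes a single task is enough; adjust these for the real workload.

```shell
#!/bin/bash
#SBATCH --job-name=partition-demo        # hypothetical job name
#SBATCH --partition=c20                  # request c20 instead of the default c2
#SBATCH --nodes=1
#SBATCH --ntasks=1                       # assumption: one task is sufficient
#SBATCH --time=00:10:00                  # hypothetical time limit
#SBATCH --output=partition-demo-%j.out   # %j expands to the Slurm job ID

# Slurm sets SLURM_JOB_PARTITION inside a job; fall back to a note otherwise.
partition="${SLURM_JOB_PARTITION:-not running under Slurm}"
echo "Running on $(uname -n) in partition: ${partition}"
```

Saved under a name such as partition-demo.sh (hypothetical) and submitted with sbatch, this should cause Slurm to start an instance of the matching flavor for the job. The partition can also be given on the command line, for example sbatch -p c20 or srun -p c20.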
Software
Work with Environment Modules
Set up the working environment for each package using the module command. The module command will activate dependent modules if there are any.
To show currently loaded modules (these are loaded by the default system configuration):
-bash-4.2$ module list

Currently Loaded Modules:
  1) autotools   3) gnu9/9.3.0   5) libfabric/1.10.1   7) ohpc
  2) prun/2.0    4) ucx/1.8.0    6) openmpi4/4.0.4
To show all available modules:
-bash-4.2$ module avail

------------------- /opt/ohpc/pub/moduledeps/gnu9-openmpi4 --------------------
   adios/1.13.1            netcdf-fortran/4.5.2           py3-mpi4py/3.0.3
   boost/1.75.0            netcdf/4.7.3                   py3-scipy/1.5.1
   fftw/3.3.8       (L)    opencoarrays/2.9.2             scalapack/2.1.0
   hypre/2.18.1            petsc/3.14.4                   slepc/3.14.2
   jdftx/1.6.0      (L)    petsc/3.15.0            (D)    slepc/3.15.0       (D)
   mfem/4.2                phdf5/1.10.6                   superlu_dist/6.1.1
   mumps/5.2.1             pnetcdf/1.12.1                 trilinos/13.0.0
   netcdf-cxx/4.3.1        ptscotch/6.0.6

------------------------ /opt/ohpc/pub/moduledeps/gnu9 ------------------------
   gsl/2.6          (L)    mpich/3.3.2-ofi           py3-numpy/1.19.0
   hdf5/1.10.6      (L)    mvapich2/2.3.4            superlu/5.2.1
   impi/2021.2.0           openblas/0.3.7     (L)
   metis/5.1.0             openmpi4/4.0.5     (L)

-------------------------- /opt/ohpc/pub/modulefiles --------------------------
   autotools          (L)    julia/1.6.1                os
   cmake/3.19.4              libfabric/1.11.2    (L)    prun/2.1     (L)
   cuda/11.3                 mkl/2021.2.0.610    (L)    py3-libs
   gnu9/9.3.0         (L)    octave/6.3.0        (L)    ucx/1.9.0    (L)
   intel/2021.2.0.610        ohpc                (L)

  Where:
   D:  Default Module
   L:  Module is loaded

Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".
To load a module and verify:
-bash-4.2$ module load cmake
-bash-4.2$ module list

Currently Loaded Modules:
  1) autotools   3) gnu9/9.3.0   5) libfabric/1.10.1   7) ohpc
  2) prun/2.0    4) ucx/1.8.0    6) openmpi4/4.0.4     8) cmake/3.16.2
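When several versions of a package are installed, a plain module load picks the one marked (D). The sketch below shows one way to pin an explicit version instead. The choice of petsc/3.14.4 is only an illustration taken from the module avail listing, and the guard around the module command is an assumption added so the script degrades gracefully where the module system is not available.

```shell
#!/bin/bash
# Sketch: load an explicit module version rather than the (D) default.
wanted="petsc/3.14.4"   # illustrative non-default version from the listing

if type module >/dev/null 2>&1; then
    # The module command exists in this shell: try to load the pinned version.
    if module load "$wanted"; then
        status="loaded"
        module list     # show what is loaded now
    else
        status="failed"
    fi
else
    # Not on the cluster, or the module function is not defined in this shell.
    echo "module command not found; run this on the cluster" >&2
    status="skipped"
fi
echo "module ${wanted}: ${status}"
```

Pinning the version in job scripts keeps results reproducible if the default module changes later.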
Manage Modules in Your Python Virtual Environment
Python 3.6.8 is installed. Users can manage their own Python environments (including installing needed modules) using virtual environments. See the virtual environments documentation on python.org for details.
Create Virtual Environment
You can create as many virtual environments, each in their own directory, as needed.
python3 -m venv <your virtual environment directory>
Activate Virtual Environment
You need to activate a virtual environment before using it:
source <your virtual environment directory>/bin/activate
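To confirm that activation took effect, one option is to check where python resolves after sourcing the activate script. The following is a minimal sketch; the directory name demo-env and its temporary location are hypothetical.

```shell
#!/bin/bash
# Sketch: create a throwaway virtual environment and confirm it is active.
venv_dir="$(mktemp -d)/demo-env"     # hypothetical location for the demo

python3 -m venv "$venv_dir"          # create the environment
source "$venv_dir/bin/activate"      # activate it in this shell

# After activation, VIRTUAL_ENV is set and python is found inside the venv.
echo "VIRTUAL_ENV=${VIRTUAL_ENV}"
python_path="$(command -v python)"
echo "python resolves to: ${python_path}"

deactivate                           # restore the previous environment
```

If python resolves to a path outside the environment directory, the environment was not activated in the current shell.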
Install Python Modules Using pip
After activating your virtual environment, you can now install python modules for the activated environment:
- It's always a good idea to update pip first:
pip install --upgrade pip
- Install the module:
pip install <module name>
- List installed python modules in the environment:
pip list modules
- Example: install tensorflow and keras like this:
-bash-4.2$ python3 -m venv tensorflow
-bash-4.2$ source tensorflow/bin/activate
(tensorflow) -bash-4.2$ pip install --upgrade pip
Collecting pip
  Using cached https://files.pythonhosted.org/packages/30/db/9e38760b32e3e7f40cce46dd5fb107b8c73840df38f0046d8e6514e675a1/pip-19.2.3-py2.py3-none-any.whl
Installing collected packages: pip
  Found existing installation: pip 18.1
    Uninstalling pip-18.1:
      Successfully uninstalled pip-18.1
Successfully installed pip-19.2.3
(tensorflow) -bash-4.2$ pip install tensorflow keras
Collecting tensorflow
  Using cached https://files.pythonhosted.org/packages/de/f0/96fb2e0412ae9692dbf400e5b04432885f677ad6241c088ccc5fe7724d69/tensorflow-1.14.0-cp36-cp36m-manylinux1_x86_64.whl
:
:
:
Successfully installed absl-py-0.8.0 astor-0.8.0 gast-0.2.2 google-pasta-0.1.7 grpcio-1.23.0 h5py-2.9.0 keras-2.2.5 keras-applications-1.0.8 keras-preprocessing-1.1.0 markdown-3.1.1 numpy-1.17.1 protobuf-3.9.1 pyyaml-5.1.2 scipy-1.3.1 six-1.12.0 tensorboard-1.14.0 tensorflow-1.14.0 tensorflow-estimator-1.14.0 termcolor-1.1.0 werkzeug-0.15.5 wheel-0.33.6 wrapt-1.11.2
(tensorflow) -bash-4.2$ pip list modules
Package              Version
-------------------- -------
absl-py              0.8.0
astor                0.8.0
gast                 0.2.2
google-pasta         0.1.7
grpcio               1.23.0
h5py                 2.9.0
Keras                2.2.5
Keras-Applications   1.0.8
Keras-Preprocessing  1.1.0
Markdown             3.1.1
numpy                1.17.1
pip                  19.2.3
protobuf             3.9.1
PyYAML               5.1.2
scipy                1.3.1
setuptools           40.6.2
six                  1.12.0
tensorboard          1.14.0
tensorflow           1.14.0
tensorflow-estimator 1.14.0
termcolor            1.1.0
Werkzeug             0.15.5
wheel                0.33.6
wrapt                1.11.2
Software List
Software | Path | Notes |
---|---|---|
*GNU Compilers 9.3.0 | /opt/ohpc/pub/compiler/gcc/9.3.0 | module load gnu9/9.3.0 |
Intel Compilers (2021 Update 2) | /opt/intel/oneapi/compiler/2021.2.0 | module load intel/2021.2.0.610 |
MKL 2021.2.0.610 (2021 Update 2) | /opt/intel/oneapi/mkl/2021.2.0 | module load mkl/2021.2.0.610 |
*openmpi 4.0.5 | /opt/ohpc/pub/mpi/openmpi4-gnu9, /opt/ohpc/pub/mpi/openmpi4-intel | module load openmpi4 |
Intel MPI 2021.2.0 | /opt/intel/oneapi/mpi/2021.2.0 | module load impi/2021.2.0 |
Julia 1.6.1 | /opt/ohpc/pub/compiler/julia/1.6.1 | module load julia/1.6.1 |
JDFTx 1.6.0 | /opt/ohpc/pub/apps/jdftx/1.6.0 | module load jdftx/1.6.0 |
CUDA 11.3 | /usr/local/cuda-11.3 | module load cuda/11.3 |
petsc / petsc4py 3.15.0 | /opt/ohpc/pub/libs/gnu9/openmpi4/petsc/3.15.0, /opt/ohpc/pub/libs/intel/impi/petsc/3.15.0 | module load petsc/3.15.0 |
slepc / slepc4py 3.15.0 | /opt/ohpc/pub/libs/gnu9/openmpi4/slepc/3.15.0, /opt/ohpc/pub/libs/intel/impi/slepc/3.15.0 | module load slepc/3.15.0 |
Python3 Modules (pytorch with CUDA support, numpy, scipy, matplotlib, pandas, tensorflow, keras, sklearn, umap) | /opt/ohpc/pub/utils/py3-libs/ | module load py3-libs |