HD Human Neuroscience Institute (HD-HNI) Computing

From CAC Documentation wiki
Revision as of 12:39, 5 March 2018 by Rda1 (talk | contribs)

Welcome to the User Documentation for the HD Human Neuroscience Institute Computing Environment

IMPORTANT UPDATE - March 5, 2018

All data has been transitioned to new storage. See instructions below for remote access.


The HD-HNI computing environment consists of:

  1 Head Node - hd-hni.cac.cornell.edu - Dell R620 server 
  1 Batch Node - compute-1-1.hd-hni - Dell R820 - 32 cores and 256GB RAM
  1 Interactive Node - hd-hni-interactive-1-1.cac.cornell.edu - Dell R820 - 32 cores and 256GB RAM
    alias: inter1.cac.cornell.edu  (easier to type than the full name!)
  1 Cluster file server - hd-hni-fs.cac.cornell.edu - serving 22.5TB of storage capacity
  1 Force10 switch - connects everything together with 10Gb Ethernet

Installed Software

Package and Version | Location | Head Node (hd-hni.cac.cornell.edu) | Compute Node | Interactive Node | Notes
Python 2.6.6 | /usr/bin/python | yes | yes | yes |
Python 2.7.6 | /usr/local/bin/python (points to /opt/python) | yes | yes | yes | to load env variables, type: module load python-2.7.6
PyMVPA2 | /usr/bin | yes | yes | yes |
Matlab R2012b | /opt/MATLAB/R2012b | yes | no | no |
Matlab R2012b Runtimes | /opt/MATLAB/MATLAB_Compiler_Runtime/v80 | yes | yes | yes |
Matlab R2012b Toolbox: plsgui | /opt/MATLAB/R2012b/toolbox | yes | no | no |
Matlab R2012b Toolbox: spm5, spm8, spm12 | /opt/MATLAB/R2012b/toolbox | yes | yes | yes |
Matlab R2013a (default) | /opt/MATLAB/R2013a | yes | no | yes |
Matlab R2013a Runtimes | /opt/MATLAB/MATLAB_Compiler_Runtime/v81 | yes | yes | yes |
Matlab R2013a Toolbox: gift | /opt/matlab/R2013a/toolbox/GroupICATv3.0a/icatb | yes | no | yes |
Matlab R2013a Toolbox: mvpa | /opt/MATLAB/R2013a/toolbox | yes | no | yes | Princeton
Matlab R2013a Toolbox: plsgui | /opt/MATLAB/R2013a/toolbox | yes | no | yes |
Matlab R2013a Toolbox: spm5, spm8, spm12 | /opt/MATLAB/R2013a/toolbox | yes | yes | yes |
Matlab R2013a Toolbox: vbm8 | /opt/MATLAB/R2013a/toolbox/spm8/toolbox | yes | no | yes |
Matlab R2013a Toolbox: xjview | /opt/MATLAB/R2013a/toolbox/spm8/toolbox | yes | no | yes |
Matlab R2014a | /opt/MATLAB/R2014a | no | no | yes |
Matlab R2014a Runtimes | /opt/MATLAB/MATLAB_Compiler_Runtime/v83 | yes | yes | yes |
Matlab R2014a Toolbox: spm5, spm8, spm12 | /opt/MATLAB/R2013a/toolbox | yes | yes | yes |
Matlab R2016a | /opt/MATLAB/R2016a | yes | no | no |
Matlab R2016a Runtimes | /opt/MATLAB/MATLAB_Compiler_Runtime/v901 | yes | yes | yes |
Matlab R2016a Toolbox: spm5, spm8, spm12 | /opt/MATLAB/R2013a/toolbox | yes | yes | yes |
fsl | /opt/fsl | yes | yes | yes |
freesurfer | /opt/freesurfer | yes | yes | yes | setup environment, type: source $FREESURFER_HOME/SetUpFreeSurfer.sh
afni | /opt/abin | yes | yes | yes |
meica | /opt/meica | yes | yes | yes |
R 3.2.2 | /usr/bin/R | yes | yes | yes |
xnat | http://hd-hni-xnat.cac.cornell.edu/ | | | |
  • It is usually possible to install software in your home directory. You can also ask your helpdesk to have the CAC install software, pending approval from the PI.
  • You can list the installed packages with 'rpm -qa'. Pipe the output through grep to search for specific software: rpm -qa | grep sw_name [e.g. rpm -qa | grep perl ]

Who can access the HD-HNI computing environment?

  • Users should request access to the HD-HNI project by contacting Allison Hermann.
  • When added, users will receive a welcome letter with their username and initial CAC HD-HNI password.

What servers can users access and how?

HD-HNI users can directly access two nodes: the cluster head node and the interactive node.

First Login

The first thing to do is log in to the cluster head node and change your password.
This requires an ssh client.
How to Connect to Linux server (head node)?
If you do not have an ssh client, contact your local helpdesk.

Managing Your Password

After you change your password, you will be prompted for an ssh passphrase; leave it blank and simply press the Enter key.
You are now ready to begin using the HD-HNI computing environment.

HD-HNI File Server - Moving data from your workstation to the shared file server

A central file server, hd-hni-fs.cac.cornell.edu, serves all HD-HNI user home directories. Users cannot ssh to this server, but they can connect and access files in the other ways outlined below.

Note: by default, your home directory and its contents will be readable and executable by all other users of the HD-HNI systems. If this is not what you want, you can change the permissions of the home directory and its files and subdirectories via the standard Linux or Windows mechanisms. However, be aware that this may lead to conflicts for cross-platform applications, as Windows and Linux permissions are not 100% compatible.
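On the Linux side, the permission change described in the note can be sketched as follows. This is a minimal illustration only: the helper names make_private and show_mode are hypothetical, and the sketch changes the top-level directory only, not the files and subdirectories inside it.

```shell
# Remove all group and other access from a directory (not recursive).
make_private() {
    chmod go-rwx "$1"
}

# Print the permission string of a directory, e.g. drwx------
show_mode() {
    ls -ld "$1" | cut -c1-10
}

# On the head node or interactive node you might run:
#   make_private "$HOME"
#   show_mode "$HOME"
```

Because other users can no longer enter the directory after this change, data that is meant to be shared would have to live somewhere else.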

Attach your home directory to your local computer

You can "mount" your home directory from computers on the Cornell network or, if you are off campus, from computers connected to the Cornell VPN. Once your home directory is mounted, you can drag and drop files to it.

Windows Users - how to mount HD-HNI directories

NOTE: WE RECOMMEND THAT YOU DO NOT SAVE YOUR PASSWORD ON PUBLIC COMPUTERS; DOING SO MAY CAUSE CONNECTION TROUBLES.

  • Open My Computer
  • Click on Tools -> Map Network Drive
  • Drive H: (if you are already using this drive letter, use another letter)
  • Folder: \\hd-hni-fs.cac.cornell.edu\<userid>
  • or
  • Folder: \\hd-hni-fs.cac.cornell.edu\reyna-fmri
  • Then:
    • Select "Connect using a different user name:". This will allow you to enter the CAC domain and your HD-HNI userid at CAC, rather than those associated with your own machine.
    • User name: CTC_ITH\your_userid
    • Password: your CAC HD-HNI password
  • Troubleshooting: If you have already mapped the drive and subsequently have problems, disconnect the drive and remap it.

MacOS X Users - how to mount HD-HNI directories

NOTE: WE RECOMMEND THAT YOU DO NOT SAVE YOUR PASSWORD ON PUBLIC COMPUTERS; DOING SO MAY CAUSE CONNECTION TROUBLES.

  1. In the Finder, select Connect to Server... from the Go menu.

  2. Enter smb://hd-hni-fs.cac.cornell.edu/<username> in the Server Address field. You may need to use smb://<username>@hd-hni-fs.cac.cornell.edu/<username> instead.
  3. Enter your CAC user name and password to log in.

Linux Users - how to mount HD-HNI directories

You cannot mount the HD-HNI directories via NFS for security reasons. To mount your home directory as a CIFS drive, you need to be root, which often means using the sudo command. Then execute

 mount -t cifs //hd-hni-fs.cac.cornell.edu/<user name> /mnt/test -o user=<user name>,workgroup=ctc_ith,vers=3.0

where <user name> is your username and /mnt/test stands for a directory you have already created on your local filesystem to serve as the mount point. Enter the password for your CAC project when prompted. See man mount.cifs for the options available to the mount command.

If you see errors such as "missing codepage or helper program," then the CIFS mount helper (mount.cifs, packaged as cifs-utils on many distributions) is probably not installed on your local machine. If problems persist, send your initial command and the results of dmesg | tail to the CAC HelpDesk.
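The mount steps above can be collected into a small shell function. This is a sketch under stated assumptions, not a CAC-provided tool: the function name mount_hni_home and the RUNNER variable are hypothetical, and the mount options are simply the ones shown in the command above.

```shell
HNI_FS="hd-hni-fs.cac.cornell.edu"

# Mount an HD-HNI home directory over CIFS.
#   $1: CAC user name
#   $2: local mount point (created if it does not exist)
# RUNNER is a hypothetical hook: it defaults to sudo; set RUNNER=echo to
# print the mount command instead of running it.
mount_hni_home() {
    local user="$1" mnt="$2"
    mkdir -p "$mnt" || return 1
    ${RUNNER:-sudo} mount -t cifs "//$HNI_FS/$user" "$mnt" \
        -o "user=$user,workgroup=ctc_ith,vers=3.0"
}

# Example:
#   mount_hni_home username "$HOME/hni"
#   sudo umount "$HOME/hni"      # detach the share when finished
```

Previewing with RUNNER=echo first is a cheap way to check the share name and options before sudo asks for a password.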

File Transfers To The HD-HNI file server

Users can transfer data to the HD-HNI-FS file server without mounting the file systems locally.

File Transfers between File Server and your Linux or Mac workstation

Secure Copy

Secure copy is a standard tool to copy files to and from remote hosts.

This example copies a file named "localfile.dat" on your local workstation to a file named remoteinput.dat on the remote file server:
localhost$ scp localfile.dat username@hd-hni-fs.cac.cornell.edu:remoteinput.dat

This example copies the file named results.dat on the remote server to a file named localresults.dat on your local workstation.
localhost$ scp username@hd-hni-fs.cac.cornell.edu:results.dat localresults.dat 
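The two examples above copy single files. Whole directories can be copied with scp's -r option; the following is a minimal sketch in which the function names and the RUNNER variable are hypothetical conveniences, not part of scp or of the HD-HNI setup.

```shell
HNI_FS="hd-hni-fs.cac.cornell.edu"

# Copy a local directory tree to the file server.
#   $1: CAC user name  $2: local directory  $3: destination path on the server
# -r copies recursively; -p preserves modification times and modes.
# RUNNER is a hypothetical hook: set RUNNER=echo to preview the command.
hni_upload_dir() {
    ${RUNNER:-scp} -r -p "$2" "$1@$HNI_FS:$3"
}

# Copy a directory tree from the file server to the local workstation.
#   $1: CAC user name  $2: path on the server  $3: local destination
hni_download_dir() {
    ${RUNNER:-scp} -r -p "$1@$HNI_FS:$2" "$3"
}

# Example:
#   hni_upload_dir username ./study01 study01
#   hni_download_dir username results ./results
```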

Secure FTP

FTP is disabled for security reasons, but sftp's interface is nearly identical.

localhost$ sftp username@hd-hni-fs.cac.cornell.edu
    <enter your username's password when prompted>
sftp> put localresults.dat results.dat
sftp> quit

Samba Client

Type

smbclient //hd-hni-fs.cac.cornell.edu/<user name> -U <user name> -W ctc_ith -m SMB3

Enter the password for your HD-HNI account when prompted. You will then see the smb:\> prompt, from which you can transfer files between your local machine and your HD-HNI home directory much as you would with an ftp client. Type help for more instructions.

-sh-3.2$ smbclient //hd-hni-fs.cac.cornell.edu/<user name> -U <user name> -W ctc_ith -m SMB3
    Password: 
    Domain=[CTC_ITH] OS=[Unix] Server=[Samba 3.0.28-1.el5_2.1]
    smb: \> help

File Transfers between the File Server and your Windows Workstation

Secure Copy

The PuTTY developers provide a secure copy client called pscp.

 This example copies a local workstation file to the remote file server:
  cmd> pscp localfile.dat username@hd-hni-fs.cac.cornell.edu:remoteinput.dat
     <enter your username's password when prompted>

 This example copies a file on the remote file server to the local workstation:
  cmd> pscp username@hd-hni-fs.cac.cornell.edu:results.dat localresults.dat

Secure FTP

FTP is disabled for security reasons, but psftp's interface is nearly identical. From the command prompt, type:

cmd> psftp username@hd-hni-fs.cac.cornell.edu
    <enter your username's password when prompted>
psftp> put localresults.dat results.dat
psftp> quit

HD-HNI Data Processing Environment

There is one batch node (compute-1-1) and one interactive node (hd-hni-interactive-1-1) for processing HD-HNI data.

How to access the data processing nodes?

Submit jobs to the batch node from the head node or from the interactive node. You can ssh directly to the head node and to the interactive node; you can ssh to the batch node from the head node only.

Maui Scheduler and Job submission/monitoring commands

Jobs are scheduled by the Maui scheduler with the Torque resource manager. We suggest you use a job submission batch file utilizing PBS Directives ('Options' section).

Common Maui Commands

If you have any experience with PBS/Torque or SGE, the Maui commands may look familiar. The most commonly used are:

qsub - Job submission (jobid will be displayed for the job submitted)

  • $ qsub jobscript.sh

showq - Display queue information.

  • $ showq (dump everything)
  • $ showq -r (show running jobs)
  • $ showq -u foo42 (shows foo42's jobs)

checkjob - Display job information. (You can only checkjob your own jobs.)

  • $ checkjob -A jobid (get dense key-value pair information on the job)
  • $ checkjob -v jobid (get verbose information on the job)

canceljob - Cancel Job. (You can only cancel your own jobs.)

  • $ canceljob jobid
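The commands above all take the job id that qsub prints at submission. A small sketch of capturing it in a script follows. It assumes that qsub prints the id in the usual Torque form of a number followed by the server name (for example 1234.hd-hni.cac.cornell.edu); the helper names job_number and submit_and_report are hypothetical.

```shell
# Strip the server suffix from a Torque job identifier.
# Assumption: the identifier looks like "1234.hd-hni.cac.cornell.edu".
job_number() {
    printf '%s\n' "${1%%.*}"
}

# Submit a job script and print only its job number.
#   $1: path to the job script
submit_and_report() {
    local full_id
    full_id=$(qsub "$1") || return 1
    job_number "$full_id"
}

# Example (run on the head node or the interactive node):
#   jobnum=$(submit_and_report jobscript.sh)
#   checkjob "$jobnum"
#   canceljob "$jobnum"
```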


Batch Processing

Submit jobs to the batch node via the cluster head node or the interactive node. Jobs are scheduled as resources are available.

  • First ssh to the Head node or the Interactive node
ssh username@hd-hni.cac.cornell.edu
or
ssh username@inter1.cac.cornell.edu
  • Next, using your favorite editor, create a job file (a text file of commands) named test_default_queue.sh containing the following lines:
#!/bin/bash
#PBS -l walltime=00:05:00,nodes=1
#PBS -j oe
#PBS -N testdefaultqueue 
#PBS -q default

# Turn on echo of shell commands
set -x
# Because jobs start in the HOME directory, move to submit dir 
cd $PBS_O_WORKDIR
pwd
echo "PBS_O_WORKDIR is `pwd`"
echo "env is `env`"
# copy your binary that you want to run and any data files to a local directory on node job is executing on
# this example assumes you have a binary file named helloworld.sh in your local bin directory
cp $HOME/bin/helloworld.sh $TMPDIR
cd $TMPDIR
# run the binary file from the local disk on the node the job was placed on
./helloworld.sh >&hello.stdout
# Copy output files to your output folder       
cp -f $TMPDIR/hello.stdout $HOME/output
  • Submit the test job:
qsub test_default_queue.sh
  • Check the status of the job:
showq
  • Once the job runs, the output file will be created with the name you gave it (-N above) with ".o[jobid]" appended.
To view the output:
cat testdefaultqueue.o[jobid]
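The job file above assumes that $HOME/bin/helloworld.sh and the $HOME/output directory already exist. If you have nothing to test with yet, a stand-in can be created along these lines. This is only a sketch: the helper name make_helloworld and the script contents are hypothetical, and any executable that writes to standard output would do.

```shell
# Create a stand-in helloworld.sh for the test job.
#   $1: directory to hold the script (the job file expects $HOME/bin)
make_helloworld() {
    mkdir -p "$1" || return 1
    cat > "$1/helloworld.sh" <<'EOF'
#!/bin/sh
# Report which node the job ran on.
echo "Hello from $(uname -n)"
EOF
    chmod +x "$1/helloworld.sh"
}

# On the cluster, before running qsub:
#   make_helloworld "$HOME/bin"
#   mkdir -p "$HOME/output"
```

Creating $HOME/output in advance matters because the final cp in the job file would otherwise write a plain file named output.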

Matlab - How to?

The full MATLAB application is installed on the head node and the interactive node. You should use the batch node (compute-1-1) for production runs of compiled MATLAB code. Compiled code needs the MATLAB runtimes; these are installed on all nodes. For the full application:

  • MATLAB R2013a is the default. MATLAB R2012b, R2014a, and R2016a are also available.
    • The head node has releases R2012b, R2013a, and R2016a installed.
    • The interactive node has releases R2013a and R2014a installed.
  • Use the module load and module unload commands to switch between MATLAB versions on the head node or the interactive node.
    • To view what modules are available: module avail
    • To load matlab R2012b and add the executables to your PATH: module load matlab/R2012b
    • After loading the module, run R2012b by typing: matlab
    • To unload/remove R2012b from your PATH: module unload matlab/R2012b
    • You can do the same for matlab/R2013a. [R2013a is the default if you log in and just run matlab.]
    • To force the addition of R2013a to your PATH: module load matlab/R2013a

How to Compile Matlab code?

Assume we have a matlab file named matlabruntimetest.m:

# cat matlabruntimetest.m
 for i=1:10,
 fprintf('%d\n',i);
 end

We use mcc to compile the .m into a binary executable file. You can do this from within the MATLAB application, or from the command line after the correct module is loaded:

mcc -m matlabruntimetest.m

This compiles your .m file into a standalone executable named matlabruntimetest, which runs against the MATLAB runtimes rather than the full MATLAB application. mcc produces several other files in the process. On Linux, one of these should be a shell script named run_matlabruntimetest.sh, which you will need in order to run the executable. (There should also be a readme.txt, which is available as a reference and contains far more detail than most users need.)

Some useful extra options may be added to mcc:

mcc -R -nodisplay -R -singleCompThread -m matlabruntimetest.m

The first option says that the executable will not display anything, and the second disables multithreading. It is not clear that the latter is strictly necessary in batch, but it may help keep jobs from interfering with each other when they run on the same batch node.

How to run compiled Matlab code on the Batch node?

Create a job script similar to the following. Be sure your path-to-runtimes matches the version you compiled with. For R2012b, the path needs to be /opt/matlab/MATLAB_Compiler_Runtime/v80, while for R2014a it should be /opt/matlab/MATLAB_Compiler_Runtime/v83. R2013a is shown in the example below. For your convenience, the runtimes for R2012b, R2013a, and R2014a have been installed on all nodes. (This includes the login and interactive nodes - but please only do very brief runs there, for testing purposes.)

$ cat testmatlab.sh

 #!/bin/bash
 #PBS -l walltime=00:02:00,nodes=1
 #PBS -j oe
 #PBS -N testmatlab
 #PBS -q test

 # Turn on echo of shell commands
 set -x
 # Because jobs start in the HOME directory, move to where we submitted.
 cd $PBS_O_WORKDIR
 pwd
 echo "PBS_O_WORKDIR is `pwd`"
 echo "env is `env`"
 # copy the binary and data files to a local directory on node job is executing on
 cp $HOME/bin/run_matlabruntimetest.sh $TMPDIR
 cp $HOME/bin/matlabruntimetest $TMPDIR
 # run your executable from the local disk on the node the job was placed on
 # the command line arguments are: matlab_script path-to-runtimes outputfile
 cd $TMPDIR
 ./run_matlabruntimetest.sh /opt/matlab/MATLAB_Compiler_Runtime/v81 >&matlab.stdout 
 # Copy output files to your output folder       
 cp -f matlab.stdout $HOME
 sleep 10

 # end script; the sleep is only there for the purpose of you confirming 
 # you can see if it is running w/ 'showq' or 'checkjob'.

Now submit your script to the scheduler, taking note of the id# or job number that is returned:

qsub testmatlab.sh

To view your job's progress:

showq
showq -u myusername
checkjob id#

After the job runs, the output file specified by the script, matlab.stdout, will appear in your home directory.
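Since the runtime directory has to match the MATLAB release used for compiling, a job script can look the directory up rather than hard-code it. The sketch below takes its release-to-version mapping from the Installed Software table; the function names are hypothetical, and the default base directory is an assumption (the table and the text above differ in the capitalization of /opt/MATLAB, so confirm the actual path with ls on the node).

```shell
# Map a MATLAB release to its Compiler Runtime version directory.
mcr_version() {
    case "$1" in
        R2012b) echo v80 ;;
        R2013a) echo v81 ;;
        R2014a) echo v83 ;;
        R2016a) echo v901 ;;
        *) echo "unknown MATLAB release: $1" >&2; return 1 ;;
    esac
}

# Print the full runtime path for a release.
#   $1: release, e.g. R2013a
#   $2: base directory (assumed default shown; verify on the cluster)
mcr_path() {
    local base="${2:-/opt/MATLAB/MATLAB_Compiler_Runtime}"
    local ver
    ver=$(mcr_version "$1") || return 1
    printf '%s/%s\n' "$base" "$ver"
}

# Example use inside a job script:
#   ./run_matlabruntimetest.sh "$(mcr_path R2013a)" >&matlab.stdout
```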

Matlab code Tips

Here are a few tips for writing MATLAB code that is suitable for mcc:

  • If your .m file is a function that takes one or more arguments, put these arguments on the command line after the path to the runtimes. But remember that the arguments will be input as strings, just like any arguments (argv) to a C program. Therefore your .m file should use str2num or str2double to convert any input arguments to the expected type.
  • Dump figures into files using "saveas" or "print"; or, write a separate postprocessing script that reads data files written by multiple tasks and produces the final plots.

Some minor gotchas...

  • Don't nest curly braces within comment blocks, e.g., %{ comment1 %{ comment2 %} %} - the braces may be misinterpreted as clause delimiters in C, due to an mcc bug.
  • Commands from the Symbolic Math Toolbox may not be recognized, e.g., heaviside().

It is possible to get much more elaborate with MATLAB's mcc system by, for example, using "mcc -l" to create a function library, or perhaps by implementing parallelism, but the above should be enough to get you started. For more information, see the MATLAB help as well as "mcc -?".

Interactive Processing

  • Users can login to the interactive node and run software.
  • Users can submit jobs to the batch node from the interactive node or the cluster head node.
  • There is no scheduling on the interactive node; users will compete for resources.
  • For help with interactive sessions on Linux servers, see the instructions at Connect to Linux

XNAT

  1. Select Login via LDAP
  2. Enter your CAC user name and password
  3. After authentication is successful, you will receive a separate email that your account has been approved.

Monitoring the Cluster

HD-HNI Cluster Monitor

NEED HELP?

Password resets can be requested at: password reset
Cluster technical questions should be addressed to: CAC HelpDesk
Human Ecology related questions should be addressed to: Human Ecology HelpDesk