Linux Usage Tips

From CAC Documentation wiki
Revision as of 17:58, 18 September 2019 by Srl6 (talk | contribs)

Linux shells

  • /bin/sh is the default login shell.
    • Edit $HOME/.profile to change interactive variables.
    • The $HOME/.bashrc file will not be run for interactive non-login shells.
  • /bin/bash
    • Edit $HOME/.profile to change interactive variables.
    • The $HOME/.bashrc file will be run for interactive non-login shells.
  • /bin/csh and /bin/tcsh
    • Edit $HOME/.login to change interactive variables.
    • The $HOME/.cshrc file will be run for non-interactive shells.

The change-shell command, chsh, will not permanently change your shell. To change it, send a request to Support instead.

The default login shell on linuxlogin is sh. Be aware that in CentOS, /bin/sh is a soft-link to /bin/bash, so you are really using a variant of bash. Accordingly, you will find that "man sh" brings up the man page (the help document) for bash. In a way, then, you can think of your login shell as being bash, too.

There are slight differences between sh and bash, however. The "Invocation" section of the man page states: "If bash is invoked with the name sh, it tries to mimic the startup behavior of historical versions of sh as closely as possible." Therefore, you will find that ~/.profile is run at login, because this behavior is common to both sh and bash; but any interactive sh shells you start thereafter will not run ~/.bashrc as you might expect from bash. The way to get sh to do this is to "export ENV=~/.bashrc" beforehand (perhaps as part of your .profile).
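
As a minimal sketch of the ENV approach described above, the following lines could go into ~/.profile. It assumes that ~/.bashrc exists and contains only commands that are safe to run in every interactive shell.

```shell
# ~/.profile fragment: make interactive sh shells read ~/.bashrc.
# When bash is invoked as sh, it expands $ENV and runs that file
# at the start of each interactive shell.
ENV="$HOME/.bashrc"
export ENV
```

After the next login, starting a new interactive sh should pick up any aliases or variables defined in ~/.bashrc.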

Let's say you simply prefer to have bash as your default shell and be done with it. There are two ways to accomplish this. First, you can "export SHELL=/bin/bash" in your .profile; then all subsequent interactive shells will truly be bash. Second, you can enter "chsh -s /bin/bash", which forces all login and interactive shells to be bash (because you have changed your default shell). The problem with the second method is it may well wreck your batch environment, too, because the scheduler sets it up under the assumption that the login shell is sh.

The relationship between the csh and tcsh shells is similar to the one between sh and bash. For instance, your csh shells are automatically endowed with the tcsh-style ability to retrieve history through the up- and down-arrow keys. The best way to make tcsh into your everyday working shell is to run it on top of sh after you log in (again, you can do this as part of your .profile).
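
One possible way to run tcsh on top of sh from ~/.profile is sketched below. The interactivity check and the helper name is_interactive are assumptions added here for illustration; the intent is that batch jobs, which are not interactive, keep the plain sh environment the scheduler expects.

```shell
# ~/.profile fragment: start tcsh on top of the sh login shell.
# is_interactive is a hypothetical helper; it inspects the shell's
# option flags ("$-"), which contain "i" for interactive shells.
is_interactive() {
    case "$1" in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Launch tcsh only for interactive logins, and only if it is installed.
if is_interactive "$-" && [ -x /bin/tcsh ]; then
    /bin/tcsh
fi
```

Because tcsh runs as a child of sh in this sketch, exiting tcsh returns you to the sh login shell rather than logging you out.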

Compiling and linking code on Linux

Use /tmp to compile large codes and software packages. This will provide improved performance and greater system stability.

To find out which processor features a cluster supports, submit a batch job that runs "cat /proc/cpuinfo" to report the CPU type. The v4 cluster is composed mostly of Intel E5420 CPUs (as of Nov. 2011). Wikipedia's Intel Xeon page or Intel's ARK will then tell you that these are Harpertown cores that support SSE, SSE2, SSE3, SSSE3, SSE4.1 and VMX.
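
Rather than reading the whole of /proc/cpuinfo, a batch job could print just the model name and the relevant feature flags. The function below is a sketch; the name cpu_summary and the optional file argument are illustrative only. Note that Linux reports SSE3 under the flag name "pni".

```shell
# Print the CPU model, then the SIMD/virtualization flags of interest,
# one per line. Reads /proc/cpuinfo unless another file is given.
cpu_summary() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # Value of the first "model name" line (text after the colon).
    grep -m 1 '^model name' "$cpuinfo" | sed 's/^[^:]*:[[:space:]]*//'
    # Split the first "flags" line into words and keep selected ones.
    grep -m 1 '^flags' "$cpuinfo" | sed 's/^[^:]*://' | tr ' ' '\n' |
        grep -E '^(sse|sse2|pni|ssse3|sse4_1|vmx)$'
}
```

Calling cpu_summary with no arguments inside a batch script reports on the compute node where the job runs.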

C/C++ and Fortran Codes
  • GNU compilers gcc, g++, g77, gfortran are in /usr/bin, which is in the default path.
    • For compiling OpenMP directives, add the option -fopenmp.
  • Intel 12.1 compilers icc, ifort are in the default path on the login nodes.
    • For compiling OpenMP directives, add the option -openmp.
    • The following Intel libraries and tools are available to you automatically through the default setup on the login nodes:
      - MKL, the Math Kernel Library 10.3.6 (additional help below)
      - idb, the Intel debugger for Linux
      - TBB, the Threading Building Blocks
      - IPP, the Integrated Performance Primitives
    • If any of the above libraries are linked dynamically, the correct runtimes will be loaded automatically on the compute nodes by default; no additional setup is required.
    • Note - if you find that your code segfaults after compiling with Intel 12.1, try disabling optimization or using the older 11.1 version of the compilers instead.
      Reason: there is a known bug in the vectorizer of the 12.1 compiler which is due to be fixed in a future release.
  • Intel 11.1 compilers icc, ifort are available, also, but these older compilers require special setup files.
    • Before compiling in bash: source /opt/intel/intel-11.sh
    • Before compiling in tcsh: source /opt/intel/intel-11.csh
    • At runtime, in a batch sh-script: source /opt/intel/intel-11.sh
    • At runtime, in a batch csh-script: source /opt/intel/intel-11.csh
    • The above steps also enable the use of the older Intel performance libraries, e.g., MKL 10.2 (additional information below).
  • Help for Intel compilers (if you are using 11.1, be sure to source the setup file first):
    • Fortran: man ifort, info ifort, ifort -help
    • C/C++: man icc, info icc, icc -help
  • Standard compiler options - the clusters have Intel Core2 processors, so standard compiler options are:
    • For Intel: -O3 -ipo -mtune=pentium4 -march=pentium4
  • Other options of possible interest (consult man pages):
    • For Intel: -fno-alias -align -scalar_rep -prefetch
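
Since the OpenMP option differs between the GNU and Intel compilers listed above, a makefile or build script may want to select it by compiler name. The helper below is a sketch; the function name openmp_flag and the file name mycode.c are hypothetical.

```shell
# Print the OpenMP option for a given compiler, as listed above.
openmp_flag() {
    case "$1" in
        gcc|g++|gfortran) printf '%s\n' "-fopenmp" ;;
        icc|ifort)        printf '%s\n' "-openmp" ;;
        *) printf 'unknown compiler: %s\n' "$1" >&2; return 1 ;;
    esac
}

# Example use (not run here):
#   CC=icc
#   $CC $(openmp_flag "$CC") -O3 mycode.c -o mycode
```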
Generating Debugging Info
  • Intel compilers
    • icc -Wall
    • ifort -g -debug -warn -C (-CB for bounds checking only)
MPI Programs

For compiling MPI codes, we recommend using mpicc and mpif90. If you specifically need a C++ compiler, try mpicxx. Because of these handy wrapper scripts, you may not need to do very much to convert existing makefiles to work with CAC's preferred software stack. Currently the default paths are set up so that the mpicc, mpif90 and mpicxx utilities invoke the Intel 12.1 compilers to compile your codes and link them properly to the Intel MPI 3.1 libraries. However, if you run the Intel 11.1 compiler setup file first, then these utilities will automatically use the older 11.1 compiler version. Documentation for the Intel MPI 3.1 Library, including mpdboot and mpiexec, is in PDF on the Intel Support Site.

To view a sample batch script that will run an MPI job for you, see the section on Running a parallel MPI job.

The ROCKS operating system comes with several alternate MPI implementations (e.g., mpich2, OpenMPI). To use them, you will need to adjust environment variables and paths yourself.

Intel MKL

Intel's Math Kernel Library (MKL) is a good source of optimized routines for linear algebra, Fast Fourier Transforms, vector math, and other mathematical operations. In particular, it provides a way to incorporate Intel's optimized BLAS and LAPACK routines into your code.

OpenMP multithreading is built into certain MKL libs. When these libs are linked, calls to MKL will detect the same settings that would affect any other OpenMP-enabled code. This means MKL will attempt to use all the cores present on a v4 node during the execution of parallelized sections. Therefore, when you link your code with a "_thread" version of the MKL library, you should realize that all your calls to MKL will generally fork as many threads as there are cores present. This may cause undesired interference with other parallelization strategies you are using, e.g., MPI. If this is not the behavior you want, you can do one of two things:

  • Link with "mkl_sequential" (or -mkl=sequential in 12.1) instead of, e.g., "mkl_intel_thread" (or -mkl=parallel in 12.1).
  • At run time, set the OMP_NUM_THREADS environment variable to 1. (Use "export" or "setenv".) A value of 8 recovers the default behavior on v4.
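
The run-time option can be sketched as a fragment of an sh batch script. The executable name in the final comment is hypothetical.

```shell
# Batch script fragment: keep MKL single-threaded, e.g. when each
# MPI process should use only one core.
OMP_NUM_THREADS=1
export OMP_NUM_THREADS
# csh/tcsh equivalent: setenv OMP_NUM_THREADS 1

# ./mycode    # hypothetical executable linked with a threaded MKL
```
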
Linking Intel MKL 10.3.6 with the Intel 12.1 Compilers

MKL 10.3.6 is the version installed with the 12.1 compilers. The easiest way to link MKL is to compile as follows, where the last two lines pertain to MPI codes:

  • icc mycode.c -o mycode -mkl
  • ifort mycode.f90 -o mycode -mkl
  • mpicc mympicode.c -o mympicode -mkl
  • mpif90 mympicode.f90 -o mympicode -mkl

Note, -mkl is the same as -mkl=parallel, which enables OpenMP multithreading. If you don't want this, use -mkl=sequential.

With just plain -mkl (or -mkl=...), the resulting executable will be dynamically linked. This means that at run time, your program has to know where to find the MKL shared libraries. Since MKL 10.3.6 is the default, the appropriate paths have been predefined for you on the compute nodes, and your batch jobs should have no trouble.

Should you want to link MKL in some different way--e.g., statically--the compile line will start looking messier. Linking to MKL has become rather complicated due to Intel's decision to maximize MKL's flexibility and multi-platform compatibility by splitting out four separate layers of libraries: interface, threading, computational, and runtime (meaning OpenMP, if the _thread lib is requested). To make sure you have all these layers, we recommend appending one of the following snippets to your ifort, icc, mpif90, or mpicc command (after first setting MKLPATH = $MKLROOT/lib/intel64):

  • static, multithreaded:
    $MKLPATH/libmkl_solver_lp64.a -Wl,--start-group $MKLPATH/libmkl_intel_lp64.a $MKLPATH/libmkl_intel_thread.a $MKLPATH/libmkl_core.a -Wl,--end-group -openmp -lpthread
  • static, sequential:
    $MKLPATH/libmkl_solver_lp64_sequential.a -Wl,--start-group $MKLPATH/libmkl_intel_lp64.a $MKLPATH/libmkl_sequential.a $MKLPATH/libmkl_core.a -Wl,--end-group -lpthread

These options will generate a (mostly) statically linked executable. Note, each .a-lib must be identified by its full path in order to prevent the .so-lib (its dynamic equivalent) from being found instead. If you do not need access to the MKL solver routines, simply remove that item from the head of the list. As noted previously, if your main program is itself threaded with OpenMP, or if it is parallelized with MPI, you may want to select libmkl_sequential.a in order to reduce contention and get better performance.
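
To avoid retyping the static link snippets above, they could be generated by a small shell function. This is only a sketch: the function name mkl_static_flags is hypothetical, and it assumes MKLROOT has been set by the Intel compiler setup (or that MKLPATH is already defined).

```shell
# Print the static MKL link options for the requested threading mode.
# Usage: mkl_static_flags parallel|sequential
mkl_static_flags() {
    # MKLPATH as defined in the text, unless already set by the caller.
    mklpath="${MKLPATH:-$MKLROOT/lib/intel64}"
    case "$1" in
        parallel)
            solver=libmkl_solver_lp64.a
            thread=libmkl_intel_thread.a
            extra="-openmp -lpthread" ;;
        sequential)
            solver=libmkl_solver_lp64_sequential.a
            thread=libmkl_sequential.a
            extra="-lpthread" ;;
        *)
            printf 'usage: mkl_static_flags parallel|sequential\n' >&2
            return 1 ;;
    esac
    printf '%s\n' "$mklpath/$solver -Wl,--start-group $mklpath/libmkl_intel_lp64.a $mklpath/$thread $mklpath/libmkl_core.a -Wl,--end-group $extra"
}
```

A compile line would then look like: icc mycode.c -o mycode $(mkl_static_flags sequential)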

To generate a dynamically linked rather than statically linked executable, the above options become:

  • dynamic, multithreaded:
    $MKLPATH/libmkl_solver_lp64.a -Wl,--start-group -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -Wl,--end-group -openmp -lpthread
  • dynamic, sequential:
    $MKLPATH/libmkl_solver_lp64_sequential.a -Wl,--start-group -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -Wl,--end-group -lpthread

These sets of options are pretty much equivalent to -mkl=parallel and -mkl=sequential, respectively.

If your batch script needs strict control over LD_LIBRARY_PATH, then one other compiler/linker option may be helpful for a dynamically-linked code:

  • -Wl,-rpath,$MKLPATH,-rpath,$IOMPPATH

The above variables should be set to MKLPATH = $MKLROOT/lib/intel64 and IOMPPATH = $MKLROOT/../compiler/lib/intel64. This option "hardwires" the correct paths into the executable; these paths are valid on both the v4 compute nodes and the v4 login nodes. If you don't wish to restrict your executable in this fashion, the alternative is to add these paths manually to LD_LIBRARY_PATH.
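
The rpath option can be assembled from MKLROOT as in the following sketch. The function name mkl_rpath_flag is hypothetical, and MKLROOT is assumed to be set by the default Intel 12.1 environment.

```shell
# Print the linker option that embeds the MKL and OpenMP runtime
# library paths into a dynamically linked executable.
mkl_rpath_flag() {
    mklpath="$MKLROOT/lib/intel64"
    iomppath="$MKLROOT/../compiler/lib/intel64"
    printf '%s\n' "-Wl,-rpath,$mklpath,-rpath,$iomppath"
}

# Example use (not run here):
#   icc mycode.c -o mycode -mkl $(mkl_rpath_flag)
```

Running "readelf -d mycode" on the resulting executable should then list the embedded library search paths.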

Intel has put together a helpful tool for generating the correct linker options to match your specific needs, the Link Line Advisor. This is definitely the place to go if you want to, e.g., use extra-long integers or compile with gcc or gfortran. It's well worth a visit.

Much more information on linking MKL 10.3.6 can be found in the "Linking Your Application" section of the User Guide, which you can access from the login node ("firefox /opt/intel/composer_xe_2011_sp1.6.233/Documentation/en_US/mkl/mkl_userguide/index.htm").

Linking Intel MKL 10.2 with the Intel 11.1 Compilers

MKL 10.2 is the version installed with the 11.1 compilers. Since 11.1 is not the current default version of the compilers, you must first source the setup file:

  • In bash (or sh): source /opt/intel/intel-11.sh
  • In tcsh (or csh): source /opt/intel/intel-11.csh

If your program links MKL dynamically, it has to know where to find the correct MKL shared libraries at run time. Bear in mind that MKL 10.2 is not the default on the compute nodes, either. The easiest way to ensure correct behavior at run time is to put the same line into your batch script:

  • In a batch sh-script: source /opt/intel/intel-11.sh
  • In a batch csh-script: source /opt/intel/intel-11.csh

Otherwise the instructions for linking MKL 10.2 are identical to the instructions for MKL 10.3.6 and Intel 12.1 above. There is one exception: you need to set MKLPATH = $MKLROOT/lib/em64t and IOMPPATH = $MKLROOT/../lib/intel64.

The Link Line Advisor also covers older versions of the Intel compilers and MKL.

Much more information on linking MKL 10.2 can be found in Sec. 5 of the User Guide, which you can access from the login node ("firefox /opt/intel/Compiler/11.1/072/Documentation/en_US/mkl/userguide.pdf").

FAQ

How do I determine my program's dependencies on shared library (.so) files?
  • ldd - see the man page.

If your program cannot find all the .so files it needs, you may need to add paths to the LD_LIBRARY_PATH shell variable.
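
The two steps above can be sketched as shell helpers. The function names missing_libs and prepend_ld_path, and the directory in the usage example, are hypothetical.

```shell
# List the shared libraries that the loader cannot resolve for an
# executable. Prints nothing if every dependency is found.
missing_libs() {
    ldd "$1" | grep 'not found'
}

# Prepend a directory to LD_LIBRARY_PATH without leaving a stray
# colon when the variable is initially empty or unset.
prepend_ld_path() {
    LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export LD_LIBRARY_PATH
}
```

For example, if "missing_libs ./mycode" reports a library that lives in /path/to/libs, run "prepend_ld_path /path/to/libs" and check again.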

How do I display an image file (such as jpeg or gif)?
  • display mypic.jpg - uses one of the many ImageMagick tools - see "man ImageMagick" for help on this and various file format converters.
  • firefox mypic.jpg - any decent Web browser can handle it.

Note, the image will show up only if you have X11 forwarding enabled (for example, by connecting with "ssh -X").

How do I use revision control?
  • Subversion, Git and CVS are examples of revision control (or version control or source control) software, which means they help you collaborate with others on revising your source code by saving versions of the code as you write it. Clients for all three are installed on the login nodes. See the man pages for svn, git and cvs for details. To see the installed versions, type the commands with --version.
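
A quick way to check all three clients at once is sketched below; the function name vcs_versions is hypothetical.

```shell
# Report the version banner of each revision control client,
# or note that the client is not installed.
vcs_versions() {
    for tool in svn git cvs; do
        if command -v "$tool" >/dev/null 2>&1; then
            # Print only the first non-empty line of the banner.
            "$tool" --version 2>&1 | awk 'NF && !done { print; done = 1 }'
        else
            printf '%s: not installed\n' "$tool"
        fi
    done
}
```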

CIT runs a free TeamForge server for Subversion users. You can log in with Cornell Single Sign-on. There is also a GitHub server that is intended for users in Engineering, CIS, and Cornell Tech.