Compiling Code Linux

From CAC Documentation wiki

Use /tmp to compile large codes and software packages. This will provide improved performance and greater system stability.
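
As a rough illustration of this advice, the sketch below copies a source tree into a scratch directory under /tmp, runs the build there, and copies the results back. The function name build_in_tmp and the project path in the usage comment are hypothetical, not part of the CAC setup.

```shell
# build_in_tmp SRCDIR CMD [ARGS...]
# Copy SRCDIR to a scratch directory under /tmp, run CMD there,
# then copy the results back and remove the scratch directory.
build_in_tmp() {
    srcdir=$1
    shift
    builddir=$(mktemp -d /tmp/build.XXXXXX) || return 1
    cp -R "$srcdir"/. "$builddir"/ || { rm -rf "$builddir"; return 1; }
    if ( cd "$builddir" && "$@" ); then
        # Build succeeded: bring the build products back to SRCDIR.
        cp -R "$builddir"/. "$srcdir"/
        status=$?
    else
        status=1
    fi
    rm -rf "$builddir"
    return $status
}

# Hypothetical usage: build a project that has a makefile
# build_in_tmp "$HOME/myproject" make
```

If the build command fails, the scratch directory is removed and nothing is copied back, so the original source tree is left as it was.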

To find out which processor features a cluster supports, submit a batch job that runs "cat /proc/cpuinfo" to identify the CPU type. As of Nov. 2011, the v4 cluster is composed mostly of Intel E5420 CPUs. Wikipedia's Intel Xeon page or Intel's ARK shows that these are Harpertown cores, which support SSE, SSE2, SSE3, SSSE3, SSE4.1 and VMX.
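
Instead of reading the full cpuinfo output by eye, a batch job could check for specific features directly. The following is a minimal sketch; the helper name cpu_has_flag is made up for this example. Note that /proc/cpuinfo lists SSE3 as "pni" and SSE4.1 as "sse4_1".

```shell
# cpu_has_flag FLAG [FILE]
# Succeeds if FLAG appears in the first "flags" line of FILE
# (default: /proc/cpuinfo).
cpu_has_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    flags=$(grep -m1 '^flags' "$file") || return 1
    case " $flags " in
        *" $flag "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report the CPU model and a few SIMD extensions on this node.
if [ -r /proc/cpuinfo ]; then
    grep -m1 'model name' /proc/cpuinfo || true
    for f in sse2 pni ssse3 sse4_1 vmx; do
        if cpu_has_flag "$f"; then echo "$f: yes"; else echo "$f: no"; fi
    done
fi
```

The optional second argument makes it possible to examine a saved copy of the cpuinfo output from a compute node while working on the login node.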

C/C++ and Fortran Codes
  • GNU compilers gcc, g++, g77, gfortran are in /usr/bin, which is in the default path.
    • For compiling OpenMP directives, add the option -fopenmp.
  • Intel 12.1 compilers icc, ifort are in the default path on the login nodes.
    • For compiling OpenMP directives, add the option -openmp.
    • The following Intel libraries and tools are available to you automatically through the default setup on the login nodes:
      - MKL, the Math Kernel Library 10.3.6 (additional help below)
      - idb, the Intel debugger for Linux
      - TBB, the Threading Building Blocks
      - IPP, the Integrated Performance Primitives
    • If any of the above libraries are linked dynamically, the correct runtimes will be loaded automatically on the compute nodes by default; no additional setup is required.
    • Note - if you find that your code segfaults after compiling with Intel 12.1, try disabling optimization or using the older 11.1 version of the compilers instead.
      Reason: there is a known bug in the vectorizer of the 12.1 compiler which is due to be fixed in a future release.
  • Intel 11.1 compilers icc, ifort are also available, but these older compilers require special setup files.
    • Before compiling in bash: source /opt/intel/
    • Before compiling in tcsh: source /opt/intel/intel-11.csh
    • At runtime, in a batch sh-script: source /opt/intel/
    • At runtime, in a batch csh-script: source /opt/intel/intel-11.csh
    • The above steps also enable the use of the older Intel performance libraries, e.g., MKL 10.2 (additional information below).
  • Help for Intel compilers (if you are using 11.1, be sure to source the setup file first):
    • Fortran: man ifort, info ifort, ifort -help
    • C/C++: man icc, info icc, icc -help
  • Standard compiler options - the clusters have Intel Core2 processors, so standard compiler options are:
    • For Intel: -O3 -ipo -mtune=pentium4 -march=pentium4
  • Other options of possible interest (consult man pages):
    • For Intel: -fno-alias -align -scalar_rep -prefetch
Generating Debugging Info
  • Intel compilers
    • icc -Wall
    • ifort -g -debug -warn -C (-CB for bounds checking only)
MPI Programs

For compiling MPI codes, we recommend using mpicc and mpif90. If you specifically need a C++ compiler, try mpicxx. Because of these handy wrapper scripts, you may not need to do very much to convert existing makefiles to work with CAC's preferred software stack. Currently the default paths are set up so that the mpicc, mpif90 and mpicxx utilities invoke the Intel 12.1 compilers to compile your codes and link them properly to the Intel MPI 3.1 libraries. However, if you run the Intel 11.1 compiler setup file first, then these utilities will automatically use the older 11.1 compiler version. Documentation for the Intel MPI 3.1 Library, including mpdboot and mpiexec, is in PDF on the Intel Support Site.

To view a sample batch script that will run an MPI job for you, see the section on Running a parallel MPI job.

The ROCKS operating system comes with several alternate MPI implementations (e.g., mpich2, OpenMPI). To use one of them, you must adjust your environment variables and paths yourself.

Intel MKL

Intel's Math Kernel Library (MKL) is a good source of optimized routines for linear algebra, Fast Fourier Transforms, vector math, and other mathematical operations. In particular, it provides a way to incorporate Intel's optimized BLAS and LAPACK routines into your code.

OpenMP multithreading is built into certain MKL libs. When these libs are linked, calls to MKL will detect the same settings that would affect any other OpenMP-enabled code. This means MKL will attempt to use all the cores present on a v4 node during the execution of parallelized sections. Therefore, when you link your code with a "_thread" version of the MKL library, be aware that your calls to MKL will generally fork as many threads as there are cores present. This may interfere with other parallelization strategies you are using, e.g., MPI. If this is not the behavior you want, you can do one of two things:

  • Link with "mkl_sequential" (or -mkl=sequential in 12.1) instead of, e.g., "mkl_intel_thread" (or -mkl=parallel in 12.1).
  • At run time, set the OMP_NUM_THREADS environment variable to 1. (Use "export" or "setenv".) A value of 8 recovers the default behavior on v4.
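
When MPI and threaded MKL are combined, an intermediate thread count is sometimes chosen so that the MPI ranks on a node share its cores. The bash fragment below is only a sketch of that arithmetic for a batch script; the figure of 2 ranks per node is a hypothetical value picked for illustration, while 8 cores per node matches the v4 default mentioned above.

```shell
# Assumed values for illustration: 8 cores per node (the v4 default)
# and a hypothetical 2 MPI ranks per node.
cores_per_node=8
ranks_per_node=2

# Give each rank an equal share of the cores for its MKL/OpenMP threads.
OMP_NUM_THREADS=$((cores_per_node / ranks_per_node))
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

In tcsh the equivalent final step would be "setenv OMP_NUM_THREADS 4". Setting the value to 1 makes every MKL call run sequentially.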
Linking Intel MKL 10.3.6 with the Intel 12.1 Compilers

MKL 10.3.6 is the version installed with the 12.1 compilers. The easiest way to link MKL is to compile as follows, where the last two lines pertain to MPI codes:

  • icc mycode.c -o mycode -mkl
  • ifort mycode.f90 -o mycode -mkl
  • mpicc mympicode.c -o mympicode -mkl
  • mpif90 mympicode.f90 -o mympicode -mkl

Note, -mkl is the same as -mkl=parallel, which enables OpenMP multithreading. If you don't want this, use -mkl=sequential.

With just plain -mkl (or -mkl=...), the resulting executable will be dynamically linked. This means that at run time, your program has to know where to find the MKL shared libraries. Since MKL 10.3.6 is the default, the appropriate paths have been predefined for you on the compute nodes, and your batch jobs should have no trouble.

Should you want to link MKL in some different way--e.g., statically--the compile line will start looking messier. Linking to MKL has become rather complicated due to Intel's decision to maximize MKL's flexibility and multi-platform compatibility by splitting out four separate layers of libraries: interface, threading, computational, and runtime (meaning OpenMP, if the _thread lib is requested). To make sure you have all these layers, we recommend appending one of the following snippets to your ifort, icc, mpif90, or mpicc command (after first setting MKLPATH = $MKLROOT/lib/intel64):

  • static, multithreaded:
    $MKLPATH/libmkl_solver_lp64.a -Wl,--start-group $MKLPATH/libmkl_intel_lp64.a $MKLPATH/libmkl_intel_thread.a $MKLPATH/libmkl_core.a -Wl,--end-group -openmp -lpthread
  • static, sequential:
    $MKLPATH/libmkl_solver_lp64_sequential.a -Wl,--start-group $MKLPATH/libmkl_intel_lp64.a $MKLPATH/libmkl_sequential.a $MKLPATH/libmkl_core.a -Wl,--end-group -lpthread

These options will generate a (mostly) statically linked executable. Note, each .a-lib must be identified by its full path in order to prevent the .so-lib (its dynamic equivalent) from being found instead. If you do not need access to the MKL solver routines, simply remove that item from the head of the list. As noted previously, if your main program is itself threaded with OpenMP, or if it is parallelized with MPI, you may want to select libmkl_sequential.a in order to reduce contention and get better performance.
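
To show how such a snippet might be assembled in practice, here is a sketch in sh/bash syntax for the static, sequential case without the solver library. The fallback value for MKLROOT is a placeholder for illustration only (the Intel environment normally defines it), and the variable name MKL_LIBS is arbitrary.

```shell
# MKLROOT is normally set by the Intel environment; the fallback below
# is a placeholder for illustration only.
MKLROOT=${MKLROOT:-/path/to/mkl}
MKLPATH=$MKLROOT/lib/intel64     # no spaces around "=" in sh/bash

# Static, sequential MKL (solver library omitted).
MKL_LIBS="-Wl,--start-group $MKLPATH/libmkl_intel_lp64.a $MKLPATH/libmkl_sequential.a $MKLPATH/libmkl_core.a -Wl,--end-group -lpthread"

# Print the compile command; remove "echo" to actually run it.
echo mpicc mympicode.c -o mympicode $MKL_LIBS
```

Printing the command first makes it easy to confirm that every .a-lib appears with its full path before the real compile is launched.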

To generate a dynamically linked rather than statically linked executable, the above options become:

  • dynamic, multithreaded:
    $MKLPATH/libmkl_solver_lp64.a -Wl,--start-group -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -Wl,--end-group -openmp -lpthread
  • dynamic, sequential:
    $MKLPATH/libmkl_solver_lp64_sequential.a -Wl,--start-group -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -Wl,--end-group -lpthread

These sets of options are pretty much equivalent to -mkl=parallel and -mkl=sequential, respectively.

If your batch script needs strict control over LD_LIBRARY_PATH, then one other compiler/linker option may be helpful for a dynamically-linked code:

  • -Wl,-rpath,$MKLPATH,-rpath,$IOMPPATH

The above variables should be set to MKLPATH = $MKLROOT/lib/intel64 and IOMPPATH = $MKLROOT/../compiler/lib/intel64. This option "hardwires" the correct paths into the executable; these paths are valid on both the v4 compute nodes and the v4 login nodes. If you don't wish to restrict your executable in this fashion, the alternative is to add these paths manually to LD_LIBRARY_PATH.
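
A sketch of how the rpath option could be put together in sh/bash follows. The fallback value for MKLROOT is a placeholder, the variable name RPATH_OPT is arbitrary, and the compile command is only printed rather than run.

```shell
# MKLROOT is normally set by the Intel environment; the fallback below
# is a placeholder for illustration only.
MKLROOT=${MKLROOT:-/path/to/mkl}
MKLPATH=$MKLROOT/lib/intel64
IOMPPATH=$MKLROOT/../compiler/lib/intel64

# Single linker option that embeds both library directories.
RPATH_OPT="-Wl,-rpath,$MKLPATH,-rpath,$IOMPPATH"

# Print the compile command; remove "echo" to actually run it.
echo icc mycode.c -o mycode -mkl "$RPATH_OPT"

# After building, the embedded paths can be inspected with:
#   readelf -d mycode | grep -i -E 'rpath|runpath'
```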

Intel has put together a helpful tool for generating the correct linker options to match your specific needs, the Link Line Advisor. This is definitely the place to go if you want to, e.g., use extra-long integers or compile with gcc or gfortran. It's well worth a visit.

Much more information on linking MKL 10.3.6 can be found in the "Linking Your Application" section of the User Guide, which you can access from the login node ("firefox /opt/intel/composer_xe_2011_sp1.6.233/Documentation/en_US/mkl/mkl_userguide/index.htm").

Linking Intel MKL 10.2 with the Intel 11.1 Compilers

MKL 10.2 is the version installed with the 11.1 compilers. Since 11.1 is not the current default version of the compilers, you must first source the setup file:

  • In bash (or sh): source /opt/intel/
  • In tcsh (or csh): source /opt/intel/intel-11.csh

If your program links MKL dynamically, it has to know where to find the correct MKL shared libraries at run time. Bear in mind that MKL 10.2 is not the default on the compute nodes, either. The easiest way to ensure correct behavior at run time is to put the same line into your batch script:

  • In a batch sh-script: source /opt/intel/
  • In a batch csh-script: source /opt/intel/intel-11.csh

Otherwise the instructions for linking MKL 10.2 are identical to the instructions for MKL 10.3.6 and Intel 12.1 above. There is one exception: you need to set MKLPATH = $MKLROOT/lib/em64t and IOMPPATH = $MKLROOT/../lib/intel64.

The Link Line Advisor can be applied to older versions of the Intel compilers and MKL. It's well worth a visit.

Much more information on linking MKL 10.2 can be found in Sec. 5 of the User Guide, which you can access from the login node ("firefox /opt/intel/Compiler/11.1/072/Documentation/en_US/mkl/userguide.pdf").