Archival Storage

From CAC Documentation wiki
Revision as of 14:50, 30 September 2015 by Ad876 (talk | contribs)

What is CAC Archival Storage?

  • CAC Archival Storage is a low-cost, high-performance option for storing research data. It is available only to users within Cornell University.
  • CAC Archival Storage cannot be mounted by running jobs. Instead, users must transfer their data from CAC Archival Storage to an accessible server using Globus Online.
  • Globus Online users can easily add, delete, and share their data through any Globus Online endpoint.
  • Some of the Globus Online endpoints available include:
    • storage01.cac.cornell.edu (cac#home) where all CAC user home directories are found
    • XSEDE sites:
      • Stampede (xsede#stampede)
      • Lonestar (xsede#lonestar4)
      • TACC Archival Storage (xsede#ranch)

First step - Enable (or create) a CAC project for Archival Storage and add users where appropriate

  • To use the CAC Archival Storage service, you must be a user of a CAC project where Archival Storage is enabled.
  • The project PI can add users and verify that Archival Storage is enabled at the Manage CAC project page.
  • Don't have a project? See How to start a CAC project.

Second step - Create your Globus Online account

Sign up for a Globus account. CAC's Archival system is only accessible through Globus Online.

CAC specifics

Technical Information

CAC's endpoint is cac#archive01.

  • When you activate the cac#archive01 endpoint in the Globus Online web GUI, a dialog box appears with the following message:

The administrator of this endpoint, cac#archive01, requires that you authenticate using their MyProxy OAuth server to activate the endpoint. When you click 'Continue' you will be redirected to their website.

  • You will be redirected to the https://archive01.cac.cornell.edu/oath/authorize... page.
  • Enter your CAC credentials.
  • When the login succeeds, you will be redirected back to the Globus Online web GUI with the endpoint activated.

Administrative Information

  • cac#archive01's default path is /export.
  • Each project with access to CAC Archival Storage has a shared directory (named after the project) in which all project members have full read/write access.
  • Users can rename and move files and directories within their project directory on the endpoint. Globus Online added this feature recently.
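The directory layout described above can be sketched as a small path helper. This is only an illustration: it assumes that each project directory sits directly under the default path /export, which this page does not state explicitly, and the function name and the project name "myproject" are hypothetical.

```python
import posixpath

ENDPOINT = "cac#archive01"   # endpoint name given on this page
DEFAULT_PATH = "/export"     # default path given on this page


def project_archive_path(project, relative=""):
    """Return (endpoint, path) for a location inside a project directory.

    Assumption: the project directory is DEFAULT_PATH/<project>.
    """
    if not project or "/" in project:
        raise ValueError("project must be a single directory name")
    parts = [DEFAULT_PATH, project]
    rel = relative.strip("/")   # treat the argument as relative to the project
    if rel:
        parts.append(rel)
    return ENDPOINT, posixpath.join(*parts)


if __name__ == "__main__":
    endpoint, path = project_archive_path("myproject", "backups/2015")
    print(endpoint, path)
```

Verify the actual location of your project directory in the Globus Online web GUI before relying on a computed path.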

Automated Archival

  • Install Globus Connect Personal on the Linux/MacOS/Windows host you wish to archive by clicking on the "Get Globus Connect Personal" link on the Transfer Files screen on Globus.
[Image: Install Globus Connect Personal.jpg]
  • Add the line "-allow-root", to the args list in globusconnectpersonal-2.0.3/gc.py (on line ~ 360):
    args = [os.path.basename(PDEATH_LAUNCH),
            GRIDFTP_SERVER,
            "-allow-root",
            "-i", "-always-send-markers",
            "-hostname", "127.0.0.1",
  • Copy the root-bin directory from archive_scripts.tar.gz to /root/bin. If you are archiving directories outside /home, modify the -restrict-path argument in /root/bin/gc_start.sh.
  • Generate an SSH key pair using the "ssh-keygen" command, leave the private key in ~/.ssh, and upload the public key to Globus.
[Image: Upload ssh private key.jpg]
  • Make sure you can access the Globus CLI like this:
ssh -i .ssh/<private key> <globus user name>@cli.globusonline.org
  • Modify archive.sh to match your Globus user name, private key file name, CAC project and archive directory.
  • On Globus, make sure your connection to the cac#archive01 endpoint is activated.
  • You should now be able to run archive.sh to upload your archive directory to CAC Archival Storage. You can automate this script using cron.
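The Globus CLI access check in the steps above can also be scripted. The sketch below only builds the ssh command line. The user name "jdoe" and the key file ".ssh/id_rsa" are placeholders, and the function name is hypothetical. The optional BatchMode=yes setting is a standard OpenSSH option, not something this page prescribes. It makes ssh fail instead of prompting for a passphrase, which can be useful when the script runs unattended under cron.

```python
import shlex

GLOBUS_CLI_HOST = "cli.globusonline.org"   # host named in the steps above


def globus_cli_command(user, key_path, batch=True):
    """Build the ssh argument list for reaching the Globus CLI."""
    cmd = ["ssh", "-i", key_path]
    if batch:
        # Never prompt for input; fail instead (helpful for cron jobs).
        cmd += ["-o", "BatchMode=yes"]
    cmd.append(f"{user}@{GLOBUS_CLI_HOST}")
    return cmd


if __name__ == "__main__":
    # Placeholder user name and key file; substitute your own values.
    print(shlex.join(globus_cli_command("jdoe", ".ssh/id_rsa")))
```

The returned list can be passed to subprocess.run once the real user name and key file are filled in.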

Globus Online links

Syncing to Archival Storage

See here for how to sync to Archival Storage.