Archival Storage

From CAC Documentation wiki

What is CAC Archival Storage?

  • CAC Archival Storage is a low-cost, high-performance option for storing research data available only to users within Cornell University.
  • CAC Archival Storage cannot be mounted by running jobs. Instead, users must transfer their data from CAC Archival Storage to an accessible server using Globus Online.
  • Globus Online users can easily add, delete, and share their data using any Globus Online endpoint.
  • Some of the Globus Online endpoints available include:
    • (cac#home) where all CAC user home directories are found
    • XSEDE sites:
      • Stampede (xsede#stampede)
      • Lonestar (xsede#lonestar4)
      • TACC Archival Storage (xsede#ranch).
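The endpoint names listed above follow an "owner#name" pattern. As a minimal sketch (the helper name parse_endpoint is hypothetical and not part of any Globus tool), a script that handles several endpoints could split the names like this:

```python
from typing import NamedTuple


class EndpointName(NamedTuple):
    owner: str  # part before '#', e.g. "cac" or "xsede"
    name: str   # part after '#', e.g. "home" or "stampede"


def parse_endpoint(legacy_name: str) -> EndpointName:
    """Split an 'owner#name' endpoint string into its two parts.

    Hypothetical helper for illustration only; it does not contact Globus.
    """
    owner, sep, name = legacy_name.strip().partition("#")
    if not sep or not owner or not name:
        raise ValueError(f"not an 'owner#name' endpoint: {legacy_name!r}")
    return EndpointName(owner, name)


if __name__ == "__main__":
    for ep in ["cac#home", "xsede#stampede", "xsede#lonestar4", "xsede#ranch"]:
        parsed = parse_endpoint(ep)
        print(f"{ep}: owner={parsed.owner}, name={parsed.name}")
```

Rejecting strings without a "#" early makes a mistyped endpoint name fail in the script rather than later in a transfer request.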

First step - Enable (or create) a CAC project for Archival Storage and add users where appropriate

  • To use the CAC Archival Storage service, you must be a user of a CAC project where Archival Storage is enabled.
  • The project PI can add users and verify that Archival Storage is enabled at the Manage CAC project page.
  • Don't have a project? See How to start a CAC project.

Second step - Create your Globus Online account

Sign up for a Globus account. CAC's Archival system is only accessible through Globus Online.

Globus Online links

CAC specifics

Technical Information

CAC's endpoint is cac#archive01.

  • When you activate the cac#archive01 endpoint in the Globus Online web GUI, a dialog box will appear saying:

The administrator of this endpoint, cac#archive01, requires that you authenticate using their MyProxy OAuth server to activate the endpoint. When you click 'Continue' you will be redirected to their website.

  • You will be redirected to the MyProxy OAuth login page.
  • Enter your CAC credentials.
  • When the login succeeds, you will be redirected back to the Globus Online web GUI with the endpoint activated.

Administrative Information

  • cac#archive01's default path is /export.
  • Each project with access to CAC Archival Storage has a shared directory (named after the project) in which all project members have full read/write access.
  • Users can rename and move files and directories within their project directory on the endpoint. Globus Online added this feature recently.
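When scripting transfers, it can help to build destination paths in one place. The sketch below assumes, without confirmation from this page, that each project directory sits directly under the default path /export; the function name project_path and the example project name are hypothetical.

```python
import posixpath

ARCHIVE_ROOT = "/export"  # default path of cac#archive01, as stated above


def project_path(project: str, *parts: str) -> str:
    """Return a path inside a project's shared directory on the endpoint.

    Assumption (not confirmed by the source): the project directory is a
    direct child of ARCHIVE_ROOT.
    """
    if not project or "/" in project or project in (".", ".."):
        raise ValueError(f"invalid project name: {project!r}")
    root = posixpath.join(ARCHIVE_ROOT, project)
    path = posixpath.normpath(posixpath.join(root, *parts))
    # Refuse paths that leave the project directory via ".." or an absolute part.
    if path != root and not path.startswith(root + "/"):
        raise ValueError(f"path escapes project directory: {path!r}")
    return path


if __name__ == "__main__":
    # "myproj" is a placeholder project name.
    print(project_path("myproj", "runs", "2015", "results.tar"))
```

Using posixpath rather than os.path keeps the separators correct even when the script runs on Windows, since the endpoint path is always POSIX-style.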

Advanced Topic - Automating transfers to the Archival Storage

  • Install Globus Connect Personal on the Linux/MacOS/Windows host you wish to archive by clicking on the "Get Globus Connect Personal" link on the Transfer Files screen on Globus.
  • Apply the following change to globusconnectpersonal-2.0.3/ (on line ~360):
 args = [os.path.basename(PDEATH_LAUNCH),
                "-i", "-always-send-markers",
                "-hostname", "",
  • Copy the root-bin directory from archive_scripts.tar.gz to /root/bin. If you are archiving directories outside /home, modify the -restrict-path argument in /root/bin/
  • Generate an SSH key pair using the "ssh-keygen" command, leave the private key in ~/.ssh, and upload the public key to Globus.
  • Make sure you can access the Globus CLI like this:
ssh -i .ssh/<private key> <globus user name>
  • Modify the archive script to match your Globus user name, private key file name, CAC project, and archive directory.
  • On Globus, make sure your connection to the cac#archive01 endpoint is activated.
  • You should now be able to run the script to upload your archive directory to the CAC archive. You can automate this script using cron.
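The cron automation mentioned in the last step can be sketched as follows. This is only an illustration: the script path /root/bin/archive.sh and the log path are hypothetical placeholders (this page does not give the script's file name), and the default of 02:00 daily is an arbitrary choice.

```python
import shlex


def cron_entry(script: str, log_file: str, hour: int = 2, minute: int = 0) -> str:
    """Build a crontab line that runs `script` once a day at hour:minute.

    Output and errors are appended to `log_file` so failed transfers can be
    inspected later. Both paths are shell-quoted.
    """
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError(f"invalid time: {hour}:{minute}")
    command = f"{shlex.quote(script)} >> {shlex.quote(log_file)} 2>&1"
    # Crontab fields: minute hour day-of-month month day-of-week command
    return f"{minute} {hour} * * * {command}"


if __name__ == "__main__":
    # Hypothetical paths; substitute the actual script from archive_scripts.tar.gz.
    print(cron_entry("/root/bin/archive.sh", "/var/log/cac_archive.log"))
```

The printed line can be added with "crontab -e" for the user that owns the SSH private key. Because the endpoint activation can expire, a scheduled run may fail until cac#archive01 is activated again.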

Advanced Topic - Syncing to Archival Storage

See here for how to sync data to Archival Storage
