Tutorial 3: CVMFS Software Access

Suchandra Thapa edited this page Jun 13, 2014 · 2 revisions

Introduction

This page introduces accessing remote software using SkeletonKey. After reading through this page, the user should be able to set up jobs that run software hosted on CVMFS servers.

Prerequisites

The following items are needed in order to complete this tutorial:

  1. A webserver where the user can place files to be accessed over the web
  2. HTCondor Cluster (optional)
  3. A working SkeletonKey install
  4. A squid proxy for Parrot to use
  5. Familiarity with basic usage of SkeletonKey (the first tutorial is sufficient)

Conventions

In the examples given in this tutorial, text in red denotes strings that should be replaced with user-specific values, e.g. the URL for the user's webserver. In addition, this tutorial assumes that files can be made available through the webserver by copying them to ~/public_html on the machine where SkeletonKey is installed.

CVMFS

CVMFS is a remote access protocol that allows a read-only filesystem to be exported using a webserver. Using FUSE or Parrot, this filesystem can be mounted on a system, where it appears to be a local filesystem and can be used to run applications installed on it. The following example shows how to mount a CVMFS repository and use it to run your applications.

Using CVMFS for software access with SkeletonKey

Creating the application tarball

Since we'll be running an application from a CVMFS repository, we'll create an application tarball to do some initial setup and then run the actual application.

  1. Create a directory for the script

    [user@hostname ~]$ mkdir /tmp/cvmfs_access

  2. Create a shell script, /tmp/cvmfs_access/myapp.sh with the following lines:

    #!/bin/bash
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/cvmfs/uc3.uchicago.edu/sw/lib
    /cvmfs/uc3.uchicago.edu/sw/bin/Rscript ./cvmfs_access/test.R
    echo "Finishing script at: "
    date

  3. Create an R script, /tmp/cvmfs_access/test.R, with the following lines:

    hilbert<-function(n) 1/(outer(seq(n),seq(n),"+")-1)
    print("hilbert n=500")
    print(system.time(eigen(hilbert(500))))
    print("hilbert n=1000")
    print(system.time(eigen(hilbert(1000))))
    print("sort n=6")
    print(system.time(sort(rnorm(10^6))))
    print("sort n=7")
    print(system.time(sort(rnorm(10^7))))
    # loess
    loess.me<-function(n) {
      print(paste("loess n=",as.character(n),sep=""))
      for (i in 1:5) {
        x<-rnorm(10^n); y<-rnorm(10^n); z<-rnorm(10^n)
        print(system.time(loess(z~x+y)))
      }
    }
    loess.me(3)
    loess.me(4)

  4. Next, make sure the myapp.sh script is executable and create a tarball:

    [user@hostname ~]$ chmod 755 /tmp/cvmfs_access/myapp.sh
    [user@hostname ~]$ cd /tmp
    [user@hostname ~]$ tar cvzf cvmfs_access.tar.gz cvmfs_access

  5. Then copy the tarball to your webserver

    [user@hostname ~]$ cd /tmp
    [user@hostname ~]$ cp cvmfs_access.tar.gz ~/public_html
    [user@hostname ~]$ chmod 644 ~/public_html/cvmfs_access.tar.gz

  6. Finally, download the CVMFS repository key at http://uc3-data.uchicago.edu/uc3.key and make it available on your webserver, for example by copying it to ~/public_html.
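Before publishing the tarball, it can be worth confirming that it actually contains both scripts under the cvmfs_access/ prefix that myapp.sh and the job wrapper expect. The following is a minimal sketch of such a check; the helper name tarball_has is hypothetical and is not part of SkeletonKey.

```shell
# Hypothetical helper (not part of SkeletonKey): succeed only if the gzipped
# tarball given as the first argument contains every member listed after it.
tarball_has() {
    tarball=$1
    shift
    # List the archive once; fail if tar cannot read it.
    listing=$(tar tzf "$tarball") || return 1
    for member in "$@"; do
        # -x matches the whole line, -F treats the member name literally.
        if ! printf '%s\n' "$listing" | grep -qxF "$member"; then
            echo "missing from $tarball: $member" >&2
            return 1
        fi
    done
    return 0
}

# Example call for the tarball built in step 4:
#   tarball_has /tmp/cvmfs_access.tar.gz cvmfs_access/myapp.sh cvmfs_access/test.R
```

The function only inspects the archive listing, so it does not check permissions or whether the copy on the webserver is readable.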

One thing to note here is that Parrot makes mounted CVMFS repositories available under /cvmfs/repository_name, where repository_name is replaced by the name that the repository is published under. In addition, if your application has its own libraries, you'll probably need to alter LD_LIBRARY_PATH to point to the location of these library files, as the myapp.sh file does in this example.
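As a rough illustration of the two points above, the sketch below builds the /cvmfs path for a repository name and appends a library directory to LD_LIBRARY_PATH. Both function names are hypothetical. The `${VAR:+...}` expansion is a small variation on the export line in myapp.sh: it avoids a leading ":" when LD_LIBRARY_PATH starts out empty, since an empty entry is treated by the dynamic linker as the current directory.

```shell
# Hypothetical helper: print the path under which Parrot exposes a repository.
cvmfs_repo_path() {
    printf '/cvmfs/%s\n' "$1"
}

# Hypothetical helper: append one directory to LD_LIBRARY_PATH.
# ${LD_LIBRARY_PATH:+...} expands to "<old value>:" only when the variable is
# set and non-empty, so no empty entry is created.
append_library_path() {
    LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$1"
    export LD_LIBRARY_PATH
}

# Repository name and sw/lib layout are the ones used in this tutorial.
repo=$(cvmfs_repo_path uc3.uchicago.edu)
append_library_path "$repo/sw/lib"
echo "LD_LIBRARY_PATH is now: $LD_LIBRARY_PATH"
```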

Creating a job wrapper

You'll need to do the following on the machine where you installed SkeletonKey:

  1. Open a file called cvmfs_access.ini and add the following lines:

    [CVMFS]
    repo1 = uc3.uchicago.edu
    repo1_options = url=http://uc3-cvmfs.uchicago.edu/opt/uc3/,pubkey=http://repository_key_url,quota_limit=1000,proxies=squid-proxy:3128
    repo1_key = http://repository_key_url

    [Parrot]
    location = http://your.host/parrot.tar.gz

    [Application]
    location = http://your.host/cvmfs_access.tar.gz
    script = ./cvmfs_access/myapp.sh

    In cvmfs_access.ini, change the URL http://your.host/parrot.tar.gz to point to the Parrot tarball that you copied previously, and change http://your.host/cvmfs_access.tar.gz to the URL of the application tarball on your webserver. The squid-proxy:3128 proxy setting will also need to be changed to point to your squid proxy, and both occurrences of http://repository_key_url should be changed to the location of the CVMFS repository key that you uploaded.

  2. Run SkeletonKey on cvmfs_access.ini:

    [user@hostname ~]$ skeleton_key -c cvmfs_access.ini

  3. Run the job wrapper to verify that it's working correctly:

    [user@hostname ~]$ sh ./job_script.sh
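A common slip in step 1 is leaving one of the tutorial's placeholder strings in cvmfs_access.ini. The sketch below is one way to catch that before running skeleton_key; the helper name check_placeholders is hypothetical, and the placeholder list is an assumption taken from the example config above (adjust it if your real hostnames happen to contain one of these strings).

```shell
# Hypothetical helper (not part of SkeletonKey): fail if the config file still
# contains any of the placeholder strings used in this tutorial's example.
check_placeholders() {
    config=$1
    rc=0
    for placeholder in your.host repository_key_url squid-proxy; do
        # -F: match the placeholder as a literal string, not a regex.
        if grep -qF "$placeholder" "$config"; then
            echo "$config still contains placeholder: $placeholder" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Example call, chained with the command from step 2:
#   check_placeholders cvmfs_access.ini && skeleton_key -c cvmfs_access.ini
```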

Using the job wrapper

Standalone

Once the job wrapper has been verified to work, it can be copied to another system and run:

    [user@hostname ~]$ scp job_script.sh another_host:~/
    [user@hostname ~]$ ssh another_host
    [user@another_host ~]$ sh ./job_script.sh

Submitting to HTCondor (Optional)

The following part of the tutorial is optional and covers using a generated job wrapper in an HTCondor submit file.

  1. On your HTCondor submit node, create a file called sk.submit with the following contents:

    universe = vanilla
    notification = never
    executable = ./job_script.sh
    output = /tmp/sk/test_$(Cluster).$(Process).out
    error = /tmp/sk/test_$(Cluster).$(Process).err
    log = /tmp/sk/test.log
    ShouldTransferFiles = YES
    when_to_transfer_output = ON_EXIT
    queue 1

  2. Next, create /tmp/sk for the HTCondor log and output files:

    [user@condor-submit-node ~]$ mkdir /tmp/sk

  3. Then copy the job wrapper to the HTCondor submit node:

    [user@hostname ~]$ scp job_script.sh condor-submit-node:~/

  4. Finally, submit the job to HTCondor and verify that it ran successfully:

    [user@hostname ~]$ ssh condor-submit-node
    [user@condor-submit-node ~]$ condor_submit sk.submit
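One possible way to verify the result once the job has left the queue is to look for the final line that myapp.sh prints in each job's standard output file. The sketch below assumes the output naming from sk.submit (/tmp/sk/test_*.out) and the "Finishing script at:" marker from myapp.sh; the helper name check_sk_outputs is hypothetical.

```shell
# Hypothetical helper: scan the job stdout files written by sk.submit and
# report whether each one contains the final marker echoed by myapp.sh.
check_sk_outputs() {
    outdir=${1:-/tmp/sk}
    rc=0
    found=0
    for out in "$outdir"/test_*.out; do
        # If the glob matched nothing it stays literal; skip it.
        [ -e "$out" ] || continue
        found=1
        if grep -q "Finishing script at:" "$out"; then
            echo "ok: $out"
        else
            echo "incomplete: $out" >&2
            rc=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no output files found in $outdir" >&2
        return 1
    fi
    return "$rc"
}

# Example call on the submit node after the job has finished:
#   check_sk_outputs /tmp/sk
```

A missing marker only indicates that the script did not reach its last lines; the matching .err file and /tmp/sk/test.log are the places to look for the reason.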