METALNX MSI

Installation Guide

Copyright © 2015-2017 Dell EMC.

This software is provided under the Software license provided in the LICENSE file.

The information in this file is provided “as is.” Dell EMC makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.


TABLE OF CONTENTS

  1. Introduction
  2. Overview
  3. Resolving Metalnx MSI dependencies
  4. Installing Metalnx Microservices
  5. Using Metalnx Microservices

Introduction

Metalnx MSI is a set of microservices designed to work alongside iRODS (the Integrated Rule-Oriented Data System). It provides automatic metadata extraction functionality for several file types.

This installation guide will provide information on how to install the components necessary to run Metalnx MSI along with your current Metalnx installation.

Assumptions

This guide shows how to install the Metalnx microservices package, which provides automated metadata extraction for .jpeg, .bam, .cram, .vcf, and some Illumina manifest files.

Metalnx MSI has been tested on the following Linux distributions as indicated:

  • CentOS 7 – all functional testing performed.
  • Ubuntu 14 – all functional testing performed.

Metalnx MSI works with iRODS 4.1 or later. It has been tested most extensively with iRODS 4.1.8.

Back to Table of Contents

Metalnx MSI overview

The Metalnx MSI package provides a small set of iRODS microservice implementations for automatic metadata extraction. These microservices handle the following file types:

  • Image files (JPG, JPEG, PNG)
  • SAM files (SAM, BAM, CRAM)
  • Illumina projects (entire Illumina project compressed file)
  • Variant call files (VCF)

Back to Table of Contents

Resolving Metalnx MSI dependencies

Metalnx MSI depends on the software components described in the subsections below.

libxml2 and libexif

The libxml2 package allows the microservices to read XML files, such as some of the manifest files used in genetic research. These files usually contain important metadata that can be assigned to project files on an iRODS grid.

Image files also contain metadata that can be extracted and assigned to the original file residing inside an iRODS grid. Image metadata is standardized by the EXIF format, and the libexif library provides a way of extracting it from files.

Both libraries are available in the official repositories for CentOS 7 and Ubuntu 14. On CentOS 7 they can be installed with the following command:

$ sudo yum install -y libexif-devel libxml2-devel

On Ubuntu the same can be achieved with:

$ sudo apt-get -y install libexif-dev libxml2-dev
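The two commands above differ only in the package manager and the package names, so a provisioning script can choose between them by reading the ID field of /etc/os-release. The following is a minimal sketch under that assumption. The function name deps_install_cmd is hypothetical and not part of Metalnx MSI, and the function only prints the command instead of running it.

```shell
#!/bin/sh
# Print the dependency install command for the distribution described by an
# os-release file. deps_install_cmd is a hypothetical helper name, not part of
# Metalnx MSI. The file path defaults to /etc/os-release.
deps_install_cmd() {
    os_release="${1:-/etc/os-release}"
    # Extract the ID field (e.g. ID="centos" or ID=ubuntu) and strip quotes.
    distro_id=$(sed -n 's/^ID=//p' "$os_release" | tr -d '"')
    case "$distro_id" in
        centos)
            echo "sudo yum install -y libexif-devel libxml2-devel" ;;
        ubuntu)
            echo "sudo apt-get -y install libexif-dev libxml2-dev" ;;
        *)
            # Only the two distributions tested in this guide are handled.
            echo "unsupported distribution: $distro_id" >&2
            return 1 ;;
    esac
}
```

Calling deps_install_cmd with no argument inspects the local machine. Passing a file path makes it possible to try the function against a copy of another machine's os-release file.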

irods-devel package

The irods-devel package can be downloaded from the iRODS download page. On CentOS 7 it can be installed with the following command:

$ sudo yum install irods-dev-4.1.8-centos7-x86_64.rpm

Note: The irods-dev-4.1.8-centos7-x86_64.rpm package used in the last command is a local file. The yum command accepts local RPM files as input, resolves their dependencies, and fetches those dependencies from the configured repositories automatically.

The same operation can be executed on Ubuntu 14 with the following commands:

$ sudo dpkg -i irods-dev-4.1.8-ubuntu14-x86_64.deb
$ sudo apt-get -f install

Installing and configuring htslib and samtools

The samtools and htslib libraries allow the Metalnx microservices to read information from SAM, BAM and CRAM files. Binaries for these libraries are not available in the official CentOS 7 or Ubuntu 14 repositories, so they must be compiled, configured and installed manually.

Note: A C++ compiler must be present on the machine you are working on. If you do not have one, install it with the command for your distribution:

$ sudo yum install -y gcc-c++    # CentOS
$ sudo apt-get install -y g++    # Ubuntu

First, download the samtools source archive from the official release page:

$ wget https://github.com/samtools/samtools/releases/download/1.3.1/samtools-1.3.1.tar.bz2

Extract the archive:

$ tar -xvf samtools-1.3.1.tar.bz2

The samtools-1.3.1 folder also contains the htslib-1.3.1 source code, so you do not need to download htslib separately.

Change into the htslib-1.3.1 directory, then build and install the library:

$ cd samtools-1.3.1/htslib-1.3.1
$ ./configure
$ make
$ sudo make install

The default htslib installation copies the shared object libhts.so.1.3.1 to the /usr/local/lib directory, which is not a default search location for the dynamic linker. The linker must be configured to look for the library there. Because the configuration file is owned by root, the path is written through sudo tee rather than a plain shell redirect:

$ echo "/usr/local/lib" | sudo tee /etc/ld.so.conf.d/htslib.conf
$ sudo ldconfig

To confirm that the linker cache now lists the library:

$ sudo ldconfig -p | grep hts
        libhts.so.1 (libc6,x86-64) => /usr/local/lib/libhts.so.1
        libhts.so (libc6,x86-64) => /usr/local/lib/libhts.so
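The same check can be scripted so that it fails loudly when the library is absent. The sketch below assumes the ldconfig -p output format shown above, where the library name is the first field of each line. The function name lib_in_cache is hypothetical.

```shell
#!/bin/sh
# Succeed if the given library name is the first field of some line in
# `ldconfig -p` style output read from standard input.
# lib_in_cache is a hypothetical helper name, not part of Metalnx MSI.
lib_in_cache() {
    awk -v lib="$1" '$1 == lib { found = 1 } END { exit !found }'
}

# Example use (not run here):
#   ldconfig -p | lib_in_cache libhts.so.1 \
#       || echo "libhts.so.1 is not in the linker cache" >&2
```

Matching the whole first field avoids false positives from libraries whose names merely contain the string hts.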

The next step is to build and install samtools itself. Return to the parent samtools-1.3.1 directory and run:

$ cd ..
$ ./configure --without-curses
$ make
$ sudo make install
$ sudo cp *.h /usr/local/include/

Back to Table of Contents

Installing Metalnx Microservices

Now that all the dependencies are satisfied, we need to install the metalnx-msi-plugins package. This section explains how to install it on RPM and DEB-based platforms.

Back to Table of Contents

Install procedure on RPM-based systems

In order to get the metalnx-msi-plugins package installed on CentOS 7, execute:

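$ sudo rpm -ivh --nodeps metalnx-msi-plugins-1.0-centos7.rpm

The --nodeps option is needed because libhts.so was compiled and installed manually, so rpm has no record of it and would report it as an unmet dependency. The library is only used at runtime, where the linker configuration set up earlier makes it available, so the rpm installation does not need to search the linker directories.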

Back to Table of Contents

Install procedure on DEB-based systems

In order to get the metalnx-msi-plugins package installed on Ubuntu 14, execute:

$ sudo dpkg -i metalnx-msi-plugins-1.0.deb

Back to Table of Contents

Verifying installed files

Installing the metalnx-msi-plugins package copies the microservice shared objects to the default iRODS microservices directory. To confirm that the files are in place, change to the /var/lib/irods/plugins/microservices directory and make sure it contains the following files:

  • libmsiget_illumina_meta.so
  • libmsiobjjpeg_extract.so
  • libmsiobjput_mdbam.so
  • libmsiobjput_mdmanifest.so
  • libmsiobjput_mdvcf.so
  • libmsiobjput_populate.so
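The file check above can be automated, which is convenient when several servers have to be verified. The following is a minimal sketch that uses the file list and directory given in this guide. The function name missing_msi_plugins is hypothetical.

```shell
#!/bin/sh
# Report which of the expected Metalnx microservice libraries are missing from
# a plugin directory. Returns 0 if all are present and 1 otherwise.
# missing_msi_plugins is a hypothetical helper name; the file list and the
# default directory are the ones given in this guide.
missing_msi_plugins() {
    plugin_dir="${1:-/var/lib/irods/plugins/microservices}"
    missing=0
    for lib in libmsiget_illumina_meta.so libmsiobjjpeg_extract.so \
               libmsiobjput_mdbam.so libmsiobjput_mdmanifest.so \
               libmsiobjput_mdvcf.so libmsiobjput_populate.so; do
        if [ ! -f "$plugin_dir/$lib" ]; then
            echo "missing: $lib"
            missing=1
        fi
    done
    return "$missing"
}
```

Running missing_msi_plugins with no argument checks the default directory. It prints one line per absent file and prints nothing when the installation is complete.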

Back to Table of Contents

Using Metalnx Microservices

The microservices are triggered by file type during uploads made through the Metalnx UI. For example, if you upload a file called photo.jpg, Metalnx will try to execute the jpeg_extract microservice to get metadata from the file.

All operations executed by the microservices can be traced in the iRODS logs. These logs are located in the /var/lib/irods/iRODS/server/log directory and are named rodsLog.YYYY.MM.DD.
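Because the date in each log name is written as YYYY.MM.DD, sorting the names alphabetically also sorts them chronologically. The sketch below uses that property to find the most recent log. The function name latest_irods_log is hypothetical, and the glob is kept loose on purpose so that it does not depend on the exact file name prefix.

```shell
#!/bin/sh
# Print the path of the newest iRODS server log in a directory. The date is
# embedded in the name as YYYY.MM.DD, so lexical order is chronological order.
# latest_irods_log is a hypothetical helper name, not part of Metalnx MSI.
latest_irods_log() {
    log_dir="${1:-/var/lib/irods/iRODS/server/log}"
    # Prints nothing if the directory holds no matching log files.
    ls "$log_dir"/rodsLog* 2>/dev/null | sort | tail -n 1
}
```

A typical use is to search the newest log after an upload, for example with grep -i msi "$(latest_irods_log)". The search string msi is an assumption about what the microservices write to the log, so adjust it to the messages you actually see.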

If your infrastructure contains multiple iRODS servers (iCAT and resource servers), the metalnx-msi-plugins package must be installed on every one of them. The microservices are executed locally on each machine, so if a user uploads a file to a resource server that does not have the Metalnx MSI package installed, the metadata extraction will fail.

Back to Table of Contents