forked from gsneha26/SegAlign

A Scalable GPU System for Pairwise Whole Genome Alignments


yatisht/SegAlign


License Build Status Published in SC20

A Scalable GPU System for Pairwise Whole Genome Alignments based on LASTZ's seed-filter-extend paradigm.

Overview

SegAlign has been tested on all AWS G3 and P3 GPU instances with the AMI Ubuntu Server 18.04 LTS (HVM), SSD Volume Type (ami-0fc20dd1da406780b, 64-bit x86). To get started, clone the repository and set PROJECT_DIR to the checkout:

    $ git clone https://github.com/gsneha26/SegAlign.git
    $ export PROJECT_DIR=$PWD/SegAlign

Dependencies

The following dependencies are required by SegAlign:

  • NVIDIA CUDA 10.2 toolkit
  • CMake 3.8
  • Intel TBB library
  • libboost-all-dev
  • parallel
  • zlib
  • LASTZ 1.04.03
  • faToTwoBit, twoBitToFa (from kentUtils)
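Before installing anything, it can help to see which of the command-line dependencies are already on the PATH. The following is a minimal sketch, assuming the tools are exposed under the executable names nvcc, cmake, parallel, lastz, faToTwoBit and twoBitToFa. The helper name missing_tools is hypothetical and not part of SegAlign. Libraries such as TBB, Boost and zlib are not executables, so this check does not cover them.

```shell
#!/bin/sh
# Hypothetical helper (not part of SegAlign): print each named tool
# that cannot be found on the PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed executable names for the command-line dependencies listed above.
missing_tools nvcc cmake parallel lastz faToTwoBit twoBitToFa
```

An empty result means every listed tool was found. Note that nvcc may be installed but not on the PATH, depending on how CUDA was set up.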

The provided script installs any dependencies that are not already present, which might take a while. The script requires sudo because most packages are installed at the system level. Pass the -c option to skip the CUDA installation.

    $ cd $PROJECT_DIR
    $ ./scripts/installUbuntu.sh
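As noted above, the -c option skips the CUDA installation. A minimal sketch of choosing that option automatically, assuming an existing CUDA toolkit can be detected by nvcc being on the PATH. The helper name install_flags is hypothetical, and the command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: emit "-c" (skip CUDA installation) when nvcc is found.
install_flags() {
    if command -v nvcc >/dev/null 2>&1; then
        echo "-c"
    fi
}

# Dry run: remove the leading "echo" to actually launch the installer.
echo ./scripts/installUbuntu.sh $(install_flags)
```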

How to run SegAlign

  • Run SegAlign
    $ run_segalign target query [options]
  • For a list of options
    $ run_segalign --help

Running a test

    $ cd $PROJECT_DIR
    $ mkdir test
    $ cd test
    $ wget https://hgdownload.soe.ucsc.edu/goldenPath/ce11/bigZips/ce11.2bit
    $ wget https://hgdownload-test.gi.ucsc.edu/goldenPath/cb4/bigZips/cb4.2bit 
    $ twoBitToFa ce11.2bit ce11.fa
    $ twoBitToFa cb4.2bit cb4.fa
    $ run_segalign ce11.fa cb4.fa --output=ce11.cb4.maf
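The test above names its output after the two inputs (ce11.fa and cb4.fa give ce11.cb4.maf). A sketch of aligning one target against several queries with the same naming convention follows. The helper maf_name and the query list are hypothetical, and only the --output option shown above is used. The commands are printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: derive "<target>.<query>.maf" from two FASTA paths.
maf_name() {
    t=$(basename "$1" .fa)
    q=$(basename "$2" .fa)
    echo "$t.$q.maf"
}

target=ce11.fa
# Illustrative query list; replace with real FASTA files.
for query in cb4.fa; do
    # Dry run: drop the leading "echo" to launch the alignment.
    echo run_segalign "$target" "$query" --output="$(maf_name "$target" "$query")"
done
```

Deriving the output name from the inputs keeps results from different query genomes from overwriting each other.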

How to run SegAlign repeat masker

  • Run SegAlign repeat masker
    $ run_segalign_repeat_masker sequence [options]
  • For a list of options
    $ run_segalign_repeat_masker --help

Running a test

    $ cd $PROJECT_DIR
    $ mkdir test_rm
    $ cd test_rm
    $ wget https://hgdownload.soe.ucsc.edu/goldenPath/ce11/bigZips/ce11.2bit
    $ twoBitToFa ce11.2bit ce11.fa
    $ run_segalign_repeat_masker ce11.fa --output=ce11.seg
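After the test finishes, a quick sanity check is to confirm that the output file exists and is non-empty. The sketch below assumes only that a successful run writes something to the file named by --output. It makes no assumption about the file's format, and the helper name check_output is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given file exists and is non-empty.
check_output() {
    if [ -s "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or empty: $1" >&2
        return 1
    fi
}

# Example: report on the repeat masker output from the test above.
check_output ce11.seg || echo "re-check the run_segalign_repeat_masker command"
```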

Citing SegAlign

S. Goenka, Y. Turakhia, B. Paten and M. Horowitz, "SegAlign: A Scalable GPU-Based Whole Genome Aligner," in 2020 SC20: International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Atlanta, GA, US, 2020 pp. 540-552. doi: 10.1109/SC41405.2020.00043
