Pipeline module for haplotype/diplotype determination

lumc-pgx/haplotyping

Haplotyping workflow

Comparison of allelic variants and haplotype definitions

The haplotyping workflow module performs the following operations:

  • Identification of allele variants shared with reference haplotypes.
  • Prioritisation of haplotype matches.
  • Identification of the highest scoring haplotype.
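
The three operations above can be illustrated with a minimal sketch. It is not the module's actual implementation: the function names, the variant strings, the haplotype names and the use of Jaccard similarity as the score are all assumptions made for illustration only.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# One match: (haplotype name, score, variants shared with the observed allele)
Match = Tuple[str, float, FrozenSet[str]]


def rank_haplotypes(
    observed: FrozenSet[str],
    definitions: Dict[str, FrozenSet[str]],
) -> List[Match]:
    """Rank reference haplotypes against the variants observed in one allele.

    The score is the Jaccard similarity between the observed variants and the
    haplotype's defining variants. This scoring scheme is an assumption of
    this sketch, not necessarily what the pipeline uses.
    """
    ranked: List[Match] = []
    for name, defining in definitions.items():
        shared = observed & defining   # variants shared with the reference haplotype
        union = observed | defining
        # Two empty sets (no variants, reference-like haplotype) match perfectly.
        score = len(shared) / len(union) if union else 1.0
        ranked.append((name, score, shared))
    # Prioritise: highest score first, haplotype name as a stable tie-breaker.
    ranked.sort(key=lambda match: (-match[1], match[0]))
    return ranked


def best_haplotype(
    observed: FrozenSet[str],
    definitions: Dict[str, FrozenSet[str]],
) -> Optional[Match]:
    """Return the highest scoring match, or None if there are no definitions."""
    ranked = rank_haplotypes(observed, definitions)
    return ranked[0] if ranked else None


# Hypothetical haplotype definitions and observed variants.
EXAMPLE_DEFINITIONS = {
    "*1": frozenset(),
    "*2": frozenset({"g.100C>T", "g.250G>A"}),
    "*3": frozenset({"g.100C>T"}),
}
EXAMPLE_OBSERVED = frozenset({"g.100C>T", "g.250G>A"})
```

Calling `best_haplotype(EXAMPLE_OBSERVED, EXAMPLE_DEFINITIONS)` selects the hypothetical `*2` haplotype, because its defining variants coincide exactly with the observed ones.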

The pipeline outputs three files per barcode:

  • A {barcode}.matches.json file, which contains all of the match information for the barcode.
  • A {barcode}.matches.txt file, which contains a human-readable summary of the per-allele matches.
  • A {barcode}.haplotype.txt file, which contains the haplotype assignment for each allele.
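
As a small convenience, the file naming pattern above can be used to check that a run produced all three files for a barcode. The sketch below is not part of the pipeline: the helper names are hypothetical, and it assumes that the files are written directly into the output directory.

```python
from pathlib import Path
from typing import List

# File name suffixes taken from the list of per-barcode outputs.
OUTPUT_SUFFIXES = ("matches.json", "matches.txt", "haplotype.txt")


def expected_outputs(barcode: str, directory: str = ".") -> List[Path]:
    """Build the three expected output paths for one barcode.

    Assumes the files sit directly in `directory`; adjust this if the
    pipeline places them in a subdirectory.
    """
    return [Path(directory) / f"{barcode}.{suffix}" for suffix in OUTPUT_SUFFIXES]


def missing_outputs(barcode: str, directory: str = ".") -> List[Path]:
    """Return the expected output files that do not exist on disk."""
    return [path for path in expected_outputs(barcode, directory) if not path.is_file()]
```

An empty list from `missing_outputs` means that all three files for that barcode are present.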

Requirements

Installation

  • Clone the repository

    • git clone https://github.com/lumc-pgx/haplotyping.git
  • Change to the haplotyping directory

    • cd haplotyping
  • Create a conda environment for running the pipeline

    • conda env create -n haplotyping -f environment.yaml
  • To run the pipeline on the cluster, update your .profile so that the DRMAA library can be found:

    • echo "export DRMAA_LIBRARY_PATH=libdrmaa.so.1.0" >> ~/.profile
    • source ~/.profile

Configuration

Pipeline configuration settings can be altered by editing config.yaml.

Execution

  • Activate the conda environment
    • source activate haplotyping
  • For parallel execution on the cluster
    • pipe-runner
  • To specify that the pipeline should write output to a location other than the default:
    • pipe-runner --directory path/to/output/directory
