Update README.md
scharch authored May 29, 2024
1 parent e5ad27a commit ec86fa8
Showing 1 changed file with 39 additions and 18 deletions.
57 changes: 39 additions & 18 deletions README.md
@@ -1,30 +1,51 @@
<p align="center">
<image src=https://github.com/scharch/aligator/assets/6708960/c6acd3d9-d082-4b0b-9f09-c99c7f8f651a>
</p>

# ALIGaToR - Annotator of Loci for IG and T-cell Receptors
A pipeline for annotating genomic contigs from the IG and TR loci. The pipeline includes:
- Extract: A parsing script that extracts gene, exon, and RSS names and coordinates from the reference annotations of a closely related species of your choice.
- Predict: A prediction script that calls the submodule DnaGrep to predict RSS sequences in genomic contigs.
- Annotate: An annotator script that uses the extracted reference genome and genomic information to generate a search database for BLAST. BLAST hits are matched with predicted RSSs. Other scripts are called to check for start and stop codons and for splice sites.

## Getting Started
Clone the aligator repository
git clone https://github.com/scharch/aligator.git

## Dependencies/Prerequisites
- Python
- Beautifulsoup 4.12.3
- Python 3.6 or greater
- Muscle
- Blast+
- pyBedTools

## Usage
aligator --help
### Example
#Download BK063715 fasta file from IMGT.org
#extract IGH annotations from IMGT's rheMac10
aligator extract https://imgt.org/ligmdb/view.action?id=BK063715 BK063715
- BedTools
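
Before running the pipeline, it can help to confirm that the external tools listed above are on your `PATH`. The following is a minimal sketch, not part of ALIGaToR itself: `check_tools` is a hypothetical helper, and the executable names you pass to it (for example `muscle`, `blastn`, `bedtools`) are assumptions that may differ on your system.

```shell
# Hypothetical helper (not shipped with ALIGaToR): report which of the
# given executables can be found on PATH.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" > /dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Return 1 if at least one tool was not found, 0 otherwise.
    return "$missing"
}
```

For example, `check_tools python3 muscle blastn bedtools` prints one line per tool and returns a non-zero status if any of them is missing.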

## Getting Started
Clone the aligator repository:

git clone https://github.com/scharch/aligator.git

Install the required Python packages:

pip install -r aligator/requirements.txt

Set the environment variable:

export ALIGATOR_PATH=$(pwd)/aligator

Quick help:

`aligator help`


## Vignette: annotating MF989451 from Ramesh et al., Frontiers in Immunology 2017
Data is in `aligator/sample_data`.

First, get the reference genome from IMGT:

#Download BK063715 fasta file from https://imgt.org/ligmdb/view.action?format=FASTA&id=BK063715
#Then create bedfile with reference annotations
aligator extract https://imgt.org/ligmdb/view.action?id=BK063715 BK063715

Find possible RSS motifs in the target contig. For MF989451, the output should look the same as `sample_data/MF989451.rss12_pred.bed` and `sample_data/MF989451.rss23_pred.bed`:

#predict RSS for MF989451 and compare to sample data
aligator predict /sample_data /sample_data/MF989451.fa MF989451
aligator predict $ALIGATOR_PATH/sample_data/MF989451.fa MF989451
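
One way to check the predictions against the sample data is an order-insensitive comparison of the BED files. This is a sketch under stated assumptions: `compare_bed` is a hypothetical helper, and the output file names used in the commented example (`MF989451.RSS12.bed`, `MF989451.RSS23.bed`) are inferred from the inputs to the `annotate` command and may not match what `predict` actually writes.

```shell
# Hypothetical helper: compare two BED files while ignoring line order.
# Prints "match" if they contain the same records, "differ" otherwise.
compare_bed() {
    tmp=$(mktemp)
    # Sort the reference file by chromosome, then numerically by start.
    sort -k1,1 -k2,2n "$2" > "$tmp"
    # Sort the predicted file the same way and compare the two.
    if sort -k1,1 -k2,2n "$1" | diff - "$tmp" > /dev/null; then
        echo "match"
    else
        echo "differ"
    fi
    rm -f "$tmp"
}

# Example (file names are assumptions, see above):
# compare_bed MF989451.RSS12.bed "$ALIGATOR_PATH/sample_data/MF989451.rss12_pred.bed"
# compare_bed MF989451.RSS23.bed "$ALIGATOR_PATH/sample_data/MF989451.rss23_pred.bed"
```

Sorting first means the comparison only reports real differences in the records, not differences in the order in which they were written.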

Finally, annotate the target contig. For MF989451, the actual annotations provided by Ramesh et al are included as `sample_data/MF989451.ground_truth.bed`:

#annotate MF989451 and compare to sample data
aligator annotate /sample_data/MF989451.fa /sample_data/MF989451.rss12_pred.bed MF989451.rss23_pred.bed IGH BK063715.fasta BK063715.bed --alleledb coding.fa --outgff annotations.gff --outfasta IgGenes.fa --blast blastn
aligator annotate $ALIGATOR_PATH/sample_data/MF989451.fa MF989451.RSS12.bed MF989451.RSS23.bed IGH BK063715.fasta BK063715.bed
