An ULtrafast, scaLable, Accessible, and Reproducible phylogenomic pipeline

ULLAR

ULLAR, named after ular, the Indonesian word for snake, stands for an ULtrafast, scaLable, Accessible, and Reproducible pipeline for phylogenomics. Our goal is to develop a lightweight, scalable pipeline with a minimal learning curve. In addition to Linux and macOS, the operating systems typically supported for bioinformatics, ULLAR will run natively on Windows whenever possible.

Development Status

ULLAR is currently under development, and we are still working on the pipeline's core components. Expect commands to change in future releases. If you use ULLAR in a publication, we recommend stating the exact version of the app. If you compiled it manually, we recommend also stating the commit hash. For example, ULLAR v0.3.0 (commit: f18ac98).
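As a minimal sketch of the citation format above, the helper below builds the string from a version and a commit hash. The function name `ullar_citation` is hypothetical and not part of ULLAR. The commented usage assumes you are inside a local clone of the repository, where `git rev-parse --short HEAD` prints the abbreviated commit hash.

```shell
# Hypothetical helper (not part of ULLAR): format a citation string
# from a version number and a commit hash.
ullar_citation() {
    printf 'ULLAR v%s (commit: %s)\n' "$1" "$2"
}

# Inside a local clone, the hash can come from git:
# ullar_citation "0.3.0" "$(git rev-parse --short HEAD)"
```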

Try ULLAR

You can try the pipeline by following the installation guide below. This guide assumes familiarity with command-line apps and basic bioinformatics tools.

Installation

Currently, installing ULLAR from source requires Rust. Follow the official Rust installation guide first. After installing Rust, you can install ULLAR using cargo:

cargo install --git https://github.com/hhandika/ullar.git

Another option is to install a pre-compiled ULLAR binary. You can download the latest release from the release page. Available binaries:

| OS | Download |
| --- | --- |
| Linux | Intel/AMD 64-bit or Many Linux Intel/AMD 64-bit |
| Windows | Intel/AMD 64-bit |
| macOS | Intel or M series |

Install ULLAR as you would any single-executable binary. For example, on Linux, extract the archive:

tar -xvf ullar-Linux-x86_64.tar.gz

Copy the binary to a directory in your PATH, such as /usr/local/bin:

sudo cp ullar /usr/local/bin

or, if you don't have root access, to a directory in your home directory that is in your PATH:

cp ullar ~/bin
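The non-root install can be sketched as a small shell function, assuming `~/bin` may not exist yet or may not be in your PATH. The function name `install_user_bin` and its default destination are illustrative assumptions, not part of ULLAR.

```shell
# Hypothetical helper (not part of ULLAR): copy a binary into a
# user-owned directory and report whether that directory is in PATH.
install_user_bin() {
    bin_path=$1                  # path to the extracted binary, e.g. ./ullar
    dest_dir=${2:-"$HOME/bin"}   # destination directory (assumed default)

    mkdir -p "$dest_dir" || return 1
    cp "$bin_path" "$dest_dir/" || return 1
    chmod +x "$dest_dir/$(basename "$bin_path")" || return 1

    # Check whether dest_dir is one of the colon-separated PATH entries.
    case ":$PATH:" in
        *":$dest_dir:"*) echo "$dest_dir is already in PATH" ;;
        *) echo "Add to your shell profile: export PATH=\"$dest_dir:\$PATH\"" ;;
    esac
}

# Usage: install_user_bin ./ullar
```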

The SEGUL documentation provides a detailed guide to installing Rust-based software.

Features & Dependencies

| Feature | Dependency |
| --- | --- |
| Raw read cleaning | Fastp |
| De novo assembly | SPAdes |
| Reference mapping | LASTZ |
| Sequence alignment | MAFFT |
| ML phylogeny (in development) | IQ-TREE |
| MSC phylogeny (in development) | ASTER |
| Data cleaning | SEGUL |
| Summary statistics (in development) | SEGUL |

NOTE: Summary statistics and other data cleaning features are under development, but you can install SEGUL separately in the meantime. See the SEGUL documentation for details.

You can check whether the dependencies are installed by running the following command:

ullar deps check

If any dependencies are missing, install them by following the instructions on the links provided above.
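If you want a quick manual check that does not rely on ULLAR, the sketch below tests whether given executables are in your PATH using `command -v`. The function name `missing_tools` is hypothetical, and the executable names in the usage comment are assumptions. They may differ from the names ULLAR looks for or from the names on your system.

```shell
# Hypothetical helper (not part of ULLAR): print each tool name that is
# not found in PATH and return non-zero if any tool is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Executable names below are assumptions; adjust them to your installation.
# missing_tools fastp spades.py lastz mafft iqtree2
```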

Check ULLAR installation:

ullar --version

Generate a config file

ullar new -d /raw_read_dir

To check the config file:

cat configs/clean_read.yaml

For more descriptive names, you can use the --sample-name descriptive argument:

ullar new /raw_read_dir --sample-name descriptive

Example of descriptive names:

- sample1_Species1_R1.fastq.gz
- sample1_Species1_R2.fastq.gz
- genus1_species1_locality_R1.fastq.gz
- genus1_species1_locality_R2.fastq.gz
- genus1_species2_locality_R1.fastq.gz
- genus1_species2_locality_R2.fastq.gz

If your file naming is simple, you can use the --sample-name simple argument:

ullar new /raw_read_dir --sample-name simple

Example of simple names:

- sample1_R1.fastq.gz
- sample1_R2.fastq.gz

You can also supply your own regular expression to extract the sample name:

ullar new /raw_read_dir --re-sample='([a-zA-Z0-9]+)_R1.fastq.gz'
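To preview what a pattern like this captures before passing it to ULLAR, you can emulate it in the shell. This is only a sketch: how ULLAR applies the expression internally is not described here, so the helper assumes a plain regex search that returns the first capture group. The function name `extract_sample` is hypothetical, and the dots are escaped so they match literal periods rather than any character.

```shell
# Hypothetical helper (not part of ULLAR): emulate the capture group of
# '([a-zA-Z0-9]+)_R1\.fastq\.gz' with grep and suffix stripping.
extract_sample() {
    match=$(printf '%s\n' "$1" | grep -oE '[a-zA-Z0-9]+_R1\.fastq\.gz' | head -n 1)
    if [ -n "$match" ]; then
        # Drop the fixed suffix; what remains is the captured sample name.
        printf '%s\n' "${match%_R1.fastq.gz}"
    fi
}

extract_sample "sample1_R1.fastq.gz"                   # prints: sample1
extract_sample "genus1_species1_locality_R1.fastq.gz"  # prints: locality
```

Because the character class excludes underscores, the second name yields only `locality`. A class such as `[a-zA-Z0-9_]+` would capture the full `genus1_species1_locality` prefix instead.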

Cleaning raw reads

ullar clean init -c configs/read_cleaning.yaml

To run the cleaning process:

ullar clean run -c configs/read_cleaning.yaml

ULLAR first checks the config file and verifies that the recorded hash values match the raw reads. For a fresh run, you can skip this check:

ullar clean run -c configs/read_cleaning.yaml --skip-config-check
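To illustrate what a hash check guards against, the sketch below records a file's digest and later confirms the file is unchanged. It assumes SHA-256 purely for illustration, since the algorithm ULLAR uses is not stated here. The function names `file_sha256` and `hash_matches` are hypothetical.

```shell
# Hypothetical helpers (not part of ULLAR). SHA-256 is an assumption.

# Print the SHA-256 digest of a file using whichever tool is available
# (sha256sum on most Linux systems, shasum on macOS).
file_sha256() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | cut -d ' ' -f 1
    else
        shasum -a 256 "$1" | cut -d ' ' -f 1
    fi
}

# Succeed only if the file's current digest equals the recorded one.
hash_matches() {
    [ "$(file_sha256 "$1")" = "$2" ]
}

# Usage:
# recorded=$(file_sha256 sample1_R1.fastq.gz)
# hash_matches sample1_R1.fastq.gz "$recorded" && echo "unchanged"
```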

De Novo Assembly

ULLAR uses SPAdes for de novo assembly. To run the assembly:

ullar assemble -c configs/denovo_assembly.yaml

Reference Mapping

ULLAR uses LASTZ for reference mapping. To run the reference mapping:

ullar map run -c configs/reference_mapping.yaml

Sequence Alignment

ULLAR uses MAFFT for sequence alignment. To run the sequence alignment:

ullar align run -c configs/sequence_alignment.yaml
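The steps above can be chained in a small wrapper script. The sketch below assumes the steps run in the order the sections appear and that each config file already exists. The commands and paths are copied from this README and may change in future releases. The function name `run_pipeline` and the `DRY_RUN` variable are hypothetical.

```shell
# Hypothetical wrapper (not part of ULLAR): run each pipeline step in
# order and stop at the first failure. With DRY_RUN=1, only print the steps.
run_pipeline() {
    while IFS= read -r step; do
        echo "+ $step"
        if [ "${DRY_RUN:-0}" != "1" ]; then
            # stdin is redirected so the step cannot consume the step list.
            $step </dev/null || { echo "Step failed: $step" >&2; return 1; }
        fi
    done <<'EOF'
ullar clean run -c configs/read_cleaning.yaml
ullar assemble -c configs/denovo_assembly.yaml
ullar map run -c configs/reference_mapping.yaml
ullar align run -c configs/sequence_alignment.yaml
EOF
}

# Preview the steps without running them:
# DRY_RUN=1 run_pipeline
```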
