GitHub - edwardbirdlab/BALROG-MON: A Nextflow pipeline for Antimicrobial Resistance exploration in metagenomic samples

BALROG-MON

Bacterial Antimicrobial Resistance annOtation of Genomes - Metagenomic Oxford Nanopore

About BALROG-MON

BALROG-MON (Bacterial Antimicrobial Resistance annOtation of Genomes - Metagenomic Oxford Nanopore) is a comprehensive, high-throughput Nextflow pipeline built to use Q20+ Oxford Nanopore long reads to investigate bacterial antimicrobial resistance (AMR) and its mobility in metagenomic samples. While AMR characterization is the main goal of BALROG-MON, it also provides subworkflows for many related analyses that can be customized to users' needs, such as assembly-free annotation, pathogen detection, and metagenomic community analysis of bacteria, viruses, and other microorganisms in samples.

Note

Updates to BALROG-MON may occur periodically to continually improve the pipeline. If you have any requests or recommended changes you'd like to see (e.g. usage with other data types), please reach out via email ([email protected] | [email protected]) or request a feature.

If you experience any trouble or find bugs when running BALROG-MON, please report issues or bugs and they will be addressed as soon as possible.

Not the BALROG pipeline you're looking for?

BALROG-MSR: Bacterial Antimicrobial Resistance annOtation of Genomes - Metagenomic Short Read
BALROG-ISO: Bacterial Antimicrobial Resistance annOtation of Genomes - ISOlate whole genomes

Workflow Overview

*See sections below for details on subworkflows

Getting Started

Before you get too far along, read this section to make sure this is the right pipeline for you and that your equipment and samples meet the requirements. (Don't worry, there isn't too much to do.)

1. What Data Do I Need?

BALROG-MON in its current form expects Q20+ Oxford Nanopore long-read metagenomic sequencing data. It can either run in "Assembly-Free" mode or assemble a metagenome using metaFlye, allowing analysis of both low- and high-coverage metagenomes. In its standard configuration, BALROG-MON requires 100GB of RAM.

Note

If you would like to run BALROG-MON with older, non-Q20+ Nanopore data, feel free to request a feature.

2. Dependencies

All dependencies are managed via Docker containers hosted on DockerHub. The following software is required: Nextflow, plus one container runtime (Docker, Singularity, or Apptainer):

Nextflow (>= 23.04.0.5857) - Install Nextflow
Docker/Singularity/Apptainer - Install Docker - Install Singularity - Install Apptainer

3. Installation

Preferred Method - Download Release

wget https://github.com/edwardbirdlab/BALROG-MON/releases/download/v0.0.0/BALROG-0.0.0.tar.gz
tar -xzf BALROG-0.0.0.tar.gz

Method 2 - Clone Repo

git clone https://github.com/edwardbirdlab/BALROG-MON

4. Creating a Sample Sheet

BALROG-MON takes a CSV (Comma-Separated Values) sheet as its input. Note that the "sample" column will be the prefix of all output files for that sample.

Example Format:

sample,path,reference_genome
Sample_Name_1,/absolute/path/to/sample1.fastq.gz,/absolute/path/to/reference_genome_1.fna
Sample_Name_2,/absolute/path/to/sample2.fastq.gz,/absolute/path/to/reference_genome_1.fna
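For a directory holding many read files, the sheet can be generated rather than typed by hand. The following is a minimal sketch, not part of BALROG-MON: the function name make_sample_sheet is hypothetical, and it assumes every sample is a single *.fastq.gz file, that the file name (minus the extension) is an acceptable sample name, and that all samples share one reference genome.

```shell
# Hypothetical helper (not shipped with BALROG-MON): print a sample sheet
# for every *.fastq.gz file in a directory.
# Usage: make_sample_sheet READS_DIR /absolute/path/to/reference.fna > samples.csv
make_sample_sheet() {
    local reads_dir ref fq sample
    reads_dir=$(cd "$1" && pwd) || return 1   # resolve to an absolute path
    ref="$2"                                  # pass this as an absolute path
    printf 'sample,path,reference_genome\n'
    for fq in "$reads_dir"/*.fastq.gz; do
        [ -e "$fq" ] || continue              # the glob matched nothing
        sample=$(basename "$fq" .fastq.gz)    # file name becomes the output prefix
        printf '%s,%s,%s\n' "$sample" "$fq" "$ref"
    done
}
```

Sample names that contain commas would break the CSV, so rename such files before running the helper.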

5. Nextflow Configuration

When creating a Nextflow config, ensure a container runtime is enabled (Singularity/Apptainer/Docker). If you are using Slurm, you can use the included Beocat Slurm config as a template. Most nf-core configs are also supported. If you have never created a Nextflow config, or are having issues, reach out to your local system administrator.
Nextflow Configuration - nf-core configs
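As a starting point, a small config that enables one container runtime might look like the sketch below. The file name balrog_local.config is an assumption chosen for illustration, and the choice of Singularity with the Slurm executor is only an example; the included Beocat config remains the better template for a real cluster.

```shell
# Write a minimal Nextflow config (hypothetical file name) into the
# current directory. The quoted 'EOF' keeps the shell from expanding
# anything inside the heredoc.
cat > balrog_local.config <<'EOF'
// Enable exactly one container runtime.
// For Docker, replace this block with: docker.enabled = true
singularity {
    enabled    = true
    autoMounts = true
}

// Submit each task through Slurm; remove this line to run locally.
process.executor = 'slurm'
EOF
```

The resulting file would then be passed to the pipeline with -c, in the same position as /path/to/config.cfg in the run commands.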

6. Pipeline Configuration

If you want to change any parameters of BALROG-MON from its default options, they can be changed using the "nextflow.config" file. Configurable parameters will be outlined in the detailed sections below, as well as in the config file.
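If you would rather not edit the shipped "nextflow.config", Nextflow also accepts parameter overrides from an extra config file. The sketch below assumes BALROG-MON reads these values through the standard params scope; the file name balrog_params.config and the raised chopper_minlen value are illustrative only, while the other values repeat the documented defaults.

```shell
# Write parameter overrides to a separate config file (hypothetical name).
cat > balrog_params.config <<'EOF'
params {
    chopper_minlen         = 1000    // illustrative; the documented default is 500
    chopper_averagequality = 20
    plasmer_min_len        = 500
    plasmer_max_len        = 500000
}
EOF

# Nextflow accepts more than one -c option, so the overrides can sit
# alongside the runtime config:
#   nextflow run /path/to/edwardbirdlab/BALROG-MON -c /path/to/config.cfg -c balrog_params.config
```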

Running BALROG-MON

Running the whole pipeline

nextflow run /path/to/edwardbirdlab/BALROG-MON -c /path/to/config.cfg

Generate MultiQC report

nextflow run /path/to/edwardbirdlab/BALROG-MON -c /path/to/config.cfg --workflow-opt multiqc

Core Steps of Workflow

1. Preprocessing

Trimming & Raw QC

FastQC : Raw Read
Porechop
chopper
Parameters
- params.chopper_minlen (default = 500)
- params.chopper_averagequality (default = 20)
FastQC : Trimmed Read
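To make the two chopper thresholds concrete, the sketch below applies the same kind of filter with plain awk: a read is kept only if it is at least the minimum length and its mean base quality reaches the cutoff. This is an illustration, not how the pipeline runs chopper. The function name filter_fastq is hypothetical, it assumes four-line FASTQ records with Phred+33 qualities, and it uses a simple arithmetic mean of Phred scores, which may differ from chopper's own calculation.

```shell
# Hypothetical illustration of a length and mean-quality filter.
# Usage: gunzip -c reads.fastq.gz | filter_fastq 500 20 > filtered.fastq
filter_fastq() {
    awk -v minlen="$1" -v minq="$2" '
    BEGIN {
        # awk has no ord(), so build a character -> ASCII code table.
        for (c = 33; c <= 126; c++) ord[sprintf("%c", c)] = c
    }
    NR % 4 == 1 { header = $0 }
    NR % 4 == 2 { seq = $0 }
    NR % 4 == 0 {
        n = length($0); sum = 0
        for (i = 1; i <= n; i++) sum += ord[substr($0, i, 1)] - 33
        # Keep the record only if both thresholds are met.
        if (n > 0 && length(seq) >= minlen && sum / n >= minq)
            printf "%s\n%s\n+\n%s\n", header, seq, $0
    }'
}
```

With the defaults listed above, this corresponds to calling filter_fastq 500 20.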

Final Read QC

MultiQC

2. Read-Based Identification

Pathogen Detection (Core Step for "Assembly Free" Only)

Kraken 2 (standard database)

3. Sequence Processing

Assembly

"Assembly Free"
- Seqtk : Convert fastq to fasta
OR
"Assembled"
- metaFlye : Metagenomic assembly
- Kraken 2 (standard database) : Reassign sequence identities

Sequence Processing QC

QUAST

4. ARG & Mobility Annotation

Plasmer : Plasmid prediction
Parameters
- params.plasmer_min_len (default = 500)
- params.plasmer_max_len (default = 500000)
CARD

5. Binning

Optional Steps of Workflow

1. Preprocessing

Standardize Read Names

Included Python script (useful if you have long read names)

Remove Human DNA

minimap2 : Mapping to human genome
SAMtools : Extract non-human read names
Seqtk : Extract non-human reads

Remove Host DNA

minimap2 : Mapping to host genome
SAMtools : Extract non-host read names
Seqtk : Extract non-host reads
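The three-tool sequence used for human and host removal can be sketched as a single shell function. This is an approximation of the idea, not the pipeline's actual module: the function name remove_host_reads is hypothetical, and the minimap2 preset (map-ont) is an assumption that may differ from what BALROG-MON uses.

```shell
# Hypothetical sketch of host-read removal.
# Usage: remove_host_reads host.fna reads.fastq.gz reads.nonhost.fastq.gz
remove_host_reads() {
    local host_fasta="$1" reads="$2" out="$3" names
    names=$(mktemp) || return 1
    # Map reads to the host genome; -f 4 keeps only records flagged as
    # unmapped, and the first SAM column is the read name.
    minimap2 -ax map-ont "$host_fasta" "$reads" \
        | samtools view -f 4 - \
        | cut -f1 | sort -u > "$names"
    # Pull those reads back out of the original FASTQ by name.
    seqtk subseq "$reads" "$names" | gzip > "$out"
    rm -f "$names"
}
```

The same function covers the human-removal step when a human reference is supplied as the first argument.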

2. Read-Based Identification

Pathogen Detection (Optional for "Assembled" Only)

Kraken 2 (standard database)
Parameters
- report-minimizer-data
- minimum-hit-groups 3
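On the Kraken 2 command line these two parameters correspond to the --report-minimizer-data and --minimum-hit-groups options. A hedged sketch of an equivalent standalone call follows; the function name run_kraken2, the database path, and the output file naming are assumptions for illustration and are not taken from the pipeline.

```shell
# Hypothetical wrapper showing the two listed options in a Kraken 2 call.
# Usage: run_kraken2 /path/to/kraken2_db reads.fastq.gz Sample_Name_1
run_kraken2() {
    local db="$1" reads="$2" prefix="$3"
    # --report-minimizer-data adds minimizer counts to the report;
    # --minimum-hit-groups 3 requires three hit groups before a read
    # is classified.
    kraken2 --db "$db" \
        --report-minimizer-data \
        --minimum-hit-groups 3 \
        --report "${prefix}.kreport" \
        --output "${prefix}.kraken" \
        "$reads"
}
```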

Community Analysis

Note

BALROG-MON does not create a graphical summary of pathogen detection and community analysis results. However, results are readily compatible for visualization using Pavian.

Kraken 2 (standard database)
Bracken

4. ARG and Mobility Annotation

Multi AMR Annotation

Note

CARD is run by default; however, the pipeline can be switched to include additional ARG databases by setting params.cardonly = TRUE.

Citations

As there is currently no paper associated with BALROG-MON, please cite this GitHub page. Also, feel free to contact me ([email protected] | [email protected]) to let me know!

This pipeline and its options rely on many tools. See 'CITATION.md' for the full list of tools used.

License

Distributed for the USDA ARS under the Public Domain. See LICENSE for more information.

Contact

Edward Bird - [email protected] | [email protected]
