Dorado Docker Image

This Dockerfile sets up an environment for running Dorado, a tool for basecalling Fast5/Pod5 files from Oxford Nanopore sequencing.

Introduction

This Docker image includes:

  • Dorado: Version 0.9.0, a tool for basecalling Oxford Nanopore sequencing data.
  • NVIDIA CUDA: Version 12.2.0, for GPU acceleration (requires NVIDIA GPU).
  • Pigz: Version 2.6, for parallel compression and decompression.
  • Pre-downloaded basecalling models: All models are downloaded during the build and stored in /dorado_models/ inside the container.

Requirements

  • Docker: Installed on your system.
  • NVIDIA GPU and Drivers: Installed and configured.
  • NVIDIA Container Toolkit: To enable GPU support in Docker containers.
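
A quick way to confirm the host-side requirements is to check that the relevant commands are on your PATH. The sketch below is a minimal, hypothetical helper (the function name `missing_tools` is not part of this image or of Docker); it only reports which of the commands you pass in cannot be found.

```shell
# Print the name of every command in "$@" that is not on PATH.
# Returns non-zero if at least one command is missing.
missing_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# Example call (docker is required; nvidia-smi is installed with the NVIDIA driver):
#   missing_tools docker nvidia-smi || echo "Install the tools listed above first."
```

This checks only that the commands exist, not that the NVIDIA Container Toolkit is configured. Running `docker run --rm --gpus all staphb/dorado:latest nvidia-smi` should list your GPUs if the toolkit is working, assuming the toolkit exposes `nvidia-smi` inside the container as it usually does for CUDA-based images.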

Running the Docker Container

To run the Dorado tool within the Docker container, use the following command:

docker run --gpus all -it staphb/dorado:latest dorado --help

This command will display the help information for Dorado, confirming that it's installed correctly.

Testing the Docker Image

To test that Dorado is working correctly, download a sample Pod5 file and basecall it with one of the pre-downloaded models.

wget -O dna_r10.4.1_e8.2_260bps-FLO_PRO114-SQK_NBD114_96_260-4000.pod5 \
  https://github.com/nanoporetech/dorado/raw/release-v0.7/tests/data/pod5/dna_r10.4.1_e8.2_260bps/dna_r10.4.1_e8.2_260bps-FLO_PRO114-SQK_NBD114_96_260-4000.pod5

Basecalling Test

# Run the following command:
docker run --gpus all -v $(pwd):/usr/src/app -it staphb/dorado:latest bash -c "\
  dorado basecaller /dorado_models/[email protected] \
  /usr/src/app/dna_r10.4.1_e8.2_260bps-FLO_PRO114-SQK_NBD114_96_260-4000.pod5 \
  --emit-moves > /usr/src/app/basecalled.sam"

Explanation:

  • --gpus all: Enables GPU support.
  • -v $(pwd):/usr/src/app: Mounts the current directory to /usr/src/app inside the container.
  • bash -c "...": Runs the basecalling command inside the container.
  • > /usr/src/app/basecalled.sam: Redirects the output to basecalled.sam in your current directory.
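
The options explained above can be wrapped in a small function so that the model, input file, and output file become arguments. This is a sketch under stated assumptions, not part of the image: `run_basecall`, `DORADO_IMAGE`, and `DRY_RUN` are hypothetical names, and the model argument must be a directory name that exists under /dorado_models/.

```shell
# Usage: run_basecall <model_dir_name> <pod5_file_in_current_dir> <output_sam>
# Set DRY_RUN=1 to print the command instead of running it.
run_basecall() {
  model=$1 pod5=$2 out=$3
  image=${DORADO_IMAGE:-staphb/dorado:latest}
  # Build the docker command as the function's positional parameters.
  set -- docker run --rm --gpus all -v "$PWD:/usr/src/app" "$image" \
    dorado basecaller "/dorado_models/$model" "/usr/src/app/$pod5" --emit-moves
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$* > $out"
  else
    # Redirect on the host; the container writes SAM to stdout.
    "$@" > "$out"
  fi
}
```

Unlike the command above, this sketch omits `-it` and redirects on the host rather than inside `bash -c`, which avoids allocating a terminal when the output is going to a file. Try `DRY_RUN=1 run_basecall <model> <file>.pod5 basecalled.sam` first to inspect the command it would run.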

Verifying the Output

Check the output file to ensure basecalling was successful (this requires samtools on the host):

samtools view basecalled.sam

You should see SAM-formatted basecalling results.
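
If samtools is not installed on the host, a rough check is to count the non-header lines in the file, since SAM header lines start with `@` and read names cannot. The function name `count_sam_records` is hypothetical; this is only a sanity check, not a validation of the SAM format.

```shell
# Count records (lines that are not '@' header lines) in a SAM file.
count_sam_records() {
  # grep exits non-zero when the count is 0, so mask the exit status.
  grep -vc '^@' "$1" || true
}

# Example call:
#   [ "$(count_sam_records basecalled.sam)" -gt 0 ] && echo "basecalled.sam contains reads"
```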

Additional Notes

  • Sample Data: The sample Pod5 file is downloaded to /usr/src/app during the build.
  • Internal Testing: An internal test stage is included in the Dockerfile to verify installation.
  • Basecalling Models: All models are downloaded to /dorado_models during the build process. Below is the list of basecalling models included in the Docker image:
# modification models:
- [email protected]
- [email protected]_5mCG@v2
- [email protected]
- [email protected]_5mCG_5hmCG@v2
- [email protected]
- [email protected]_5mCG_5hmCG@v2
- [email protected]
- [email protected]_5mCG@v2
- [email protected]
- [email protected]_5mCG_5hmCG@v2
- [email protected]
- [email protected]_5mCG_5hmCG@v2
- [email protected]
- [email protected]_5mCG@v2
- [email protected]
- [email protected]_5mCG_5hmCG@v2
- [email protected]
- [email protected]_5mCG_5hmCG@v2
- [email protected]
- [email protected]_5mCG@v2
- [email protected]
- [email protected]_5mCG_5hmCG@v2
- [email protected]
- [email protected]_5mCG_5hmCG@v2
- [email protected]
- [email protected]_5mCG_5hmCG@v2
- [email protected]
- [email protected]
- [email protected]
- [email protected]_5mCG@v2
- [email protected]
- [email protected]_5mCG_5hmCG@v2
- [email protected]
- [email protected]_5mCG_5hmCG@v2
- [email protected]
- [email protected]_5mCG_5hmCG@v2
- [email protected]
- [email protected]_5mCG_5hmCG@v1
- [email protected]_5mC_5hmC@v1
- [email protected]_6mA@v1
- [email protected]_6mA@v2
- [email protected]
- [email protected]_4mC_5mC@v1
- [email protected]_4mC_5mC@v2
- [email protected]_4mC_5mC@v3
- [email protected]_5mCG_5hmCG@v1
- [email protected]_5mCG_5hmCG@v2
- [email protected]_5mCG_5hmCG@v3
- [email protected]_5mC_5hmC@v1
- [email protected]_5mC_5hmC@v2
- [email protected]_5mC_5hmC@v3
- [email protected]_6mA@v1
- [email protected]_6mA@v2
- [email protected]_6mA@v3
- [email protected]
- [email protected]_5mCG@v2
- [email protected]
- [email protected]_5mCG_5hmCG@v2
- [email protected]
- [email protected]_5mCG_5hmCG@v2
- [email protected]
- [email protected]_5mC@v2
- [email protected]_5mCG_5hmCG@v2
- [email protected][email protected]
- [email protected]_5mC_5hmC@v1
- [email protected]_6mA@v2
- [email protected]_6mA@v3
- [email protected]
- [email protected]_5mCG_5hmCG@v1
- [email protected]_5mC_5hmC@v1
- [email protected]_6mA@v1
- [email protected]_6mA@v2
- [email protected]
- [email protected]_4mC_5mC@v1
- [email protected]_4mC_5mC@v2
- [email protected]_4mC_5mC@v3
- [email protected]_5mCG_5hmCG@v1
- [email protected][email protected]
- [email protected]_5mCG_5hmCG@v3
- [email protected]_5mC_5hmC@v1
- [email protected][email protected]
- [email protected]_5mC_5hmC@v3
- [email protected]_6mA@v1
- [email protected]_6mA@v2
- [email protected]_6mA@v3
- [email protected]
- [email protected]
- [email protected]
- [email protected]
- [email protected]
- [email protected]
- [email protected][email protected]
- [email protected]_5mCG_5hmCG@v0
- [email protected]
- [email protected][email protected]
- [email protected]_5mCG_5hmCG@v0
- [email protected]
- [email protected][email protected]
- [email protected]_5mCG_5hmCG@v0
- [email protected]
- rna002_70bps_fast@v3
- rna002_70bps_hac@v3
- [email protected]
- [email protected]
- [email protected]
- [email protected]
- [email protected]
- [email protected]_m6A@v1
- [email protected]_m6A_DRACH@v1
- [email protected]_pseU@v1
- [email protected]
- [email protected]_inosine_m6A@v1
- [email protected]_m5C@v1
- [email protected]_m6A_DRACH@v1
- [email protected]_pseU@v1
- [email protected]
- [email protected]_m6A_DRACH@v1
- [email protected]
- [email protected]_m6A@v1
- [email protected]_m6A_DRACH@v1
- [email protected]_pseU@v1
- [email protected]
- [email protected]_inosine_m6A@v1
- [email protected]_m5C@v1
- [email protected]_m6A_DRACH@v1
- [email protected]_pseU@v1
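
To see the exact model directory names available in the image you pulled, you can list /dorado_models directly from the container. The helper below is a minimal sketch: `list_models` and `DORADO_IMAGE` are hypothetical names, and it assumes `ls` is available inside the image. No GPU is needed for this command.

```shell
# List model directories in the image; an optional argument filters by pattern.
list_models() {
  docker run --rm "${DORADO_IMAGE:-staphb/dorado:latest}" ls /dorado_models \
    | grep -- "${1:-.}"
}

# Example calls:
#   list_models          # every model
#   list_models sup      # only names containing "sup"
```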

Note: Please ensure that you have the necessary NVIDIA drivers and the NVIDIA Container Toolkit installed to utilize GPU acceleration.