From 98d18ee1484d5839e55854628227d1c089c94513 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?B=C3=A9r=C3=A9nice=20Batut?=
Date: Fri, 26 Jan 2024 16:19:31 +0100
Subject: [PATCH 1/2] Fix few things and add zenodo to ARG detection tutorial

---
 .../tutorials/amr-gene-detection/tutorial.md  | 46 +++++++++++++------
 .../workflows/main-workflow-test.yml          |  6 +--
 2 files changed, 35 insertions(+), 17 deletions(-)

diff --git a/topics/genome-annotation/tutorials/amr-gene-detection/tutorial.md b/topics/genome-annotation/tutorials/amr-gene-detection/tutorial.md
index f438f6b08212c9..3ec88a396f8db6 100644
--- a/topics/genome-annotation/tutorials/amr-gene-detection/tutorial.md
+++ b/topics/genome-annotation/tutorials/amr-gene-detection/tutorial.md
@@ -2,23 +2,24 @@
 layout: tutorial_hands_on

 title: Identification of AMR genes in an assembled bacterial genome
-zenodo_link: 'https://zenodo.org/record/4534098'
+zenodo_link: 'https://zenodo.org/record/10572227'
 questions:
 - Which resistance genes are on a bacterial genome?
 - Where are the genes located on the genome?
 objectives:
-- Assess presence of antimicrobial resistance genes
-- Perform a species identification and MLST typing
-- Search for resistance genes on the assembly
-- Find a gene on your genome using Prokka + JBrowse
+- Run a series of tool to assess the presence of antimicrobial resistance genes (ARG)
+- Get information about ARGs
+- Visualize the ARGs and plasmid genes in their genomic context
 time_estimation: 2h
 key_points:
-- Annotation with Prokka is very easy
+- staramr is a powerful tool to predict ARGs and plasmid genes
+- Visualization of the ARGs and plasmid genes in their genomic context helps to make sense of the data
 tags:
 - illumina
 - amr
 - one-health
 subtopic: prokaryote
+level: Introductory

 contributions:
   authorship:
@@ -48,7 +49,7 @@ Antimicrobial resistance (AMR) is a global phenomenon with no geographical or sp

 AMR gene content can be assessed from whole genome sequencing to detect known resistance mechanisms and potentially identify novel mechanisms.

-To illustrate the process to identify AMR gene in a bacterial genome, we take an assembly of a bacterial genome generated by following a [bacterial genome assembly tutorial]({% link topics/assembly/tutorials/mrsa-illumina/tutorial.md %}) from data produced in "Complete Genome Sequences of Eight Methicillin-Resistant *Staphylococcus aureus* Strains Isolated from Patients in Japan" {% cite Hikichi_2019 %}.
+To illustrate the process to identify AMR gene in a bacterial genome, we take an assembly of a bacterial genome (KUN1163 sample) generated by following a [bacterial genome assembly tutorial]({% link topics/assembly/tutorials/mrsa-illumina/tutorial.md %}) from data produced in "Complete Genome Sequences of Eight Methicillin-Resistant *Staphylococcus aureus* Strains Isolated from Patients in Japan" ({% cite Hikichi_2019 %}).

 > Methicillin-resistant *Staphylococcus aureus* (MRSA) is a major pathogen
 > causing nosocomial infections, and the clinical manifestations of MRSA
@@ -143,14 +144,31 @@ To identify AMR genes in contigs, tools like ABRicate or staramr ({% cite bharat
  >
  > 1. How many genomes are there?
  > 2. Has the genome failed or passed the quality? Why?
- > 3. What are the predicted AMR drug resistances?
+ > 3. What are the predicted AMR drug resistances? Are they similar to the one found for KUN1163 in [Table 1](https://journals.asm.org/doi/10.1128/mra.01212-19#tab1) in {% cite Hikichi_2019 %}?
  > 4. What is the MLST scheme? Does that correspond to the expected species and genus?
  >
  > > Inspect the staramr output
  > >
  > > 1. There is one genome (1 line)
  > > 2. The genome has failed the quality (column 2), because the genome length is not within the acceptable length range (last column).
- > > 3. amikacin, gentamicin, tobramycin, spectinomycin, erythromycin, azithromycin, penicillin, tetracycline
+ > > 3. We can summarize starmr output and Table 1 in {% cite Hikichi_2019 %}:
+ > >
+ > > Antibiotic name | Abbreviation | staramr | {% cite Hikichi_2019 %}
+ > > --- | --- | --- | ---
+ > > Amikacin | | Yes |
+ > > Azithromycin | | Yes |
+ > > Cefazolin | CEZ | | Yes
+ > > clindamycin | CLDM | | Yes
+ > > Erythromycin | EM | Yes | Yes
+ > > Gentamicin | GM | Yes | Yes
+ > > Imipenem | IPM | | Yes
+ > > Levofloxacin | LVFX | | Yes
+ > > Oxacillin | MPIPC | | Yes
+ > > Penicillin | | Yes |
+ > > Spectinomycin | | Yes |
+ > > Tetracycline | | Yes |
+ > > Tobramycin | | Yes |
+ > >
  > > 4. The scheme is saureus, so *Staphylococcus aureus* (given the [scheme genus map](https://github.com/tseemann/mlst/blob/master/db/scheme_species_map.tab)), which is coherent with MRSA
  > {: .solution}
  >
@@ -300,7 +318,7 @@ This table can not be used directly in JBrowse. It first needs to be transformed
 > 2. **source**: The algorithm or procedure that generated the feature. This is typically the name of a software or database.
 > 3. **type**: The feature type name, like "gene" or "exon". In a well structured GFF file, all the children features always follow their parents in a single block (so all exons of a transcript are put after their parent "transcript" feature line and before any other parent transcript line). In GFF3, all features and their relationships should be compatible with the standards released by the Sequence Ontology Project.
 > 4. **start**: Genomic start of the feature, with a 1-base offset. This is in contrast with other 0-offset half-open sequence formats, like BED.
-> 5. **end**: Genomic end of the feature, with a 1-base offset. This is the same end coordinate as it is in 0-offset half-open sequence formats, like BED.[citation needed]
+> 5. **end**: Genomic end of the feature, with a 1-base offset. This is the same end coordinate as it is in 0-offset half-open sequence formats, like BED.
 > 6. **score**: Numeric value that generally indicates the confidence of the source in the annotated feature. A value of "." (a dot) is used to define a null value.
 > 7. **strand**: Single character that indicates the strand of the feature. This can be "+" (positive, or 5'->3'), "-", (negative, or 3'->5'), "." (undetermined), or "?" for features with relevant but unknown strands.
 > 8. **phase**: phase of CDS features; it can be either one of 0, 1, 2 (for CDS features) or "." (for everything else). See the section below for a detailed explanation.
@@ -399,14 +417,14 @@ To get information about the coverage of the contigs and genes, we map the reads
 > 1. {% tool [Import](upload1) %} the reads (after quality control) from another history, from [Zenodo]({{ page.zenodo_link }}) or from Galaxy shared data libraries:
 >
 > ```
-> {{ page.zenodo_link }}/files/DRR187559_1.fastq.gz
-> {{ page.zenodo_link }}/files/DRR187559_2.fastq.bz2
+> {{ page.zenodo_link }}/files/DRR187559_after_fastp_1.fastq.gz
+> {{ page.zenodo_link }}/files/DRR187559_after_fastp_2.fastq.gz
 > ```
 >
 > 2. {% tool [Bowtie2](toolshed.g2.bx.psu.edu/repos/devteam/bowtie2/bowtie2/2.5.0+galaxy0) %} with the following parameters:
 > - *"Is this single or paired library"*: `Paired-end`
-> - {% icon param-file %} *"FASTA/Q file #1"*: Forward reads
-> - {% icon param-file %} *"FASTA/Q file #2"*: Reverse reads
+> - {% icon param-file %} *"FASTA/Q file #1"*: `DRR187559_after_fastp_1.fastq.gz`
+> - {% icon param-file %} *"FASTA/Q file #2"*: `DRR187559_after_fastp_2.fastq.gz`
 > - *"Will you select a reference genome from your history or use a built-in index?"*: `Use a genome from the history and build index`
 > - {% icon param-file %} *"Select reference genome"*: Contig file
 > - *"Save the bowtie2 mapping statistics to the history"*: `Yes`
diff --git a/topics/genome-annotation/tutorials/amr-gene-detection/workflows/main-workflow-test.yml b/topics/genome-annotation/tutorials/amr-gene-detection/workflows/main-workflow-test.yml
index 1bd4fd8eae5bdd..68d03ba9dfebca 100644
--- a/topics/genome-annotation/tutorials/amr-gene-detection/workflows/main-workflow-test.yml
+++ b/topics/genome-annotation/tutorials/amr-gene-detection/workflows/main-workflow-test.yml
@@ -3,15 +3,15 @@
 job:
   contigs:
     class: File
-    location:
+    location: https://zenodo.org/record/10572227/files/DRR187559_contigs.fasta
     filetype: fasta
   forward_reads:
     class: File
-    location: https://zenodo.org/record/4534098/files/DRR187559_1.fastqsanger.bz2
+    location: https://zenodo.org/record/10572227/files/DRR187559_after_fastp_1.fastq.gz
     filetype: fastqsanger.bz2
   reverse_reads:
     class: File
-    location: https://zenodo.org/record/4534098/files/DRR187559_2.fastqsanger.bz2
+    location: https://zenodo.org/record/10572227/files/DRR187559_after_fastp_2.fastq.gz
     filetype: fastqsanger.bz2
 outputs:
   stararm_detailed_summary:
From 93b66ccca88d40c09e781722e5d07ea8bda7b446 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?B=C3=A9r=C3=A9nice=20Batut?=
Date: Fri, 26 Jan 2024 16:20:31 +0100
Subject: [PATCH 2/2] Reorganize the ARG detection learning pathway

---
 learning-pathways/amr-gene-detection.md | 42 ++++++++++++++-----------
 1 file changed, 23 insertions(+), 19 deletions(-)

diff --git a/learning-pathways/amr-gene-detection.md b/learning-pathways/amr-gene-detection.md
index 2a4b808b038130..ace9b81efda96b 100644
--- a/learning-pathways/amr-gene-detection.md
+++ b/learning-pathways/amr-gene-detection.md
@@ -2,7 +2,7 @@
 layout: learning-pathway
 title: Detection of AMR genes in bacterial genomes
 description: |
-  Learn how to identify AMR genes in bacterial genomes
+  This learning path aims to teach you the basic steps to detect and check Antimicrobial resistance (AMR) genes in bacterial genomes using Galaxy.
 type: use
 tags: [amr, bacteria, microgalaxy, one-health]

@@ -11,7 +11,6 @@ editorial_board:
 funding:
 - abromics

-
 pathway:
 # - section: "Module 1: Introduction"
 # description: |
@@ -20,33 +19,38 @@ pathway:
 # # - name: introduction
 # # topic: genome-annotation

-  - section: "Module: AMR gene detection in bacterial isolates (short reads)"
+# - section: "Module: Taxonomy assignation"
+# description: |
+# Taxonomic assignation is useful in AMR detection to check contamination and confirm species
+# tutorials:
+# - name: taxonomy
+# topic: ecology
+
+  - section: "Module: Assembly"
     description: |
-
+      Assembly is a major step in the process to detect AMR genes as it combines sequenced reads into contigs, longer sequences where it will be easier to identify genes and in particular AMR genes
     tutorials:
       - name: mrsa-illumina
         topic: assembly
-      - name: amr-gene-detection
-        topic: genome-annotation
-
-# - section: "Module 3: AMR gene detection in bacterial isolates (long reads)"
-# description: |
-#
-# tutorials:
 # - name: mrsa-nanopore
 # topic: assembly
-# - name: amr-gene-detection
-# topic: genome-annotation
-#
-# - section: "Module 4: AMR gene detection in bacterial isolates (long and short reads)"
+# - name: hybrid-assembly
+# topic: assembly
+
+# - section: "Module: Genome annotation"
 # description: |
-#
+# The generated contigs can be then annotated to detect genes, potential plasmid, etc. This will help the AMR gene detection process, specially the verification and the visualization
 # tutorials:
-# #- name: hybrid-assembly
-# # topic: assembly
-# - name: amr-gene-detection
+# - name: bacterial-genome-annotation
 # topic: genome-annotation

+  - section: "Module: AMR gene detection"
+    description: |
+
+    tutorials:
+      - name: amr-gene-detection
+        topic: genome-annotation
+
   - section: "Recommended follow-up tutorials"
     tutorials:
       - name: pathogen-detection-from-nanopore-foodborne-data