Creation of Diversity tutorial #4282

Merged
paulzierep merged 37 commits into galaxyproject:main from the Diversity-Tutorial branch on Aug 1, 2024

Conversation

Sophia120199
Contributor

Hi, this is the draft of a tutorial on how to calculate alpha and beta diversity. I'm looking forward to the review feedback :)

There are still some unresolved issues:

  1. I didn't understand how to interpret the results of the Fisher's alpha index calculated by KrakenTools, so the explanation for this is still missing.
  2. The Zenodo files with the Bracken output, to be used as starter input files for the tutorial, still need to be created.
  3. In the detail box on Rényi entropy, the mathematical expression is already visible when the box is not opened yet. I don't know why.
  4. In the preview website, the mathematical expressions of the alpha diversity are not shown correctly since I added the $$ signs.

@Sophia120199 Sophia120199 requested a review from a team as a code owner July 13, 2023 17:59
Collaborator

@paulzierep paulzierep left a comment

Small text adjustments and some suggestions for a better explanation of the KrakenTools alpha diversity

topics/metagenomics/tutorials/diversity/tutorial.md

S is number of taxa, n is number of individuals and a is the Fisher's alpha.

**KrakenTools** is a suite of scripts designed to help Kraken users with downstream analysis of Kraken results. The Krakentool **Calculate alpha diversity** offers the possibility to calculate five different alpha diversity indexes:
Collaborator

The output of Bracken consists of all taxonomic levels: K_kingdom, P_phylum, C_class, O_order, F_family, G_genus, and S_species; hence the alpha diversity is also calculated using all those levels. I think in most cases, alpha diversity is defined as "generally based on the number and relative abundance of taxa at some rank". Maybe that could be explained as well.
Technically, one could add a filter step before KrakenTools to achieve that.
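
A minimal sketch of such a filter step, in Python. The simplified row layout (name, single-letter rank code, read count), the helper name `filter_by_rank`, and the example data are all assumptions made for illustration; a real Bracken or Kraken report has more columns and should be checked before adapting this.

```python
from typing import Iterable, NamedTuple


class TaxonRow(NamedTuple):
    """One row of a hypothetical, simplified abundance table."""
    name: str
    rank: str   # single-letter rank code, e.g. "G" for genus, "S" for species
    reads: int  # reads assigned to this taxon


def filter_by_rank(rows: Iterable[TaxonRow], rank: str) -> list[TaxonRow]:
    """Keep only the rows at one taxonomic rank, dropping empty taxa."""
    return [row for row in rows if row.rank == rank and row.reads > 0]


# Made-up example data mixing several ranks in one table.
rows = [
    TaxonRow("Bacteria", "K", 1000),
    TaxonRow("Escherichia", "G", 600),
    TaxonRow("Escherichia coli", "S", 600),
    TaxonRow("Bacillus", "G", 400),
    TaxonRow("Bacillus subtilis", "S", 400),
]

species_rows = filter_by_rank(rows, "S")
species_counts = [row.reads for row in species_rows]
```

Without the filter, the same reads would be counted once per rank, so the number of "taxa" and their relative abundances would not describe a single rank.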

Member

It could be added as a detail box for people who use Kraken output directly rather than Bracken output.

Collaborator

Yes, a detail box sounds good. Still, what I said applies to both Kraken and Bracken.

4. Inverse Simpson's diversity
5. Fisher's index

> <hands-on-title>Calculate α diversity with Krakentools</hands-on-title>
Collaborator

One could actually use Calculate α diversity with Krakentools for any tool that outputs taxonomic abundances, provided the output is converted into the correct table format, i.e. with the abundance in column 5. In fact, that would work for any population data. Maybe that could be addressed, since it would allow usage in a more generic setting.
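
To illustrate the generic setting, here is a hedged Python sketch that computes several alpha diversity indices from a plain list of counts. The function names are hypothetical, and the Simpson variants use relative abundances, so the values need not match the KrakenTools script exactly. Fisher's alpha is obtained by numerically solving the log-series relation S = a * ln(1 + n / a), with S the number of taxa and n the number of individuals.

```python
import math
from typing import Sequence


def shannon(counts: Sequence[float]) -> float:
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts: Sequence[float]) -> float:
    """Simpson's diversity 1 - sum(p_i^2), using relative abundances."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def inverse_simpson(counts: Sequence[float]) -> float:
    """Inverse Simpson's diversity 1 / sum(p_i^2)."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts)


def fisher_alpha(counts: Sequence[float], tol: float = 1e-10) -> float:
    """Solve S = a * ln(1 + n / a) for a by bisection (requires 0 < S < n)."""
    s = sum(1 for c in counts if c > 0)
    n = sum(counts)
    if not 0 < s < n:
        raise ValueError("Fisher's alpha needs 0 < S < n")

    def f(a: float) -> float:
        return a * math.log1p(n / a) - s

    lo, hi = 1e-12, 1.0
    while f(hi) < 0:  # f increases with a, so grow hi until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because a * ln(1 + n / a) increases with a, a larger Fisher's alpha at a fixed number of individuals n corresponds to more taxa S, which may help with interpreting the value.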

Member

@bebatut bebatut left a comment

Thanks a lot @Sophia120199. I suggested some changes and also added some comments. We can talk about them tomorrow in our meeting.

> Instead of selecting a few measures to describe an assemblage, it is preferable to **present a continuous profile** that depicts diversity or entropy as a function of q (where q ≥ 0). This approach enables a visual comparison of the compositional complexities among multiple assemblages and facilitates the assessment of the evenness in the relative abundance distributions of the assemblages. In practice, the profile is typically plotted for values of q ranging from 0 to q = 3 or 4, beyond which there is usually little change.
>
> ![Parameter q](./images/hill_numbers.png)
> https://www.redalyc.org/journal/5117/511766773011/html/
Member

What is this link? What does it refer to?

Contributor Author

That's the source of the image.

Member

Maybe good to put it as a caption then (in parentheses after the path to the image).

{: .details}


> <details-title>More details on the Hill numbers</details-title>
Member

Should that go with the Multidimensional metrics?
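
For context on the Hill numbers discussed in this detail box, a minimal Python sketch of a diversity profile over q follows. It assumes the standard definition, D_q = (sum p_i^q)^(1 / (1 - q)) with the q -> 1 limit equal to the exponential of the Shannon entropy; the function name and the example counts are made up for illustration.

```python
import math
from typing import Sequence


def hill_number(counts: Sequence[float], q: float) -> float:
    """Hill number of order q for a vector of non-negative abundances."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    if math.isclose(q, 1.0):
        # Limit q -> 1: exponential of the Shannon entropy.
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


# Diversity profile on a grid of q from 0 to 3 in steps of 0.5.
counts = [50, 30, 15, 5]
profile = [(q / 2, hill_number(counts, q / 2)) for q in range(0, 7)]
```

At q = 0 the profile gives the number of taxa, and at q = 2 it gives the inverse Simpson's diversity, so a single curve covers several of the one-number indices.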

{: .details}


> <details-title>More details on the Rényi entropy</details-title>
Member

Should that go with the Multidimensional metrics?
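
As background for where this box could go, the Rényi entropy of order q is commonly written as $$ H_q = \frac{1}{1-q} \ln \sum_i p_i^q $$, which is the natural logarithm of the Hill number of the same order. The Python sketch below assumes that definition; the function name is hypothetical.

```python
import math
from typing import Sequence


def renyi_entropy(counts: Sequence[float], q: float) -> float:
    """Rényi entropy of order q (natural log) for non-negative abundances."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    if math.isclose(q, 1.0):
        # Limit q -> 1: the Shannon entropy.
        return -sum(p * math.log(p) for p in props)
    return math.log(sum(p ** q for p in props)) / (1.0 - q)
```

For a perfectly even community of S taxa, the function returns ln(S) for every q, which is a quick sanity check when plotting a profile.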

@@ -0,0 +1 @@

Member

Can you add the workflow there? And tests? Thanks

Collaborator

@paulzierep paulzierep left a comment

comment for Calculate beta diversity

Comment on lines 252 to 259
> Some of the key features and functionalities of QIIME 2 include:
> 1. Data Import and Preprocessing: QIIME 2 supports the import of raw sequencing data and performs quality control and data preprocessing steps, such as demultiplexing, quality filtering, and primer removal.
> 2. Taxonomic Assignments: The software enables taxonomic classification of microbial sequences using various algorithms and reference databases.
> 3. Diversity Analysis: QIIME 2 allows users to explore and quantify microbial diversity within and between samples. It provides metrics for alpha diversity (within-sample diversity) and beta diversity (between-sample diversity).
> 4. Community Analysis: Users can investigate the composition and structure of microbial communities, including taxonomic summaries, abundance profiles, and statistical comparisons between groups.
> 5. Phylogenetic Analysis: QIIME 2 supports the construction of phylogenetic trees to infer evolutionary relationships among microbial taxa and perform phylogenetic diversity analysis.
> 6. Statistical Analysis: The software offers a wide range of statistical methods for differential abundance analysis, correlation analysis, multivariate analysis, and other types of statistical tests.
> 7. Visualization: QIIME 2 provides interactive and customizable visualizations to aid in the exploration and interpretation of microbiome data, including heatmaps, bar plots, PCoA plots, and taxonomic trees.
Collaborator

I agree. I think the most important information to state is that calculating beta diversity with QIIME 2 offers many more metrics but requires a different input format.
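
For comparison, a single beta diversity metric can be sketched in a few lines of Python. The example assumes Bray-Curtis dissimilarity, which to my knowledge is the metric the KrakenTools beta diversity script reports; the function name, the dictionary input format, and the sample data are hypothetical.

```python
from typing import Mapping


def bray_curtis(a: Mapping[str, float], b: Mapping[str, float]) -> float:
    """Bray-Curtis dissimilarity between two samples given as taxon -> count."""
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in taxa)
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        raise ValueError("both samples are empty")
    return 1.0 - 2.0 * shared / total


# Made-up counts for two samples.
sample_1 = {"Escherichia coli": 60, "Bacillus subtilis": 40}
sample_2 = {"Escherichia coli": 20, "Bacillus subtilis": 40, "Listeria innocua": 40}
distance = bray_curtis(sample_1, sample_2)
```

The value is 0 for identical samples and 1 for samples that share no taxa.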

@shiltemann
Member

Hi @Sophia120199! Thanks a lot for your tutorial! Just wanted to let you know that we renamed the metagenomics topic to "microbiome", so I have also changed this in your PR. So please update your local branch to pull in these changes before adding more edits :)

@bebatut
Member

bebatut commented Jan 16, 2024

@Sophia120199 is not working with us anymore
@paulzierep could you have a look at this PR and see if more changes are needed to merge it? Thanks a lot

@paulzierep
Collaborator

Sorry, I missed that completely. I will try to review it in the next few days.

@paulzierep
Collaborator

I still need to review this, will try my best!

Collaborator

@paulzierep paulzierep left a comment

Do we know where the test files are?

---
layout: tutorial_hands_on
title: Calculating α and β diversity from microbiome taxonomic data
zenodo_link: xxx
Collaborator

I guess we need some test data and a workflow for this.

@@ -0,0 +1,437 @@
---
layout: tutorial_hands_on
Collaborator

Suggested change
- layout: tutorial_hands_on
+ layout: tutorial_hands_on
+ draft: true

Member

Do you want to have it as a draft tutorial?

Collaborator

Until I fix it, yes.

Collaborator

Never mind, I merged too early. Let's fix it now.

@paulzierep paulzierep merged commit 4e88ecb into galaxyproject:main Aug 1, 2024
1 of 3 checks passed
@bebatut bebatut deleted the Diversity-Tutorial branch August 1, 2024 10:47
@shiltemann
Member

shiltemann commented Aug 1, 2024

@paulzierep looks like you merged this while the build test was failing, so this (and any future updates to GTN) will not go live until that is addressed. Do you have time to have a look at the failure, or should I?

@shiltemann
Member

(I'm on it)

@paulzierep
Collaborator

Yes, sorry, will fix it here: #5191

@shiltemann
Member

all good, I pushed a fix directly to main, just waiting to see if it builds now

@paulzierep
Collaborator

thanks @shiltemann ; I am fixing the rest here atm

4 participants