Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run the Wilcoxon test for each of the taxa ranks #1745

Open
mailasb opened this issue Apr 19, 2024 · 0 comments
Open

Run the Wilcoxon test for each of the taxa ranks #1745

mailasb opened this issue Apr 19, 2024 · 0 comments

Comments

@mailasb
Copy link

mailasb commented Apr 19, 2024

Hi, I'm comparing a set of biological samples that are classified into two groups of interest called "Status" (1 and 0). I had the following phyloseq object:

mirl_phyloseq_final
phyloseq-class experiment-level object
otu_table() OTU Table: [ 8229 taxa and 42 samples ]
sample_data() Sample Data: [ 42 samples by 6 sample variables ]
tax_table() Taxonomy Table: [ 8229 taxa by 6 taxonomic ranks ]

Then I agglomerated it by family:
mirl_phyloseq_final.family=tax_glom (mirl_phyloseq_final, taxrank="Family")

mirl_phyloseq_final.family
phyloseq-class experiment-level object
otu_table() OTU Table: [ 364 taxa and 42 samples ]
sample_data() Sample Data: [ 42 samples by 6 sample variables ]
tax_table() Taxonomy Table: [ 364 taxa by 6 taxonomic ranks ]

And then merged it with psmelt:
melted.f=psmelt (mirl_phyloseq_final.family)
So, I got a data.frame (melted.f) with the samples, the found ASVs, the abundance of each one, and the taxonomic classification agglomerated by family that looks like this:
abundance

If I want to calculate the differential abundance between the two groups and run Wilcoxon for a family, I do the following:
wilcox.test (Abundance~Status, data=melted.f[which(melted.f$Family=="Mycoplasmataceae"),]).
And I obtain a p-value.
My intention is to run the Wilcoxon test to calculate differential abundance between the two groups of samples, for all families (in a loop or something like) this and then store those p-values in a data frame with the name of each family.

Any idea how I can do it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant