Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for generalized UniFrac (again) #528

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

fangly
Copy link

@fangly fangly commented Sep 19, 2015

Hi Joey,
Thanks for the constructing feedback. Here is my response:
1/ Many files were affected because I ran formatR on all of them to clean-up the uneven mixture of tabs and spaces in code indentations. I have now reverted these changes, except for the distance calculation file. It may be a good idea to run formatR yourself, though...
2/ I made GUniFrac a suggested dependency, as you suggested.
3/ You rightly point out that it would be better to implement the generalized UniFrac calculation in phyloseq instead of calling an external package. I do agree. However, I think that my commit adds functionality to phyloseq as-is. However, it seems that GUniFrac is an orphaned package, and I would suggest that you take its distance calculation code almost as-is (using appropriate licensing and acknowledgements, of course), inform the authors (out of courtesy), and ultimately try to integrate it nicely into phyloseq. As you see, this is both a political decision and technical. You're obviously in charge of the politics in this project. On the technical side, I do not have a good-enough understanding of the algorithm to do it myself.
Cheers,
Florent

@joey711
Copy link
Owner

joey711 commented Mar 20, 2016

@fangly
Thanks for your effort here. @benjjneb and I are planning to sneak in a performance improvement to the fast-UniFrac internals in phyloseq so that it is closer to the published form.

I haven't seen a lot of users of G-UniFrac, which might explain why the package is orphaned. That is not to say that distance itself isn't useful, and so I'm not opposed to supporting it in phyloseq, but there is a bandwidth issue.

Ideally we would add the necessary abstraction/tweak to the planned fast-unifrac update. However, I don't want to make this an additional constraint, because the deadline for package updates for the next BioC release is in two weeks or so.

Finally, the arbitrary format changes complicate versioning and my comprehension of what is being changed in the PR, so I can't simply merge the PR at the moment.

Let's circle back after the fast-unifrac update, and see how we can work this in.

Thanks for the contribution and enthusiasm for phyloseq. Hopefully we can get this supported soon.

joey

@jfq3
Copy link

jfq3 commented Jan 25, 2018

I sometimes use GUnifrac. True the package is orphaned, but only sort of, Jun Chen moved to Minnesota where he and a student wrote a new package (and paper about) including a faster version of GUnifrac. MiSPU is available on CRAN and GitHub (https://github.com/ChongWu-Biostat). Paper at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4872356/
jfq3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants