6 - Vizbin fragmented mock bin dataset #75

Open
kmhandley opened this issue Aug 31, 2024 · 2 comments

Comments

@kmhandley

I think there is way too much going on regarding VizBin. We really only need the mock data exercise, but with fragmentation. I found it really confusing that we are using an unfragmented mock community dataset. You can't do anything with bin 1, although it's supposed to be near complete, and it's unclear what the lesson is from that. I suggest adding the fragmented mock bin dataset.

@kmhandley
Author

What was the point of the mock community exercise? I ended up losing bin 1, splitting bin 2 in half (sans coverage data), and seeing no real difference in bins 3 and 4. Was the point that bin 2 is a combination of two strains?

@JSBoey
Collaborator

JSBoey commented Sep 1, 2024

I played around with a few options and I thought this mock set would be a good candidate for lesson simplification because of bin 2's split (CheckM marked it as contamination instead of strain heterogeneity, but it is possible that it needs more data to confirm strain heterogeneity). Regardless of fragmentation, the clusters remain distinct.

Checking the fragmented mock set shows that bin 1 becomes a cloud, so I think you're right that we should go with that. I'm not sure how we're going to teach contig-fragment reconciliation, though, given that the assembly was pretty fragmented to begin with and contig fragments tended to cluster together (which is a good thing, I guess?). Perhaps the option here is to check that contig fragments are not shared between drawn clusters, pull contigs from the concatenated bins into the newly drawn bins, then run CheckM?
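
For illustration only, the "check that fragments are not shared between drawn clusters" step could be sketched roughly like this. It is not taken from the workshop material: the `.part_<n>` fragment suffix, the cluster names, and the majority-vote rule for deciding which cluster a contig goes to are all assumptions made for the example, so the real fragment naming would need to be swapped in.

```python
from collections import Counter, defaultdict


def parent_contig(fragment_id, sep=".part_"):
    """Recover the contig name from a fragment ID.

    Assumes a hypothetical '<contig>.part_<n>' naming scheme; an ID without
    the separator is returned unchanged.
    """
    return fragment_id.rsplit(sep, 1)[0]


def reconcile(clusters):
    """Assign each contig to the drawn cluster holding most of its fragments.

    `clusters` maps a cluster name to the fragment IDs selected in it.
    Returns (assignment, shared): `assignment` maps contig -> cluster, and
    `shared` lists contigs whose fragments landed in more than one cluster,
    with per-cluster fragment counts so they can be reviewed by hand.
    """
    counts = defaultdict(Counter)
    for cluster, fragments in clusters.items():
        for fragment in fragments:
            counts[parent_contig(fragment)][cluster] += 1

    assignment = {}
    shared = {}
    for contig, per_cluster in counts.items():
        if len(per_cluster) > 1:
            shared[contig] = dict(per_cluster)
        # Majority vote (an assumption); ties fall to the first cluster seen.
        assignment[contig] = per_cluster.most_common(1)[0][0]
    return assignment, shared


# Made-up selections standing in for two clusters drawn in VizBin.
clusters = {
    "cluster_A": ["contig_1.part_0", "contig_1.part_1", "contig_2.part_0"],
    "cluster_B": ["contig_2.part_1", "contig_2.part_2", "contig_3.part_0"],
}
assignment, shared = reconcile(clusters)
print(shared)      # contigs split across clusters
print(assignment)  # contig -> cluster after the majority vote
```

Anything that turns up in `shared` is a contig whose fragments were split between drawn clusters. The `assignment` map could then be used to pull whole contigs into the newly drawn bins before running CheckM on them.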


2 participants