filter: Expand subsampling docs #1425
These docs have been needed for so long, so thank you for making them a reality, @victorlin. With the exception of some minor typos that @jameshadfield pointed out, you could easily merge this now and it would be a big help to users. I made a few comments below that mostly attempt to clarify the content for new users.
As a thought experiment, it would be interesting to see how we'd implement these same ...
I've resolved all conversations above. Will merge on any approval.
Thank you for the clear and thoughtful docs! It will be nice to be able to point users to these guides 🙏
I found one more typo and left some other non-blocking comments, but I think it's good to merge whenever you think it's ready.
- Use section headings.
- Remove references to zika-tutorial. It's largely independent of that and prone to getting out of sync.
- Use --output-sequences and --output-metadata.
- Fix indentation.
Description of proposed changes
Expand subsampling docs with a guide on how to implement multi-pass subsampling¹.
¹ internal subsampling doc definition of "multi-pass subsampling": Subsampling done as multiple calls to a subsampling tool, where intermediate subsamples are created and joined together to create a final/combined sample. It is used to work around limitations on what can be done in a single pass.
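To make the definition above concrete, here is a minimal Python sketch of the idea, not the implementation used by any subsampling tool. The record fields (`strain`, `region`), the helper names `subsample` and `multi_pass_subsample`, and the sample sizes are all hypothetical and chosen only for illustration. Each pass draws an intermediate subsample under its own rule, and the intermediate subsamples are then joined by strain name into one combined sample.

```python
import random


def subsample(records, predicate, max_count, rng):
    """One pass: keep up to max_count randomly chosen records matching predicate."""
    candidates = [r for r in records if predicate(r)]
    if len(candidates) <= max_count:
        return candidates
    return rng.sample(candidates, max_count)


def multi_pass_subsample(records, passes, seed=0):
    """Run each pass independently, then join the intermediate subsamples.

    `passes` is a list of (predicate, max_count) pairs (a hypothetical
    structure for this sketch). Records are keyed by strain name so a
    strain picked by more than one pass appears only once in the result.
    """
    rng = random.Random(seed)
    selected = {}
    for predicate, max_count in passes:
        for record in subsample(records, predicate, max_count, rng):
            selected[record["strain"]] = record
    # Preserve the original input order in the combined sample.
    return [r for r in records if r["strain"] in selected]


# Hypothetical metadata: 10 records from "Asia" and 30 from "Other".
records = [
    {"strain": f"s{i}", "region": "Asia" if i % 4 == 0 else "Other"}
    for i in range(40)
]

# Two passes with different sizes, which a single uniform pass could not express.
passes = [
    (lambda r: r["region"] == "Asia", 5),   # focal subsample
    (lambda r: r["region"] != "Asia", 3),   # contextual subsample
]

combined = multi_pass_subsample(records, passes)
```

Because the two rules select disjoint sets here, the combined sample holds five focal and three contextual records. Keying by strain name is what keeps the join safe when the rules overlap.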
Related issue(s)
Checklist
docs/guides/bioinformatics: Add guide on filtering and subsampling docs.nextstrain.org#192
--exclude/--include (ref)