consider adding sourmash gather execution directly to fastgather as postprocessing #107
Comments
I looked into this and it's a bit annoying because |
after #134, I was hoping to provide all gather columns in the branchwater fastmultigather results csv by using the rust I think that's feasible, but might require a little more work in rust core:
Ref:
|
.. hmm, this actually doesn't directly help with |
Two thoughts - I love the details in here about what needs to be done! I think maybe it should be its own issue or set of issues! But, also, my experience with a performance issue in branchwater, see #71, makes me wary here. I don't want to slow fastgather down by accident. I guess I am of two minds: is the only purpose of fastgather to do a full gather, faster? Or are there situations where we might want to take the output of fastgather and NOT do a gather afterwards? OK, I feel dumb for even saying it. But anyway, be wary of performance issues, is all. |
great point! I think I want the full results often enough that it would be useful to enable. If there's a significant performance hit, it might be worth passing a flag to toggle between lightweight and full versions of gather. I'll move the steps above over to a new |
soooo looking at #188, I have a hot take: both fastgather and full this feels a bit hacky, but I think there's opportunity for a version of gather that just calculates the statistics without doing the full search and so on. That is, it should be possible to just take the fastgather output and flesh out the full stats by "believing" the fastgather output - not using it as a picklist, and instead using it as an ordered scaffold. It would presumably be much faster and lower memory too... |
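To make the "ordered scaffold" idea above concrete, here is a minimal sketch in plain Python. It is not sourmash or branchwater code: the function name `scaffold_gather_stats` and the input shape (a list of `(name, hash_set)` pairs) are hypothetical. The output keys borrow a few sourmash gather column names, but the arithmetic is a simplified illustration that ignores abundance and scaled values. The point is only that, if the fastgather order is trusted, the per-match stats fall out of a single pass over the matches with no search step.

```python
def scaffold_gather_stats(query_hashes, ordered_matches):
    """Fill in gather-style stats by trusting an existing match order.

    query_hashes: iterable of integer hashes for the query sketch.
    ordered_matches: list of (name, set_of_hashes), in the order the
        earlier (fast) gather reported them. Hypothetical input shape.
    """
    query = set(query_hashes)
    if not query:
        raise ValueError("query has no hashes")

    remaining = set(query)  # hashes not yet assigned to any match
    rows = []
    for rank, (name, match_hashes) in enumerate(ordered_matches):
        match = set(match_hashes)
        overlap_orig = query & match   # overlap with the original query
        unique = remaining & match     # overlap with what is still unassigned
        remaining -= unique            # "believe" the order: claim these hashes
        rows.append({
            "gather_result_rank": rank,
            "match_name": name,
            "f_orig_query": len(overlap_orig) / len(query),
            "f_unique_to_query": len(unique) / len(query),
            "f_match": len(unique) / len(match) if match else 0.0,
            "remaining_hashes": len(remaining),
        })
    return rows


# Toy example: two matches that share hashes 4 and 5.
rows = scaffold_gather_stats(
    range(1, 11),
    [("A", {1, 2, 3, 4, 5, 99}), ("B", {4, 5, 6, 7})],
)
for row in rows:
    print(row)
```

Nothing here re-ranks or re-searches; a match that the scaffold lists but that no longer overlaps the remaining query simply gets `f_unique_to_query` of 0.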
This is becoming quite the tangle of issues and PRs 😅 but I wanted to connect them a bit more by pointing out we've gone this route with ref PR dib-lab/sourmash-slainte#18 for sourmash-slainte workflow, and sourmash-bio/sourmash#2950 for problems revealed in sourmash. |
You know what? I'm going to close this issue. #187 has all the important stuff remaining. |
Could run with a picklist via the Python API.