-
Notifications
You must be signed in to change notification settings - Fork 142
add MetaBinner #881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add MetaBinner #881
Conversation
|
Warning Newer version of the nf-core template is available. Your pipeline is using an old version of the nf-core template: 3.3.2. For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation. |
|
@nf-core-bot fix linting |
|
Now ready for review, will resolve that conflict with ro-crate -metadata.json soon, but that wont impact the functional code at all. |
jfy133
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @prototaxites that the module is too complex at the moment.
Most of the steps should be in a CONCOCT-like workflow, and the the two python scripts and metabinnner itself should definitely be modules.
The other reason I would prefer to split this up is because it will make it easier to debug.
I'm not too happy about having to have custom commands and scripts (not your fault, just design of the tool) and I want to decouple these potentially more brittle parts from the tool itself.
Co-authored-by: James A. Fellows Yates <[email protected]>
|
Alright, split the module into 4 modules and added a subworkflow. |
|
Ready for another round of reviews, maybe @jfy133 & @prototaxites again? |
dialvarezs
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor suggestions from my side.
With the subworkflow implementation is definitively easier to follow.
Co-authored-by: Diego Alvarez S. <[email protected]>
Co-authored-by: Diego Alvarez S. <[email protected]>
|
@nf-core-bot fix linting |
|
I'll run the pipeline as usual before merging this PR to confirm I didnt break anything, but our cluster is a little over-used by me right now so it takes some time. |
prototaxites
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good @d4straub, a few small comments from me
conf/modules.config
Outdated
| ] | ||
| ] | ||
| ext.prefix = { "${meta.assembler}-MetaBinner-${meta.id}" } | ||
| ext.min_contig_size = { params.min_contig_size < 1000 ? "1000" : "${params.min_contig_size}" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these be ext directives, or (if this is mandatory) should it just be an input to the module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those arent mandatory, since they have defaults.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it would be mandatory in the sense that if there was no default, the tool wouldn't run?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the tool needs that min_contig_size. The process has default 1000 in case there is no ext.min_contig_size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In which case I would suggest for clarity that it should be an optional val input to the module, and you set a default in the same way. That means that all of the mandatory inputs for the module are set in the same place.
Following the logic from Mahesh here: nf-core/proposals#69
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
| zcat $kmer > kmer_profile.csv | ||
| # create coverage profile in Metabinner format | ||
| zcat ${depth} | awk '{if (\$2>${min_contig_size}) print \$0 }' | cut -f -1,4- > coverage_profile.tsv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine for now, but if we consider in future to put this module in the modules repo, we should move this elsewhere to make the module more reusable
prototaxites
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (assuming tests pass!) - but we definitely need to work on a suitable test data set!
|
Yes, suitable datasets would be great. Its not only for testing itself, but also implementing tools that do not run on current test data is troublesome. |
|
Test I'll retry, potentially its a sorting issue. Edit: no, that seems to be consistent, investigating... |
|
@dialvarezs you also saw this in the verison bump PR right? |
|
Ah great than I dont need to investigate :) |
|
It's ready now @d4straub! |
|
Thanks everybody!!!! |
Add MetaBinner, related to #874
Needs #907 to function properly.
This integrates into the pipeline an altered version of the module
METABINNERand the scriptcreate_metabinner_bins.pyfrom https://github.com/hzi-bifo/mag.Issues:
-profile test_fullto produce any bins (but that is error is ignored)test_fulldata works as detailed below--skip_comebin false --skip_metabinner falsefailed with those binners) therefore deactivated in all CI testsSuccessful test was with bacass
test_fullfiles:where the samplesheet contained:
PR checklist
nf-core pipelines lint).nextflow run . -profile test,docker --outdir <OUTDIR>).nextflow run . -profile debug,test,docker --outdir <OUTDIR>).docs/usage.mdis updated.docs/output.mdis updated.CHANGELOG.mdis updated.README.mdis updated (including new tool citations and authors/contributors).