Adding single-read functionality to PROFILE #84

Merged Jan 6, 2025 (63 commits)

@simonleandergrimm (Collaborator) commented Nov 7, 2024

This PR adds support for single-read sequencing data to the PROFILE stage of the pipeline while maintaining existing paired-end functionality.

Key Changes

  • Added BBDUK_SINGLE process to handle single-end read filtering
  • Added SUBSET_READS_SINGLE and SUBSET_READS_SINGLE_TARGET processes for read subsetting
  • Modified PROFILE and TAXONOMY subworkflows to conditionally use single-end or paired-end processes based on params.read_type
  • Updated workflow to skip BBMerge and Join steps when processing single-end data
  • Extended run_dev_se.nf to also include the PROFILE subworkflow
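
The branching described in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the DEMO_* process names, their bodies, and the sample IDs are hypothetical stand-ins, and the boolean params.single_end toggle is borrowed from the diff hunk quoted later in this conversation (the description above refers to params.read_type instead). It assumes Nextflow DSL2, the default in current releases.

```nextflow
// Sketch only: process names, bodies, and sample IDs are hypothetical
// stand-ins, not the pipeline's real modules.

// Boolean toggle (assumption: mirrors the flag seen in the review diff).
params.single_end = false

// Stand-in for a single-end filtering step such as BBDUK_SINGLE.
process DEMO_FILTER_SINGLE {
    input:
    val sample

    output:
    val result

    exec:
    result = "${sample}: filtered (single-end)"
}

// Stand-in for the existing paired-end filtering step.
process DEMO_FILTER_PAIRED {
    input:
    val sample

    output:
    val result

    exec:
    result = "${sample}: filtered (paired-end)"
}

// Stand-in for the BBMerge and Join steps, which only apply to read pairs.
process DEMO_MERGE_JOIN {
    input:
    val filtered

    output:
    val result

    exec:
    result = "${filtered}, merged and joined"
}

workflow DEMO_PROFILE {
    take:
    samples

    main:
    if (params.single_end) {
        // Single-end data: filter only, skipping merge and join.
        reads = DEMO_FILTER_SINGLE(samples)
    } else {
        // Paired-end data: filter, then merge and join the pairs.
        reads = DEMO_MERGE_JOIN(DEMO_FILTER_PAIRED(samples))
    }

    emit:
    reads
}

workflow {
    DEMO_PROFILE(Channel.of("sampleA", "sampleB"))
    DEMO_PROFILE.out.reads.view()
}
```

Saved as, say, sketch.nf (a hypothetical filename), this would run the paired-end branch by default and the single-end branch when invoked with `--single_end`. Both branches emit the same channel shape, so downstream steps need no branching of their own.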

Testing

I validated the pipeline changes and compared single vs paired-end results in this notebook: https://data.securebio.org/simons-notebook/posts/2024-10-28-mgs-taxonomy-eval/

I find that single-read results look very similar to paired-end results. The paired-end results of run_dev_se.nf look the same as the results of run.nf, suggesting that the inclusion of single-read functionality doesn't negatively impact paired-end analysis.

@simonleandergrimm changed the title from "Single read profile" to "Adding single-read functionality to PROFILE" Nov 7, 2024
Contributor

@simonleandergrimm any particular reason you're using BBDUK rather than BBDUK_HITS here? The main pipeline uses the latter.

subworkflows/local/taxonomy/main.nf (resolved)
@harmonbhasin (Collaborator) left a comment

Fix these things. Also, your tests are failing; I think it has to do with the taxonomy workflow.

@@ -6,11 +6,17 @@
| MODULES AND SUBWORKFLOWS |
***************************/

include { BBMERGE } from "../../../modules/local/bbmerge"
include { SUMMARIZE_BBMERGE } from "../../../modules/local/summarizeBBMerge"
if (params.single_end) {
Collaborator

@simonleandergrimm I chatted with Will; he said it's okay to remove the if statement (we previously had it only because Nextflow would throw warnings, but those have been removed now). Please update this file accordingly (i.e. remove the if statement).

Contributor

Is this supposed to be an empty file? Why is it here?

@@ -7,7 +7,9 @@ params {

// Directories
base_dir = "s3://nao-mgs-wb/test-batch" // Parent for working and output directories (can be S3)
ref_dir = "s3://nao-mgs-wb/index-20241113/output" // Reference/index directory (generated by index workflow)

ref_dir = "s3://nao-mgs-wb/index-20241113/output" // Reference/index directory (generated by index workflow)
Contributor

We have the new index now

@willbradshaw (Contributor)

@simonleandergrimm @harmonbhasin This would be a good time to update the CHANGELOG.

@simonleandergrimm deleted the single-read-profile branch December 18, 2024 13:21
@simonleandergrimm restored the single-read-profile branch December 18, 2024 13:26
Base automatically changed from single-read-raw-clean to dev December 20, 2024 13:55
@simonleandergrimm (Collaborator, Author)

@willbradshaw Made the requested changes, updated the changelog. I think this is good to go in.

@willbradshaw (Contributor)

@harmonbhasin are you happy for this to get merged? I don't see anything blocking on my end.

@harmonbhasin (Collaborator)

@willbradshaw feel free to merge, didn't realize my changes were still requested, apologies for that!

@simonleandergrimm (Collaborator, Author) commented Dec 23, 2024 via email

@willbradshaw merged commit f5ac614 into dev Jan 6, 2025
5 checks passed
@willbradshaw deleted the single-read-profile branch January 6, 2025 13:30
3 participants