Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update Simpleaf modules, subworkflow #424

Merged
merged 43 commits into from
Feb 27, 2025
Merged

update Simpleaf modules, subworkflow #424

merged 43 commits into from
Feb 27, 2025

Conversation

DongzeHE
Copy link
Member

@DongzeHE DongzeHE commented Jan 22, 2025

Reopen #361 after updating simpleaf central modules. See this PR. I have tested using a 10x 500 dataset. Once the modules' PR is merged, we can start merging this PR

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/scrnaseq branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@DongzeHE DongzeHE requested a review from fmalmeida January 22, 2025 17:41
@nf-core-bot
Copy link
Member

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.1.1.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

@DongzeHE DongzeHE requested a review from grst January 22, 2025 17:43
Copy link
Member

@grst grst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few minor things

@an-altosian
Copy link
Contributor

Hi @grst ,

I think I am pretty happy with the code now. Interestingly, although I did not touch the code for other aligners, all CI tests except those for simpleaf failed.

I tested my changes locally and everything worked. We can discuss linting and the output structure now.

The current output layout is as the following. The biggest change is I removed the alevinqc folder and exported the alevinqc report to the directory of each sample.

`-- simpleaf
    |-- Sample_X
    |   |-- simpleaf_qc_report_Sample_X.html
    |   |-- simpleaf_quant
    |   |   |-- af_map
    |   |   |   |-- map.rad
    |   |   |   |-- map_info.json
    |   |   |   `-- unmapped_bc_count.bin
    |   |   `-- af_quant
    |   |       |-- alevin
    |   |       |   |-- quants.h5ad
    |   |       |   |-- quants_mat.mtx
    |   |       |   |-- quants_mat_cols.txt
    |   |       |   `-- quants_mat_rows.txt
    |   |       |-- collate.json
    |   |       |-- featureDump.txt
    |   |       |-- generate_permit_list.json
    |   |       |-- map.collated.rad
    |   |       |-- permit_freq.bin
    |   |       |-- permit_map.bin
    |   |       |-- quant.json
    |   |       `-- unmapped_bc_count_collated.bin
    |   `-- versions.yml
    |-- Sample_Y
    |   |-- simpleaf_qc_report_Sample_Y.html
    |   |-- simpleaf_quant
    |   |   |-- af_map
    |   |   |   |-- map.rad
    |   |   |   |-- map_info.json
    |   |   |   `-- unmapped_bc_count.bin
    |   |   `-- af_quant
    |   |       |-- alevin
    |   |       |   |-- quants.h5ad
    |   |       |   |-- quants_mat.mtx
    |   |       |   |-- quants_mat_cols.txt
    |   |       |   `-- quants_mat_rows.txt
    |   |       |-- collate.json
    |   |       |-- featureDump.txt
    |   |       |-- generate_permit_list.json
    |   |       |-- map.collated.rad
    |   |       |-- permit_freq.bin
    |   |       |-- permit_map.bin
    |   |       |-- quant.json
    |   |       `-- unmapped_bc_count_collated.bin
    |   `-- versions.yml
    `-- mtx_conversions
        |-- Sample_X
        |   |-- Sample_X_raw_matrix.h5ad
        |   |-- Sample_X_raw_matrix.sce.rds
        |   `-- Sample_X_raw_matrix.seurat.rds
        |-- Sample_Y
        |   |-- Sample_Y_raw_matrix.h5ad
        |   |-- Sample_Y_raw_matrix.sce.rds
        |   `-- Sample_Y_raw_matrix.seurat.rds
        |-- combined_raw_matrix.h5ad
        |-- combined_raw_matrix.sce.rds
        `-- combined_raw_matrix.seurat.rds

Please let me know what you think! We are getting close!

@an-altosian
Copy link
Contributor

I think I addressed all your comments. As this is my first PR to scrnaseq and I made many major changes, before we merge it, can we invite more reviewers? It will be great if other maintainers can go through the changes, especially the document part.

Thanks,
Dongze

@grst
Copy link
Member

grst commented Feb 11, 2025

Thanks for the updates!
Sure, let's first try @nictru!

@grst grst requested a review from nictru February 11, 2025 07:17
@fmalmeida
Copy link
Contributor

I cannot promise, but I can try to find some time to review.
And, many thanks for the PR, @DongzeHE

Copy link
Contributor

@fmalmeida fmalmeida left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi there,
Thanks for the work on it.

I have added a few sincere questions and some comments for changes (if y'all agree) :)

{assert new File( "${outputDir}/results_simpleaf/simpleaf/Sample_X/simpleaf_quant/af_quant/alevin/quants_mat.mtx" ).exists()},
{assert new File( "${outputDir}/results_simpleaf/simpleaf/Sample_X/simpleaf_quant/af_quant/alevin/quants.h5ad" ).exists()},
{assert new File( "${outputDir}/results_simpleaf/simpleaf/mtx_conversions/Sample_Y/Sample_Y_raw_matrix.h5ad" ).exists()},
{assert new File( "${outputDir}/results_simpleaf/simpleaf/Sample_Y/simpleaf_quant/af_quant/alevin/quants_mat_cols.txt" ).exists()},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is any of these alevin files now possible to have as snaps with the new version?
I remember before they could vary the sorting but not sure about the newest version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For mtx and its column and row names, unfortunately they can still vary because of parallelization. For the h5ad file, what I can do is sorting it in the mtx_to_h5ad module to make the order fixed. Do you think it is necessary?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Generally, I do not think is necessary. But, if it would be possible to have it being part of the snaps, it would surely add robustness.

@grst , do you think this should be here or is the PR already big enough and better to have another?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean having a snapshot is obviously better than not having one, but I won't insist on it.

Copy link
Contributor

@nictru nictru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job in general - just a few remarks :)

@an-altosian
Copy link
Contributor

OK I think I have address all comments but two:

  1. Exposing other cell filtering strategies: thread
  2. testing: thread

For 1, I can add two parameters and implement the logic, not a big deal.

For 2, I realized that I did not change the test file name so simpleaf was not tested. I ran tests locally, everything worked well, but a weird bug jumped out in GitHub Actions:

Local test log

nf-test test tests/main_pipeline_simpleaf.nf.test --ci

? nf-test 0.9.2
https://www.nf-test.com
(c) 2021 - 2024 Lukas Forer and Sebastian Schoenherr

nf-test runs in CI mode.

Test Workflow main.nf

  Test [bd8e613a] 'test-dataset_simpleaf_aligner' PASSED (124.819s)


SUCCESS: Executed 1 tests in 125.814s

GitHub Actions workflow error:
https://github.com/nf-core/scrnaseq/actions/runs/13277650102/job/37070051243?pr=424#step:9:199

I could not reproduce this error locally. Any suggestions how I can address it? Could you download the artifact of the failed job so that I can jump into it?

@an-altosian
Copy link
Contributor

an-altosian commented Feb 12, 2025

So it turns out that the error comes from this line in simpleaf, caused by the the internal mtx to h5ad conversion (this line) where pola-rs encountered an empty featureDump.txt (this line).

It is strange because this file, generated by alevin-fry in this line, should at least have its header. The logic here is, Simpleaf first asks alevin-fry to generate this file, then read this file and add it into the h5ad output. So, this file should definitely be there.

In some runs, this file was there but I got different md5sums of sorted h5ad files. This also doesn't make sense because, although the columns and rows can swap, the counts of a specific gene in a specific cell should be consistent.

@rob-p - Do you have any idea what happened here?

@fmalmeida
Copy link
Contributor

Super. I do not think I have any other comment, besides the one in the nf-tests which would be good to have but not necessary.
So I am now approving.
Thanks for the great work here 😄

@an-altosian
Copy link
Contributor

Hi @grst I could not figure out how to make my PR pass the CI workflow. As I posted on nf-core Slack channels and my comments above, I think it is a bug of Nextflow + GitHub Actions. Do you have any idea how we can trace the bug down?

@apeltzer
Copy link
Member

I reran the entire thing now multiple times locally and couldnt find any issue with it. Thus merging this, blame me later for that.

@apeltzer apeltzer merged commit 9ac08c0 into nf-core:dev Feb 27, 2025
22 of 27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants