Multimapping reads dropped, resulting in empty output files when using GRCh37 #77

Open
nick-phillips opened this issue Apr 2, 2024 · 0 comments
Labels
bug Something isn't working


Description of the bug

I ran the pipeline with genome GRCh37 and found no integration sites in my samples, despite a large number of reads aligning to the virus. Tracing the intermediate output, I saw that the insertion_site_candidates.nf module wrote empty files.

I believe the issue occurs here:

(re.match("chr[\dMXY]+$", row["chrA"]) is not None) ^ (re.match("chr[\dMXY]+$", row["chrB"]) is not None)

With --genome GRCh37, the Ensembl reference contigs are named without the chr prefix (for example, 1 rather than chr1), so neither side of the XOR matches and every row is dropped. I am rerunning now with GRCh38 to verify that I can detect integration sites.
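To illustrate the failure, here is a minimal sketch that reproduces the filter logic outside the pipeline. The function name is_human_virus_pair, the example rows, the viral contig name HPV16, and the relaxed pattern PREFIX_OPTIONAL are my own assumptions for demonstration; this is not the pipeline's code and not a proposed final fix.

```python
import re

# Pattern quoted above: only matches UCSC-style names such as "chr1" or "chrX".
UCSC_ONLY = re.compile(r"chr[\dMXY]+$")

# Hypothetical relaxed pattern (an assumption, not the pipeline's fix):
# the "chr" prefix is optional, and "MT" is accepted as the Ensembl
# name for the mitochondrial contig.
PREFIX_OPTIONAL = re.compile(r"(?:chr)?(?:\d+|X|Y|M|MT)$")


def is_human_virus_pair(row, pattern):
    """Return True when exactly one of chrA/chrB looks like a human contig (XOR)."""
    a_is_human = pattern.match(row["chrA"]) is not None
    b_is_human = pattern.match(row["chrB"]) is not None
    return a_is_human ^ b_is_human


# Made-up rows for illustration; "HPV16" stands in for any viral contig name.
ucsc_row = {"chrA": "chr8", "chrB": "HPV16"}
ensembl_row = {"chrA": "8", "chrB": "HPV16"}

print(is_human_virus_pair(ucsc_row, UCSC_ONLY))           # kept with UCSC names
print(is_human_virus_pair(ensembl_row, UCSC_ONLY))        # dropped with Ensembl names
print(is_human_virus_pair(ensembl_row, PREFIX_OPTIONAL))  # kept with the relaxed pattern
```

One caveat with the relaxed pattern: a viral contig whose name is purely numeric would be classified as human, so a real fix would probably need to take the contig names from the reference itself rather than rely on a regex.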

Command used and terminal output

$ nextflow run nf-core/viralintegration --input <sample_sheet.csv> --outdir <output_path> --genome GRCh37 -profile singularity

Relevant files

No response

System information

No response

nick-phillips added the bug label Apr 2, 2024