[bam_sort_core] merging from 40 files and 20 in-memory blocks. Error running the pipeline #108

Open
cuilina2019 opened this issue Dec 19, 2024 · 6 comments

Comments

@cuilina2019

Dear @jfnavarro,

I also encounter an error when running the pipeline. I changed the HTSeq version, but it doesn't help.

The reported error:

[bam_sort_core] merging from 40 files and 20 in-memory blocks...
Error running the pipeline
returned a result with an error set

The last several lines of the log file:
INFO:STPipeline:# Unmatched: 152319 [0.1330010938608468%]
INFO:STPipeline:Starting annotation 2024-12-19 11:18:42.502306
ERROR:STPipeline:Error during annotation. HTSEQ execution failed

@jfnavarro
Owner

How big is your input data? What are the memory specs of the machine where you are running the pipeline?

@cuilina2019
Author

cuilina2019 commented Dec 21, 2024

The input data consists of two FASTQ files: R1 is 7.4G and R2 is 2.2G. We executed the pipeline using 20 threads on our HPC platform. Could memory have caused the error?

The following is the command we ran:
st_pipeline_run.py --output-folder output/ --ids combine_barcode.round2round1_index1_index2.v6_big_25to97_9to97.txt --ref-map STAR_index --ref-annotation Homo_sapiens.GRCh38.112.chr.gtf --expName test --htseq-no-ambiguous --verbose --log-file log.txt --demultiplexing-kmer 5 --threads 20 --temp-folder /output/tmp/ --no-clean-up --umi-start-position 16 --umi-end-position 26 --demultiplexing-overhang 0 --min-length-qual-trimming 20 R2_2.extract.fq.gz R2_1.extract.fq.gz

@jfnavarro
Owner

It's probably related to an incompatibility between the pysam and htseq packages. I need to find some time to update requirements.txt with a range of versions that are compatible and known to work. What versions of pysam and htseq is your environment using?

@cuilina2019
Author

pysam 0.22.1
HTSeq 2.0.9

@jfnavarro
Owner

Try older versions of both, like pysam < 0.15 and htseq < 0.14. Also, how much memory do you allocate on the node where you run the pipeline?
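
For anyone following along, the bounds suggested above can be checked against the active environment with a short script. This is only a sketch: the helper names (`parse_version`, `is_older_than`, `report`) are made up for illustration, the version parser is deliberately naive (it ignores suffixes such as `rc1`), and it assumes the packages are registered under the distribution names `pysam` and `HTSeq`.

```python
from importlib.metadata import PackageNotFoundError, version

# Upper bounds suggested in this thread (exclusive). Assumption: the
# distribution names are "pysam" and "HTSeq" in your environment.
SUGGESTED_UPPER_BOUNDS = {"pysam": "0.15", "HTSeq": "0.14"}


def parse_version(text):
    """Turn '0.22.1' into (0, 22, 1); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_older_than(installed, bound):
    """Return True if the installed version is strictly below the bound."""
    return parse_version(installed) < parse_version(bound)


def report():
    """Print, for each package, whether it falls inside the suggested range."""
    for dist, bound in SUGGESTED_UPPER_BOUNDS.items():
        try:
            installed = version(dist)
        except PackageNotFoundError:
            print(f"{dist}: not installed")
            continue
        status = "within" if is_older_than(installed, bound) else "outside"
        print(f"{dist} {installed}: {status} the suggested range (< {bound})")


if __name__ == "__main__":
    report()
```

With the versions reported earlier in the thread (pysam 0.22.1, HTSeq 2.0.9), both packages would be flagged as outside the suggested range.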

@cuilina2019
Author

Thank you for the suggestion. We will try older versions of both packages, specifically pysam < 0.15 and htseq < 0.14. Regarding memory, we typically allocate 60GB on the node where we run the pipeline.
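
To help rule memory in or out, one option is to wrap the run in a small script that records the peak resident memory of the child processes. This is a rough sketch, not part of the pipeline: `run_and_report_peak_memory` is a hypothetical helper, it relies on the Unix-only `resource` module, and `ru_maxrss` reflects the largest single waited-for child process rather than the sum over all of them (the value is in kilobytes on Linux and bytes on macOS).

```python
import resource
import subprocess
import sys


def run_and_report_peak_memory(cmd):
    """Run cmd to completion and return (exit_code, peak_rss_of_children).

    peak_rss_of_children is ru_maxrss from getrusage(RUSAGE_CHILDREN):
    the peak resident set size of the largest child that has been waited
    for. Units are kilobytes on Linux and bytes on macOS.
    """
    completed = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return completed.returncode, usage.ru_maxrss


if __name__ == "__main__":
    # Demo with a trivial command; replace the list with the full
    # st_pipeline_run.py invocation (one list element per argument).
    exit_code, peak = run_and_report_peak_memory([sys.executable, "-c", "pass"])
    print(f"exit code: {exit_code}, peak RSS of largest child: {peak}")
```

If the reported peak comes close to the 60GB allocation, memory is a plausible cause; if it stays far below, the package incompatibility is the more likely explanation.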
