Why subset-bam is not efficient for splitting BAM file based on barcodes #15

Open
kulansam opened this issue Mar 1, 2022 · 1 comment

kulansam commented Mar 1, 2022

Hi,

Thanks for developing subset-bam. I would like to split the BAM file (from Cell Ranger) into one BAM per cell barcode, using the barcodes provided in the filtered feature matrix folder (barcode.tsv). I run the following command in a for loop, but it takes more than 6 days for around 4000K cells, even with multi-threading.

subset-bam_linux --bam filtered_barcodes_sorted.bam --cell-barcodes $line.tsv --cores 15 --out-bam ./filter_cell_individual_bam/$line.bam

Is there any way to speed up this process?

limin321 commented May 5, 2022

What I did was split barcode.tsv into many text files, one barcode per file. You can then run subset-bam on each of those files as a batch.
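
A rough sketch of that workflow, in case it helps. The function names (split_barcodes, print_jobs), the directory layout, and the choice of --cores 1 per job are my own assumptions for illustration, not something from subset-bam itself; only the flags already shown in the command above are used. It also assumes barcode.tsv has one barcode per line and that paths contain no spaces.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and paths are hypothetical.

# Write one file per barcode: <out_dir>/<barcode>.tsv, each holding that barcode.
split_barcodes() {
    local barcodes_file=$1 out_dir=$2 barcode
    mkdir -p "$out_dir"
    while IFS= read -r barcode || [ -n "$barcode" ]; do
        [ -z "$barcode" ] && continue   # skip blank lines
        printf '%s\n' "$barcode" > "$out_dir/$barcode.tsv"
    done < "$barcodes_file"
}

# Print one subset-bam command per barcode file, so the list can be handed
# to whatever batch mechanism is available (xargs, a cluster scheduler, ...).
print_jobs() {
    local bam=$1 barcode_dir=$2 out_dir=$3 f name
    for f in "$barcode_dir"/*.tsv; do
        [ -e "$f" ] || continue         # no barcode files found
        name=$(basename "$f" .tsv)
        printf 'subset-bam_linux --bam %s --cell-barcodes %s --cores 1 --out-bam %s/%s.bam\n' \
            "$bam" "$f" "$out_dir" "$name"
    done
}

# Example usage (not executed here):
#   split_barcodes barcode.tsv ./barcodes
#   mkdir -p ./filter_cell_individual_bam
#   print_jobs filtered_barcodes_sorted.bam ./barcodes ./filter_cell_individual_bam > jobs.txt
#   xargs -P 15 -I{} sh -c '{}' < jobs.txt    # run up to 15 jobs at once
```

Writing the commands to a jobs file first keeps the batch step separate, so the same list can be fed to xargs locally or turned into an array job on a cluster.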
