All mapped reads flagged as "too short" #396

Open
jacopoM28 opened this issue Jan 21, 2025 · 0 comments

Dear all,

I am having an issue with the TADbit tools: all mapped reads are flagged as both "too close from RES" and "too short". I have two Hi-C libraries generated with Arima-HiC (150 bp paired-end reads), which I mapped separately in iterative mode without specifying a restriction enzyme, using the following commands:

$tadbit map --fastq "$FASTQ.1_R1" --index ../qmProCava1.cleaned.HiC_ProCav2.curated.1.primary.curated.gem --read 1 \
--renz NONE -C 40 --windows 1:15 1:20 1:25 1:30 1:35 1:40 1:45 1:50 1:55 1:60 1:65 1:70 1:75 -w Large --iterative

$tadbit map --fastq "$FASTQ.1_R2" --index ../qmProCava1.cleaned.HiC_ProCav2.curated.1.primary.curated.gem --read 2 \
--renz NONE -C 40 --windows 1:15 1:20 1:25 1:30 1:35 1:40 1:45 1:50 1:55 1:60 1:65 1:70 1:75 -w Large --iterative

$tadbit map --fastq "$FASTQ.2_R1" --index ../qmProCava1.cleaned.HiC_ProCav2.curated.1.primary.curated.gem --read 1 \
--renz NONE -C 40 --windows 1:15 1:20 1:25 1:30 1:35 1:40 1:45 1:50 1:55 1:60 1:65 1:70 1:75 -w Large --iterative

$tadbit map --fastq "$FASTQ.2_R2" --index ../qmProCava1.cleaned.HiC_ProCav2.curated.1.primary.curated.gem --read 2 \
--renz NONE -C 40 --windows 1:15 1:20 1:25 1:30 1:35 1:40 1:45 1:50 1:55 1:60 1:65 1:70 1:75 -w Large --iterative
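
The four mapping commands above differ only in the library number and the read number, so they can be expressed as one loop. The following is a minimal dry-run sketch, assuming the leading `$` in the commands is a shell prompt and that `FASTQ` holds the common file prefix; the default value `reads` is a hypothetical placeholder, and the function name `map_commands` is introduced here only for illustration.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the four mapping commands instead of running them.
# FASTQ mirrors the variable used in the commands above; "reads" is a
# hypothetical placeholder default, not a real path.
FASTQ="${FASTQ:-reads}"
INDEX="../qmProCava1.cleaned.HiC_ProCav2.curated.1.primary.curated.gem"
WINDOWS="1:15 1:20 1:25 1:30 1:35 1:40 1:45 1:50 1:55 1:60 1:65 1:70 1:75"

map_commands() {
  local lib read
  for lib in 1 2; do        # the two Arima-HiC libraries
    for read in 1 2; do     # R1 and R2 are mapped separately
      # WINDOWS is left unquoted on purpose so each window is its own argument.
      echo tadbit map --fastq "${FASTQ}.${lib}_R${read}" --index "$INDEX" \
        --read "$read" --renz NONE -C 40 --windows $WINDOWS \
        -w Large --iterative
    done
  done
}

map_commands
```

Removing the `echo` would run the commands rather than print them; all four runs write to the same `Large` working directory, as in the original commands.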

After this, I merged all files with:

$tadbit parse -w Large/ --genome ../qmProCava1.cleaned.HiC_ProCav2.curated.1.primary.curated.fa

Finally, I filtered the reads with:

$tadbit filter -w Large/ -C 10 --apply 1 2 3 4 6 7 9 10
Getting intersection between read 1 and read 2
Get insert size...
  - median insert size = 356.0
  - double median absolution of insert size = 87.0
  - max insert size (when a gap in continuity of > 10 bp is found in fragment lengths) = 1356
   Using the maximum continuous fragment size(1356 bp) to check for pseudo-dangling ends
   Using maximum continuous fragment size plus the MAD (1443 bp) to check for random breaks
identify pairs to filter...
Filtered reads (and percentage of total):

                   Mapped both  :  103,322,115 (100.00%)
  -----------------------------------------------------
   1-               self-circle :    5,101,974 (  4.94%)
   2-              dangling-end :   27,421,527 ( 26.54%)
   3-                     error :   10,421,255 ( 10.09%)
   4-        extra dangling-end :            0 (  0.00%)
   5-        too close from RES :  103,322,116 (100.00%)
   6-                 too short :  103,322,116 (100.00%)
   7-                 too large :            0 (  0.00%)
   8-          over-represented :   74,076,173 ( 71.69%)
   9-                duplicated :   45,211,936 ( 43.76%)
  10-             random breaks :            0 (  0.00%)
    saving to file 0 reads without.

As the TADbit log shows, every read pair is flagged as both "too close from RES" and "too short", and no reads are saved. Is there something wrong with my commands, or am I missing something?
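
To cross-check the log, the filter table can be compared against the `--apply` list. The sketch below is a hedged illustration, not part of TADbit: it reads the table rows copied from the log above and prints the number of each applied filter that flags 100.00% of the pairs. The function names `full_filters` and `applied_full_filters` are hypothetical helpers.

```shell
#!/usr/bin/env bash
# Sketch: list the applied filters that flag 100.00% of the read pairs.
# APPLIED mirrors the --apply list from the filter command above.
APPLIED="1 2 3 4 6 7 9 10"

full_filters() {
  # Print the number of every table row whose percentage is exactly 100.00%.
  awk '/^ *[0-9]+-/ && /\(100\.00%\)/ { sub(/-$/, "", $1); print $1 }'
}

applied_full_filters() {
  local n a
  full_filters | while read -r n; do
    for a in $APPLIED; do
      # Keep only filters that were actually applied.
      if [ "$n" = "$a" ]; then echo "$n"; fi
    done
  done
}

# Table rows copied verbatim from the log above.
applied_full_filters <<'EOF'
   1-               self-circle :    5,101,974 (  4.94%)
   2-              dangling-end :   27,421,527 ( 26.54%)
   3-                     error :   10,421,255 ( 10.09%)
   4-        extra dangling-end :            0 (  0.00%)
   5-        too close from RES :  103,322,116 (100.00%)
   6-                 too short :  103,322,116 (100.00%)
   7-                 too large :            0 (  0.00%)
   8-          over-represented :   74,076,173 ( 71.69%)
   9-                duplicated :   45,211,936 ( 43.76%)
  10-             random breaks :            0 (  0.00%)
EOF
```

On this table the sketch prints `6`: filter 5 also flags every pair but is not in the `--apply` list, while filter 6 ("too short") is applied, which is consistent with the final "saving to file 0 reads" line.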

Thanks in advance for your support!

All the best,
Jacopo
