-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allele Balance #14
Comments
ADf is a comma separated list of the read counts for each allele on the forward strand. You can sum up the ADf, ADr and ADb values for each allele to get the allele balance. |
Thank you. I am still confused about some of the results, as often they seem to not add up. Take the examples below that I retrieved from a representative marker: FORMAT: GT:GQ:DP:ADf:ADr:ADb SAMPLE 1: 0/0/0/1/1:0:18:2,3:4,1:0,0 It seems like the total depth is 18 and 25, respectively. The sums of the ADf, ADr and ADb fields, however, are not equal to the total depth. I appreciate your attention. |
The total read depth (DP) includes all reads. Some of these may be filtered out for the calculation of ADf, ADr and ADb which are used for variant calling. Update: this is not correct |
Dear Bansal, Then, I did not understand why the DP definition from CRISP VCF output says: "filtered reads only"? This definition generates all the misunderstandings about CRISP output. Thank you! |
I have checked the code and the above definition is correct. DP includes counts for all alleles at the site so it can be greater than the counts for the two alleles output in the VCF. For example, a site with a ref allele = 'A' and variant allele = 'C' can have reads supporting 'G' or 'T'. These reads are included in the DP count. |
Dear Bansal, I checked the BAM files, and your argument about other reads supporting other possible alleles is not valid (see attached the screenshot from IGV). This a clear SNP with only reads supporting REF=C and ALT=T, no reads are supporting the A or G. Therefore, the DP field from CRISP output is showing the total reads depth without filtering. For the ADf, ADr and ADb there are some filters applied in the counts. SAMPLE 1SAMPLE 2When compare the CRISP and FreeBayes outputs, the FreeBayes is giving the correct number of DP and AD fields without filtering. CRISP output1 19545 Chla_rei_1_19545 C T 3470 PASS FreeBayes output1 19545 Chla_rei_1_19545 C T 12110.9 . I want to use the CRISP output to run SNPGenie software. To be able to do this, I need to double-check with you about the DP field and all the ADf, ADr and ADb. Based on my interpretation of CRISP output, I will need to create a code to sum all these fields to replace the DP and AD fields. Then, for the first sample, it will be something like this: The update file will be: Can you confirm if this is correct? Thank you very much for this help. |
Yes, you are correct. I will have to update the code to output the correct DP value. |
Dear Vikas Bansal,
I am trying to estimate the allele balance for each sample genotyped with CRISP. Is there a way to obtain that information from the output?
For example, from a total of 100 reads supporting the call, a heterozygous sample might have 80 reads to the reference allele and 20 reads to the alternative allele.
From the FORMAT field, I see the following options: GT:GQ:DP:ADf:ADr:ADb. The total depth seems to be included, but it is not clear to me how to obtain how many of these reads were relative to each allele.
Thank you!
The text was updated successfully, but these errors were encountered: