Replies: 3 comments
-
And the number of variants in View Gene Breakdown should also be a non-redundant count.
-
This is not a bug. It is a misunderstanding of how seqr counts compound hets, and I think a fundamental misunderstanding of what a compound het actually is. We count each compound het pair as one "variant" because the pair is the unit of analysis. Assuming this is the search you intended to do, this is not a situation where you should be looking at the variants individually. The fact that these pairs may or may not share underlying variants does not change the total count. We export the duplicates because, in a compound het analysis, you really should be considering the variants as pairs and not individually. Each variant still gets its own row because that's how downloads work: there's no way to put the rows side by side in a file or visually group them. Similarly, for the gene breakdown we count what can be considered for each gene, so each pair counts 2 toward the gene even though technically that's redundant.
-
Thanks for your explanation @hanars! It seems I misunderstood seqr's use of the term "variant" and incorrectly assumed it meant "nucleotide sequence variation" in the same sense as e.g. dbSNP. However, I still find it a little confusing for the term "variant" to mean one thing on the search results page and another in the Gene Breakdown. For example, my search shows "Found 2 variants" on the results page (referring to the number of compound het pairs), while the Gene Breakdown table says #Variants is 4 (referring to the number of underlying variants).
-
Describe the bug
The variant count in search results seems to have a problem with counting comp het pairs, a bit like in #2539.
To illustrate: using the Seqr_Demo project, run a variant search specifying i) all families, ii) recessive restrictive inheritance and iii) location gene TTN. The results say "Found 2 variants". Inspection shows there are two comp het pairs in the results, each of which includes the variant 2-178730621-T-A. Downloading the results in xlsx format gives a spreadsheet with 4 data rows, two of which are duplicates, i.e. there are actually only three distinct variants. So is the number of variants in this result 2, 3 or 4? I think of the result as 2 diplotypes, comprising 3 distinct variants. I suggest that the results page summary ("Found 2 variants") be reworded and that the content of the spreadsheet/tabular downloads be non-redundant.
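To make the three candidate numbers concrete, here is a minimal Python sketch of the counting. It is not seqr's code, only an illustration. The shared variant ID is the one from the search above; the two partner IDs and the `count_summary` helper are hypothetical placeholders.

```python
# Illustrative only: this does not reflect how seqr is implemented.
# Only SHARED is a real variant ID from the report; the partners are placeholders.
SHARED = "2-178730621-T-A"
pairs = [
    (SHARED, "placeholder-variant-B"),  # comp het pair 1
    (SHARED, "placeholder-variant-C"),  # comp het pair 2
]


def count_summary(pairs):
    """Return the three ways of counting a list of comp het pairs."""
    # One row per variant per pair, as in the tabular download.
    rows = [variant for pair in pairs for variant in pair]
    return {
        "pairs": len(pairs),                  # the "Found N variants" figure
        "export_rows": len(rows),             # rows in the download
        "distinct_variants": len(set(rows)),  # non-redundant variant count
    }


summary = count_summary(pairs)
print(summary)
# {'pairs': 2, 'export_rows': 4, 'distinct_variants': 3}
```

Here `pairs` matches the figure on the results page, `export_rows` matches the number of spreadsheet rows, and `distinct_variants` is the non-redundant count suggested in this report.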
Link to page(s) where bug is occurring
https://seqr.broadinstitute.org/variant_search/results/a630381d02eda1f98211e13fddc171e7?page=1&sort=xpos