-
Notifications
You must be signed in to change notification settings - Fork 371
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrong filtering in Alignment metrics #943
base: master
Are you sure you want to change the base?
Wrong filtering in Alignment metrics #943
Conversation
@nh13, I think that you are responsible for these NB's....will have time to take a look at these changes? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch! Some discussion about getSupplementaryAlignmentFlag
is still needed.
@@ -270,7 +268,8 @@ private void collectQualityData(final SAMRecord record, final ReferenceSequence | |||
// NB: for read count metrics, do not include supplementary records, but for base count metrics, do include supplementary records. | |||
|
|||
// If the read isn't an aligned PF read then look at the read for no-calls | |||
if (record.getReadUnmappedFlag() || record.getReadFailsVendorQualityCheckFlag() || !doRefMetrics) { | |||
if (!record.getSupplementaryAlignmentFlag() && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this check added getSupplementaryAlignmentFlag
? I thought we wanted to have supplementary alignments add to these metrics (ex. PF_HQ_MEDIAN_MISMATCHES
)? @yfarjoun want to weigh in here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the idea is that a supplementary read should not be considered for the first part (namely updating badCycles histogram on unmapped reads) but, it should be considered for the "else" part....I agree with that sentiment, but I also agree with you that it wasn't entirely well thought out, in particular: a supplementary alignment will cause the code to enter the second part even all the other conditions (mapped, passesFilter, refMetrics) are false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, changed if-else condition, sorry for this omission.
@@ -314,7 +313,8 @@ else if (!record.getReadFailsVendorQualityCheckFlag()) { | |||
if (mismatch) hqMismatchCount++; | |||
} | |||
|
|||
if (mismatch || SequenceUtil.isNoCall(readBases[readBaseIndex])) { | |||
if (!record.getSupplementaryAlignmentFlag() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, we can leave this "if" condition for supplementary reads (according to yfarjoun comment)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this is right.. @nh13 what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah no. sorry. this is more complex. Given that we are only looping over the alignment blocks, we need to look at the supplemental reads in order to see the bases....I guess, it's as @nh13 said in the code comment: for metrics that have to do with bases (so the badCycleHistogram is one), we want to see the supplemental reads, but for metrics that have to do with reads, we do not. I think that this means that supplemental reads should be filtered from collectReadData (if supplemental, return) but not from collectQualityData.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for clarifying this! Okay, so I want to make sure that I understand right.. :) I can remove all filtering for supplementary from collectQualityData() , and leave check for secondary in acceptRecord() - this will be necessary logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's right. @nh13 ?
@@ -79,9 +79,7 @@ public AlignmentSummaryMetricsCollector(final Set<MetricAccumulationLevel> accum | |||
|
|||
@Override | |||
public void acceptRecord(final SAMRecord rec, final ReferenceSequence ref) { | |||
if (!rec.isSecondaryOrSupplementary()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to include secondary reads as well? I think not. we should probably filter secondary reads here. Even if they are filtered upstream somehow (I don't think that they are) this program should be clear that secondary alignements are filtered out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, added check for secondary read here.
@@ -270,7 +268,8 @@ private void collectQualityData(final SAMRecord record, final ReferenceSequence | |||
// NB: for read count metrics, do not include supplementary records, but for base count metrics, do include supplementary records. | |||
|
|||
// If the read isn't an aligned PF read then look at the read for no-calls | |||
if (record.getReadUnmappedFlag() || record.getReadFailsVendorQualityCheckFlag() || !doRefMetrics) { | |||
if (!record.getSupplementaryAlignmentFlag() && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the idea is that a supplementary read should not be considered for the first part (namely updating badCycles histogram on unmapped reads) but, it should be considered for the "else" part....I agree with that sentiment, but I also agree with you that it wasn't entirely well thought out, in particular: a supplementary alignment will cause the code to enter the second part even all the other conditions (mapped, passesFilter, refMetrics) are false.
* Fix if-else condition of supplementary reads
@SQ SN:chr6 LN:101 | ||
@SQ SN:chr8 LN:202 | ||
@RG ID:0 SM:Hi,Momma! LB:whatever PU:me PL:ILLUMINA | ||
SL-XAV:1:1:0:700#0/2 137 chr6 1 255 101M * 0 0 NAATTGTTCTNAGTTTCTCGGTTTATGTGCTCTTCCAGGTGGGTAACACAATAATGGCCTTCCAGATCGTAAGAGCGACGTGTGTTGCACNAGTGTCGATC &0::887::::6/646::838388811/679:87640&./2+/-4/28:3,536/4''&&.78/(/554/./02*)*',-(57()&.6(6:(0601'/(,* RG:Z:0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be good if this example were more realistic, so that the primary read were only aligned for, say 60 bases and the supplementary read would be aligned for the remaining 41 bases. also, the bases and the qualities should be the same for the read and its supplemental alignment....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@PolinaBevad are you going to add a more realistic test?
Description
Comment in AlignmentSummaryMetricsCollector method collectReadData() says: " // NB: for read count metrics, do not include supplementary records, but for base count metrics, do include supplementary records". But supplementary records are pre-filtered at super level in method acceptRecord() in the same class, because only non-secondary and non-supplementary records are accepted issue.
Implementation
According to the issue:
Checklist (never delete this)
Never delete this, it is our record that procedure was followed. If you find that for whatever reason one of the checklist points doesn't apply to your PR, you can leave it unchecked but please add an explanation below.
Content
Review
For more detailed guidelines, see https://github.com/broadinstitute/picard/wiki/Guidelines-for-pull-requests