From aca500cc32a02a192d86e95cd1e5fa4f38aa74a0 Mon Sep 17 00:00:00 2001
From: Daniel Cameron
Date: Thu, 4 Mar 2021 14:51:19 +1100
Subject: [PATCH 1/3] Added joint calling recommendation

https://github.com/hartwigmedical/hmftools/issues/128
---
 Readme.md | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/Readme.md b/Readme.md
index 7c2fc094..dc72a04b 100644
--- a/Readme.md
+++ b/Readme.md
@@ -165,6 +165,20 @@ At present, command line valiation is performed independently of which steps are
 
 Just specify multiple BAMs on the command line. GRIDSS will perform joint calling and provide a per-BAM breakdown of support.
 
+### Should I do joint calling or run each sample individually?
+
+Joint calling of related samples is strongly recommended for the following reasons:
+
+- GRIDSS performs joint assembly across all samples.
+Joint calling has higher coverage of shared variants thus resulting in more reliable assembly of that variant.
+Furthmore joint calling allows for sensivity detection of variants that are present subclonally (or at low coverage) that would not be detected if called individually.
+**Joint calling should always be used for related samples** (e.g. tumour/normal or trio calling). Joint calling will ensure that a common variant near the single-sample threshold of detection will be reliably reported as a shared variant. This is not the case if the calling were done individually. Note that this particular behaviour is not specific to GRIDSS and is common to all variant callers (hence the joint calling support in many callers).
+
+- GRIDSS performs joint variant calling across all samples.
+Determining whether two SV calls in two different VCFs are actually the same call is non-trivial.
+Imprecise calls are especially problematic since the coordinates may differ between the VCF, or a call may be precise in one VCF and not in the other.
+A good example of why reconciling SV calls is so problematic is the case where call A (chrX:1-99->chrY:1-99) overlaps call B (chrX:50-149->chrY:1-99), call B overlaps call C (chrX:100-199->chrY:1-99), but A does not overlap C at all. Joint calling obviates this step.
+
 ### How do I perform tumour/normal somatic variant calling?
 
 Jointly call on all samples from the patient.

From cf826b2d426c8a3f155e02692ca61d6851a7e462 Mon Sep 17 00:00:00 2001
From: Daniel Cameron
Date: Thu, 4 Mar 2021 14:54:54 +1100
Subject: [PATCH 2/3] Updated joint calling FAQ

---
 Readme.md | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/Readme.md b/Readme.md
index dc72a04b..1eaf228f 100644
--- a/Readme.md
+++ b/Readme.md
@@ -167,14 +167,15 @@ Just specify multiple BAMs on the command line. GRIDSS will perform joint callin
 
 ### Should I do joint calling or run each sample individually?
 
-Joint calling of related samples is strongly recommended for the following reasons:
+**Joint calling should always be used for related samples** (e.g. tumour/normal or trio calling).
+Joint calling will ensure that a common variant near the single-sample threshold of detection will be reliably reported as a shared variant.
+This is not the case if the calling were done individually.
+Note that this particular behaviour is not specific to GRIDSS and is common to all variant callers (hence the joint calling support in many callers).
 
-- GRIDSS performs joint assembly across all samples.
-Joint calling has higher coverage of shared variants thus resulting in more reliable assembly of that variant.
-Furthmore joint calling allows for sensivity detection of variants that are present subclonally (or at low coverage) that would not be detected if called individually.
-**Joint calling should always be used for related samples** (e.g. tumour/normal or trio calling). Joint calling will ensure that a common variant near the single-sample threshold of detection will be reliably reported as a shared variant. This is not the case if the calling were done individually. Note that this particular behaviour is not specific to GRIDSS and is common to all variant callers (hence the joint calling support in many callers).
+Joint calling allows for sensivity detection of variants that are present subclonally (or at low coverage) that would not be detected if called individually.
+
+GRIDSS performs joint assembly then reports a per-sample breakdown. Joint calling has higher coverage of shared variants thus resulting in more reliable assembly of that variant.
 
-- GRIDSS performs joint variant calling across all samples.
 Determining whether two SV calls in two different VCFs are actually the same call is non-trivial.
 Imprecise calls are especially problematic since the coordinates may differ between the VCF, or a call may be precise in one VCF and not in the other.
 A good example of why reconciling SV calls is so problematic is the case where call A (chrX:1-99->chrY:1-99) overlaps call B (chrX:50-149->chrY:1-99), call B overlaps call C (chrX:100-199->chrY:1-99), but A does not overlap C at all. Joint calling obviates this step.

From 4298c0fa0d52f56d5ef1066262413333497c2943 Mon Sep 17 00:00:00 2001
From: Daniel Cameron
Date: Fri, 5 Mar 2021 15:38:37 +1100
Subject: [PATCH 3/3] Update VIRUSBreakend_Readme.md

---
 VIRUSBreakend_Readme.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/VIRUSBreakend_Readme.md b/VIRUSBreakend_Readme.md
index f1ee5dbf..ee78727d 100644
--- a/VIRUSBreakend_Readme.md
+++ b/VIRUSBreakend_Readme.md
@@ -109,6 +109,7 @@ Each viral integration should have 2 integration breakpoints (one for the start,
 The key differentiator of VIRUSBreakend is the ability to detect and classify integration sites in repetative sequences such as centromeres.
 Due to the repetative nature of these region, such integration sites cannot be unambigously placed in the host genome.
 In such cases, the mapq encoded in the `BEALN` field will be 0 and the field may contain multiple candidicate integration sites.
+Integration sites in which the reported position is ambiguous have a `LOW_MAPQ` FILTER applied.
 The `INSRM` field contains the repeat sequences identifed in the integration site host sequences.
 These annotations can be used to classify ambigous integration sites.
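
Note on the Readme.md FAQ text in patches 1/3 and 2/3: the A/B/C example says that overlap between imprecise calls is not transitive, which is why matching calls across separately produced VCFs is hard. The sketch below checks that claim using the coordinates from the FAQ. The `Interval` and `Call` types are hypothetical helpers written for this illustration and are not part of GRIDSS. Treating the ranges as closed intervals, and requiring both breakends to overlap, are assumptions.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Closed interval [start, end] on one chromosome (assumed convention)."""
    chrom: str
    start: int
    end: int

    def overlaps(self, other: "Interval") -> bool:
        # Two closed intervals overlap when each starts before the other ends.
        return (self.chrom == other.chrom
                and self.start <= other.end
                and other.start <= self.end)


class Call(NamedTuple):
    """Imprecise breakpoint call with one interval per breakend (hypothetical type)."""
    name: str
    local: Interval
    remote: Interval

    def overlaps(self, other: "Call") -> bool:
        # Assumption: two calls are candidate matches only if both breakends overlap.
        return (self.local.overlaps(other.local)
                and self.remote.overlaps(other.remote))


# Coordinates copied from the FAQ example.
A = Call("A", Interval("chrX", 1, 99), Interval("chrY", 1, 99))
B = Call("B", Interval("chrX", 50, 149), Interval("chrY", 1, 99))
C = Call("C", Interval("chrX", 100, 199), Interval("chrY", 1, 99))

for x, y in [(A, B), (B, C), (A, C)]:
    print(f"{x.name} overlaps {y.name}: {x.overlaps(y)}")
```

A overlaps B and B overlaps C, but A does not overlap C. Any scheme that merges overlapping calls from separate VCFs therefore has to decide whether A, B and C are one variant, two, or three. Joint calling avoids having to make that decision.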