4.2.6.0

@droazen droazen released this 08 Apr 19:27
· 285 commits to master since this release
3b0bc03

Download release: gatk-4.2.6.0.zip
Docker image: https://hub.docker.com/r/broadinstitute/gatk/

Highlights of the 4.2.6.0 release:

  • Important bug fixes for the joint calling tools (GenotypeGVCFs / GenomicsDB)

    • GATK 4.2.5.0 contained two joint genotyping bugs that are now fixed in GATK 4.2.6.0:
      • GenotypeGVCFs could throw NullPointerExceptions in some cases with many alternate alleles.
      • The expectation-maximization component of the QUAL calculation was disabled, leading to false positive, low quality alleles at some multi-allelic sites.
    • If you are running these tools in 4.2.5.0, we strongly recommend updating to 4.2.6.0.
  • Fixed a "Bucket is a requester pays bucket but no user project provided" error that occurred when accessing requester pays buckets in Google Cloud Storage even when the --gcs-project-for-requester-pays argument was specified

    • If you continue to encounter problems accessing requester pays Google Cloud Storage buckets in 4.2.6.0, please let us know by filing a GitHub issue!
  • Two new tools for the Structural Variation calling pipeline: SVAnnotate and PrintSVEvidence

  • Some fixes to genotype-given-alleles mode in HaplotypeCaller and Mutect2

Full list of changes:

  • Joint Calling (GenotypeGVCFs / GenomicsDB)

    • GATK 4.2.5.0 contained two joint genotyping bugs which are now fixed in 4.2.6.0:
      • GenotypeGVCFs could throw NullPointerExceptions in some cases with many alternate alleles.
        • Fixed in:
          • Fix for NullPointerException when GenomicsDB has more ALT alleles than the specified maximum and many GQ0 hom-ref genotypes allow variants to pass the QUAL filter (#7738)
      • The expectation-maximization component of the QUAL calculation was disabled, leading to false positive, low quality alleles at some multi-allelic sites.
        • Fixed in:
          • Fix multi-allelic QUAL calculation and restore some missing ALT annotation data in ReblockGVCFs (#7670)
    • Mention acceptable compressed VCF file extensions in GenomicsDBImport error message (#7692)
  • SV Calling

    • Added a new tool SVAnnotate (#7431)
      • SVAnnotate adds functional annotations for SVs called by GATK-SV (#7431)
    • Added a new tool PrintSVEvidence (#7695)
      • PrintSVEvidence is a tool that can merge any number of files containing one of five types of evidence of structural variation. It's also capable of subsetting regions or samples. It's used to merge evidence from a cohort in the GATK-SV pipeline.
    • Added start/end coordinate validation to SVCallRecord (#7714)
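The merging and subsetting behavior described for PrintSVEvidence can be pictured as a k-way merge over coordinate-sorted inputs. The sketch below is only an illustration of that idea, not the tool's implementation: the `Evidence` record, its fields, and `merge_evidence` are hypothetical names, and real GATK-SV evidence types carry more information.

```python
import heapq
from typing import Iterable, Iterator, List, NamedTuple, Optional, Set


class Evidence(NamedTuple):
    """Hypothetical evidence record (not a GATK class)."""
    contig_index: int  # position of the contig in the sequence dictionary
    start: int         # 1-based start coordinate
    sample: str
    payload: str


def merge_evidence(
    sources: List[Iterable[Evidence]],
    samples: Optional[Set[str]] = None,
) -> Iterator[Evidence]:
    """Merge coordinate-sorted sources, optionally keeping only some samples."""
    # heapq.merge assumes every source is already sorted by the same key.
    merged = heapq.merge(*sources, key=lambda r: (r.contig_index, r.start))
    for record in merged:
        if samples is not None and record.sample not in samples:
            continue  # subset by sample
        yield record


file_a = [Evidence(0, 100, "s1", "x"), Evidence(1, 5, "s1", "y")]
file_b = [Evidence(0, 50, "s2", "z"), Evidence(0, 150, "s2", "w")]
all_records = list(merge_evidence([file_a, file_b]))
only_s2 = list(merge_evidence([file_a, file_b], samples={"s2"}))
```

Because each input is read lazily and only one record per source is held in the heap, this pattern scales to many inputs, which is the property a cohort-level merge needs.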
  • HaplotypeCaller / Mutect2

    • Fixed an edge case in HaplotypeCaller where filtered alleles in the vicinity of forced-calling alleles could result in empty calls (#7740)
      • This affects users who run genotype-given-alleles mode in non-GVCF mode
    • Fixed a bug in HaplotypeCaller and Mutect2 where force-calling alleles were lost upon trimming by placing allele injection after trimming (#7679)
    • Added a debug --pair-hmm-results-file argument that dumps the exact inputs/outputs of the PairHMM to a file (#7660)
    • Some changes to Mutect2 to support the future Mutect3 (#7663)
      • Added training data for the Mutect3 normal artifact filter
      • Output tensors for Mutect3 as plain text rather than VCF
  • RNA Tools

    • TransferReadTags: a new tool that transfers a read tag from an unaligned BAM to the matching aligned BAM (#7739).
      • This tool allows us to retrieve read tags that get lost when converting a SAM file to FASTQ files, then back to SAM (which is necessary if e.g. running fastp to clip adapter bases before alignment).
    • PostProcessReadsForRSEM: a new tool that re-orders and filters reads before running RSEM, which has stringent requirements on the input SAM (https://github.com/deweylab/RSEM) (#7752).
  • Funcotator

    • Added custom VariantClassification severity ordering (#7673)
      • Users can now customize the severity ratings of the various VariantClassifications using the new --custom-variant-classification-order argument
    • Added logging statements to the b37 conversion process explaining why the automatic b37 conversion does or does not take place on the input VCFs (#7760)
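To illustrate what a custom severity ordering changes, here is a minimal sketch of picking the most severe classification under a given ranking. It is not Funcotator's code: the function names are hypothetical, both orderings are made up for illustration and are not Funcotator's actual default, and the file format expected by --custom-variant-classification-order is not shown here.

```python
from typing import Dict, List

# Illustrative orderings only (most severe first); NOT Funcotator's real default.
EXAMPLE_DEFAULT_ORDER = ["NONSENSE", "MISSENSE", "SPLICE_SITE", "SILENT", "INTRON"]
EXAMPLE_CUSTOM_ORDER = ["SPLICE_SITE", "NONSENSE", "MISSENSE", "SILENT", "INTRON"]


def severity_ranks(order: List[str]) -> Dict[str, int]:
    """Map each classification name to its rank; a lower rank is more severe."""
    return {name: rank for rank, name in enumerate(order)}


def most_severe(classifications: List[str], order: List[str]) -> str:
    """Return the classification that ranks as most severe under `order`."""
    ranks = severity_ranks(order)
    unknown_rank = len(ranks)  # names missing from the ordering sort last
    return min(classifications, key=lambda c: ranks.get(c, unknown_rank))


candidates = ["MISSENSE", "SPLICE_SITE"]
default_choice = most_severe(candidates, EXAMPLE_DEFAULT_ORDER)
custom_choice = most_severe(candidates, EXAMPLE_CUSTOM_ORDER)
```

With the same candidate classifications, changing only the ordering changes which one is reported as most severe.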
  • VariantRecalibrator

    • Added regularization to covariance in GMM maximization step to fix convergence issues in VariantRecalibrator (#7709)
      • This makes the tool more robust in cases where annotations are highly correlated
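The reason regularization helps with highly correlated annotations can be seen in a small example. The sketch below assumes the common form of regularization, adding a small epsilon to the covariance diagonal; whether #7709 uses exactly this form, and what epsilon it uses, is an assumption. All function names are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def weighted_covariance(points: Sequence[Sequence[float]], weights: Sequence[float]) -> Matrix:
    """Weighted covariance, as computed for one Gaussian in a GMM M-step."""
    total = sum(weights)
    dim = len(points[0])
    mean = [sum(w * p[d] for p, w in zip(points, weights)) / total for d in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for p, w in zip(points, weights):
        diff = [p[d] - mean[d] for d in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += w * diff[i] * diff[j] / total
    return cov


def regularize(cov: Matrix, epsilon: float = 1e-3) -> Matrix:
    """Add epsilon to the diagonal (assumed form of regularization)."""
    return [[v + (epsilon if i == j else 0.0) for j, v in enumerate(row)]
            for i, row in enumerate(cov)]


def det2(m: Matrix) -> float:
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Two perfectly correlated annotations: the second is always twice the first.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
weights = [1.0, 1.0, 1.0]
cov = weighted_covariance(points, weights)
reg = regularize(cov, epsilon=1e-3)
```

Here `cov` is singular (its determinant is zero up to rounding), so it cannot be inverted when evaluating the Gaussian density, while `reg` has a strictly positive determinant and stays invertible.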
  • Bug Fixes

    • Fixed a "Bucket is a requester pays bucket but no user project provided" error that occurred when accessing requester pays buckets in Google Cloud Storage even when --gcs-project-for-requester-pays was specified (#7700) (#7730)
    • Fix for the PossibleDeNovo annotation to work without Genotype Likelihoods (#7662)
      • PossibleDeNovo checked each trio's genotypes (including parent hom-ref genotypes) for likelihoods even though it doesn't actually use the PLs. The PLs can get dropped if GVCFs are reblocked, which means the annotation no longer worked as expected. The check now looks for GQs instead of PLs, since the GQs are used as part of the annotation.
    • Fixed a bug where the --mate-too-distant-length argument in MateDistantReadFilter was not configurable (#7701)
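The PossibleDeNovo change above amounts to swapping which genotype field gates the annotation. The sketch below shows that idea only: the `Genotype` record and both functions are hypothetical and do not mirror GATK's or htsjdk's classes.

```python
from typing import NamedTuple, Optional, Tuple


class Genotype(NamedTuple):
    """Hypothetical minimal genotype record (not htsjdk's Genotype)."""
    gq: Optional[int]              # genotype quality, if present
    pl: Optional[Tuple[int, ...]]  # phred-scaled likelihoods, if present


def trio_usable_by_pl(child: Genotype, mother: Genotype, father: Genotype) -> bool:
    """Old-style gate: require PLs on every member of the trio."""
    return all(g.pl is not None for g in (child, mother, father))


def trio_usable_by_gq(child: Genotype, mother: Genotype, father: Genotype) -> bool:
    """New-style gate: require GQs, the field the annotation actually uses."""
    return all(g.gq is not None for g in (child, mother, father))


# A trio whose hom-ref parents lost their PLs (for example after reblocking).
child = Genotype(gq=60, pl=(80, 0, 90))
mother = Genotype(gq=40, pl=None)
father = Genotype(gq=45, pl=None)

old_result = trio_usable_by_pl(child, mother, father)
new_result = trio_usable_by_gq(child, mother, father)
```

Under the PL-based gate this trio is skipped, while the GQ-based gate still evaluates it.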
  • GATK Engine

    • Added a new MultiFeatureWalker traversal to the GATK engine (#7695)
    • Removed an ancient, unused option to track unique reads in a LocusIteratorByState (#6410)
  • Miscellaneous Changes

    • Added back the jcenter repository resolver to our Gradle build, fixing a "Could not find biz.k11i:xgboost-predictor:0.3.0" error when building GATK from source (#7665)
    • We now properly update the latest tag in the broadinstitute/gatk-nightly Docker Hub repo (#7703)
    • The docker build now only does a git lfs pull on src/main/resources/large (#7727)
    • Install git lfs with --force in the Dockerfile (#7682)
    • Fix WDL generation for MultiVariantWalkers by adding a companion index to the MultiVariantWalker input variant arg (#7689)
    • Added a Google Apps Script to automatically update GATK release stats (#7637)
    • Updated the GATK stats script to be more universally usable (#7759)
    • Added JointCallExomeCNVs to .dockstore.yml and included a note in the WDL (#7719)
  • Documentation

    • Corrected the docs for the --heterozygosity argument in the GenotypeCalculationArgumentCollection (#7661)
  • Dependencies

    • Updated Picard to 2.27.1 (#7766)
    • Updated google-cloud-nio to 0.123.25 (#7730)