4.beta.4

Pre-release
@droazen droazen released this 26 Aug 02:53
· 1857 commits to master since this release

Highlights of this release include fixes to the GATK4 HaplotypeCaller to bring it closer to the output of the GATK3 HaplotypeCaller (although many of these fixes still need to be applied to HaplotypeCallerSpark), fixes for longstanding indexing and CRAM-related bugs in htsjdk, bash tab completion support for GATK commands, and many improvements to Mutect2 and the SV tools.

A Docker image for this release can be found in the broadinstitute/gatk repository on Docker Hub. Within the image, cd into /gatk, then run gatk-launch commands as usual.

Note: Due to our current dependency on a snapshot of google-cloud-java, this release cannot be published to maven central.

Changes in this release:

  • HaplotypeCaller: a number of important updates and fixes to bring it closer to GATK 3.x's output (most of these fixes apply only to HaplotypeCaller, not HaplotypeCallerSpark) (#3519)
    • reduce memory usage of the AssemblyRegion traversal by an order of magnitude
    • create empty pileup objects for uncovered loci internally (fixes occasional gaps between GVCF blocks as well as some calling artifacts)
    • when determining active regions, only consider loci within the user's intervals
    • port some additional changes made to the GATK 3.x HaplotypeCaller over to GATK4
    • fix a bug in the handling of the MQ annotation
  • Added bash tab completion support for GATK commands (#3424)
  • Updated to Intel GKL 0.5.8, which fixes a bug in AVX detection that behaved incorrectly on some AMD systems (#3513)
  • Upgrade htsjdk to 2.11.0-4-g958dc6e-SNAPSHOT to pick up an important VCF header performance fix. (#3504)
  • Updated google-cloud-nio dependency to 0.20.4-alpha-20170727.190814-1:shaded (#3373)
  • Fix tabix indexing bugs in htsjdk, and re-enable the IndexFeatureFile tool (#3425)
  • Fix longstanding issue with CRAM MD5 slice calculation in htsjdk (#3430)
  • Started publishing nightly builds
  • Performance improvements to allow MD+BQSR+HC Spark pipeline to scale to a full genome (#3106)
  • Eliminate expensive toString() call in GenotypeGVCFs (#3478)
  • ValidateVariants GVCF memory optimization (#3445)
  • Simplified Mutect2 annotations (#3351)
  • Fix Mutect2 INFO field types in the VCF header (#3422)
  • SV tools: fixed a bug that could produce a negative fragment length (#3463)
  • Added command line argument for IntervalMerging based on GATK3 (#3254)
  • Exposed 'nio_max_retries' as a command-line option for GATK tools (#3328)
  • Fix aligned PathSeq input getting filtered by WellformedReadFilter (#3453)
  • Patch the ReferenceBases annotation to handle the case where no reference is present (#3299)
  • Honor index/MD5 creation for HaplotypeCaller/Mutect2 bamouts. (#3374)
  • Fix SV pipeline default init script handling (#3467)
  • SV tools: improve the test bam (#3455)
  • SV tools: improved filtering for smallish indels (#3376)
  • Extends BwaMemImageSingleton into a cache, BwaMemImageCache, that can… (#3359)
  • Try installing R packages from multiple CRAN repos in case some are down (#3451)
  • Run Oncotator (optional) in the CNV case WDL. (#3408)
  • Add option to run Spark tests only (#3377)
  • Added a .dockerignore file (#3418)
  • Code cleanup in the SV discovery package (#3361); also fixes #3224
  • Implement PathSeq taxon hit scoring in Spark (#3406)
  • Add option to skip pre-Bwa repartitioning in PSFilter (#3405)
  • Update the GQ after PLs get subset (#3409)
  • Removed the explicit System.exit(0) from Main (#3400)
  • build_docker.sh can run tests again #3191 #3160 (#3323)
  • Minor doc fixes #3173 (#3332)
  • Use ReadClipper in BaseQualityClipReadTransformer (#3388)
  • PathSeq adapter trimming and simple repeat masking (#3354)
  • Add scripts to manage SV spark jobs and copy result (#3370)
  • Output empty VQSLOD tranches in scatterTranches mode if no variant has VQSLOD high enough for requested threshold (#3397)
  • Option to filter short pathogen reference contigs (#3355)
  • Rewrote hapmap autoval WDL (#3379)
  • Fixed contamination calculation, added error bars to output (#3385)
  • Wrote WDL for Mutect panel of normals (#3386)
  • Turn off tranches plots if no output Rscript is specified (for annotation plots) (#3383)
  • Mutect2 WDLs output the contamination (#3375)
  • Increased maximum copy-ratio variance slice-sampling bound. (#3378)
  • Replace --allowMissingData with --errorIfMissingData (the default behavior is now the opposite of the previous default) and print NA for null objects in VariantsToTable (#3190)
  • Docs for proposed tumor-in-normal tool (#3264)
  • Fixed the git version for the output jar on docker automatic builds (#3496)
  • Use correct logger class in MathUtils (#3479)
  • Make ShardBoundaryShard implement Serializable (#3245)
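
To illustrate why the "Update the GQ after PLs get subset (#3409)" item above matters, here is a minimal Python sketch. It is not GATK's code. It assumes diploid genotypes, the VCF-specification ordering of PL values, and the common convention that GQ is the gap between the two smallest PLs, capped at 99. The function names (`pl_index`, `subset_pls`, `gq_from_pls`) are hypothetical.

```python
MAX_GQ = 99  # conventional cap on GQ (an assumption of this sketch)


def pl_index(j, k):
    """Position of diploid genotype j/k (j <= k) in a VCF-ordered PL array."""
    return k * (k + 1) // 2 + j


def subset_pls(pls, kept_alleles):
    """Keep only PLs for genotypes built from kept alleles, then renormalize.

    Renormalizing subtracts the smallest remaining PL so the most likely
    genotype is again at 0.
    """
    kept = sorted(kept_alleles)
    subset = []
    # Enumerate genotypes over the kept alleles in VCF order.
    for k in range(len(kept)):
        for j in range(k + 1):
            subset.append(pls[pl_index(kept[j], kept[k])])
    best = min(subset)
    return [pl - best for pl in subset]


def gq_from_pls(pls):
    """GQ as the difference between the two smallest PLs, capped at MAX_GQ."""
    lowest, second = sorted(pls)[:2]
    return min(second - lowest, MAX_GQ)


# Alleles: 0 = ref, 1 = alt1, 2 = alt2.
# PL order: 0/0, 0/1, 1/1, 0/2, 1/2, 2/2
original_pls = [40, 0, 30, 10, 25, 60]
stale_gq = gq_from_pls(original_pls)      # computed before subsetting
new_pls = subset_pls(original_pls, [0, 1])  # drop alt2
updated_gq = gq_from_pls(new_pls)         # recomputed after subsetting
```

In this example the second-best genotype before subsetting (0/2) involves the dropped allele, so a GQ carried over unchanged would understate confidence in the remaining call. Recomputing from the subset PLs avoids that.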