Releases: dask-contrib/dask-deltatable

Dask-deltatable v0.3.1

24 Jul 17:16

This version contains a patch that fixes a problem when reading datasets on a distributed cluster.

Dask-deltatable v0.3

14 Jul 12:39

New Features and Enhancements

  • More efficient Dask Graph generation (#24)
  • Transactional write support for append-only write operations with to_deltalake (#29)
  • Reader now supports partition pruning to only load files that match the provided filters (#30)
  • DAT reader acceptance testing against Spark-generated data (#47)
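
To illustrate the idea behind partition pruning, here is a minimal, standard-library-only sketch. It is not the dask-deltatable implementation: the `prune_files` helper, the `(column, op, value)` filter tuples, and the file paths are all hypothetical and exist only to show how files can be skipped from their partition values alone, before any Parquet data is read.

```python
import operator

# Comparison operators supported by this sketch (a hypothetical subset).
_OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "in": lambda value, choices: value in choices,
}


def prune_files(files, filters):
    """Return the paths whose partition values satisfy every filter.

    files:   list of (path, partition_values) pairs; partition_values is a dict.
    filters: list of (column, op, value) tuples, combined with AND.

    A filter on a column that is not a partition column cannot rule a file
    out, so such a filter is ignored here and the file is kept.
    """
    selected = []
    for path, partition_values in files:
        if all(
            column not in partition_values
            or _OPS[op](partition_values[column], value)
            for column, op, value in filters
        ):
            selected.append(path)
    return selected


# Hypothetical table layout, partitioned by "year".
files = [
    ("year=2022/part-0.parquet", {"year": 2022}),
    ("year=2023/part-0.parquet", {"year": 2023}),
    ("year=2024/part-0.parquet", {"year": 2024}),
]

kept = prune_files(files, [("year", ">=", 2023)])
print(kept)
```

Only the files that survive this step would need to be scheduled as read tasks, which is where the savings come from on large, well-partitioned tables.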

Breaking changes

Dask and delta-rs integration

14 Oct 07:21
68dce7f

This release builds a wrapper around the Rust package called delta-rs and uses dask for parallel reading.

Features:

  1. Reads the Parquet files listed in the Delta log in parallel using the Dask engine
  2. Supports three filesystems: s3, azurefs, and gcsfs
  3. Supports some delta features like
    • Time Travel
    • Schema evolution
    • parquet filters
      • row filter
      • partition filter
  4. Query Delta commit info - History
  5. Vacuum old or unused Parquet files
  6. Load different versions of data by datetime
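
Loading a version by datetime can be pictured as resolving the datetime to a commit version first. The sketch below assumes the usual Delta time-travel rule, namely that the latest version committed at or before the requested datetime is selected. The `version_as_of` helper and the commit timestamps are hypothetical and are not part of dask-deltatable or delta-rs.

```python
from bisect import bisect_right
from datetime import datetime, timezone


def version_as_of(commits, when):
    """Return the latest version committed at or before `when`.

    commits: list of (version, commit_time) pairs sorted by commit_time.
    Raises ValueError if `when` is earlier than the first commit.
    """
    times = [commit_time for _, commit_time in commits]
    # bisect_right counts the commits with commit_time <= when.
    index = bisect_right(times, when)
    if index == 0:
        raise ValueError("no table version exists at or before the given datetime")
    return commits[index - 1][0]


# Hypothetical commit history for illustration only.
commits = [
    (0, datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)),
    (1, datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)),
    (2, datetime(2024, 1, 5, 9, 0, tzinfo=timezone.utc)),
]

resolved = version_as_of(commits, datetime(2024, 1, 3, 0, 0, tzinfo=timezone.utc))
print(resolved)
```

Using timezone-aware datetimes throughout avoids comparing naive and aware values, which Python rejects with a `TypeError`.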

DeltaTable reader using Dask

13 Sep 17:41
Pre-release

DeltaTable reader using Dask

  1. Reads a Delta table in parallel using Dask
  2. Can read from different filesystems such as S3, azurefs, and gcsfs
  3. Supports some delta features like
    - Time Travel
    - Schema evolution
    - Parquet filters, such as row and partition filters