Releases: iterative/datachain
0.6.0
What's Changed
- Add string replace function by @shcheklein in #508
- Add column types to from_csv to override auto inference by @shcheklein in #506
Full Changelog: 0.5.1...0.6.0
0.5.1
What's Changed
- fix: print each statement on a separate line when on debug mode by @skshetry in #479
- Merge datachain.query.udf into datachain.lib.udf and clean-up by @rlamy in #483
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #485
- Parquet Import+Export with SignalSchema by @dtulga in #480
- use returning in get_next_ids by @mattseddon in #484
- Adding Streaming CSV Export by @dtulga in #488
- Remove code duplication: UDFBase._parse_grouped_rows() by @rlamy in #490
- Don't allow mappers to skip rows by @rlamy in #491
- Make datachain queries atomic when exception occurs by @amritghimire in #494
Full Changelog: 0.5.0...0.5.1
0.5.0
What's Changed
- Split DatasetQuery from DataChain by @rlamy in #459
- remove legacy shadow attribute by @mattseddon in #478
Full Changelog: 0.4.0...0.5.0
0.4.0
0.3.20
0.3.19
What's Changed
- Reintroduce and update test_udf_after_limit() by @rlamy in #458
- IndexedFile -> ArrowRow by @dberenbaum in #445
- assert each example has some output instead of stdout and stderr by @mattseddon in #468
- query: remove compat for executing last query expression by @skshetry in #449
- Introduce DatasetVersionNotFoundError in errors by @amritghimire in #461
- use official github action for uv and uv build by @skshetry in #470
Full Changelog: 0.3.18...0.3.19
0.3.18
What's Changed
- Remove obsolete UDF code by @rlamy in #452
- added embeddings/gen example by @tibor-mach in #362
- update pytest-servers to 0.5.7 by @mattseddon in #454
- Introduce telemetry in datachain by @amritghimire in #411
- Replace UniqueId with File by @rlamy in #450
- Auto load json cols by @dberenbaum in #444
New Contributors
- @tibor-mach made their first contribution in #362
Full Changelog: 0.3.17...0.3.18
0.3.17
What's Changed
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #451
- remove legacy udf decorator by @mattseddon in #438
- Remove storage from dataset query and refactor related codebase by @ilongin in #367
Full Changelog: 0.3.16...0.3.17
0.3.16
What's Changed
- Move 'join' SQL implementation to warehouse by @dreadatour in #409
Full Changelog: 0.3.15...0.3.16
0.3.15
What's Changed
- Add resolve files by @EdwardLi-coder in #313
- unskip test_udf_parallel by @mattseddon in #432
- fix last modified comparison in resolve file test by @mattseddon in #436
- Refactor Client.parse_url() by @ilongin in #435
- Set stream for nested file signals by @dberenbaum in #443
- Read arrow files from cache by @dberenbaum in #442
- Auto-detect huggingface datasets when reading tabular data by @dberenbaum in #398
- Add datachain.lib.tar.process_tar() generator by @rlamy in #440
- Fix storage dependencies by @ilongin in #421
Full Changelog: 0.3.14...0.3.15