Releases: iterative/datachain
0.3.6
What's Changed
- add retry locks to SQLiteDatabaseEngine execute_str by @mattseddon in #333
- Mutate cannot modify existing column by @EdwardLi-coder in #306
- Mutate can rename columns by @srini047 in #312
- Handle carriage return to support progress bar in logs by @amritghimire in #326
New Contributors
Full Changelog: 0.3.5...0.3.6
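For context on the retry change in #333, here is a minimal sketch of retrying a SQLite statement when the database reports a lock. It is not DataChain's implementation: the helper name `execute_with_retry` and its `retries` and `base_delay` parameters are hypothetical, and only the standard-library `sqlite3` module is used.

```python
import sqlite3
import time


def execute_with_retry(conn, sql, params=(), retries=5, base_delay=0.05):
    """Run a statement, retrying with exponential backoff on lock errors.

    Hypothetical helper for illustration; not part of DataChain.
    """
    for attempt in range(retries):
        try:
            return conn.execute(sql, params)
        except sqlite3.OperationalError as exc:
            # Only lock contention is retried; other errors propagate at once,
            # as does a lock error on the final attempt.
            if "locked" not in str(exc).lower() or attempt == retries - 1:
                raise
            time.sleep(base_delay * 2**attempt)


conn = sqlite3.connect(":memory:")
execute_with_retry(conn, "CREATE TABLE t (x INTEGER)")
execute_with_retry(conn, "INSERT INTO t VALUES (?)", (1,))
rows = execute_with_retry(conn, "SELECT x FROM t").fetchall()
```

Backing off exponentially gives a competing writer time to finish, while re-raising non-lock errors keeps genuine SQL mistakes visible.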
0.3.5
What's Changed
- Fix: Standardize union behavior between db implementations by @mattseddon in #304
- Adding schema param to `from_records` by @ilongin in #248
- Fix: Support all column types in SignalSchema.from_column_types by @dreadatour in #319
- Fix: use default delimiter to flatten columns by @shcheklein in #330
Deps
- Bump pdfplumber from 0.11.3 to 0.11.4 by @dependabot in #323
- remove nltk pin by @mattseddon in #332
- move msgpack to core dependencies by @mattseddon in #335
Misc
- Use free GitHub Actions workers whenever possible by @0x2b3bfa0 in #276
- fix test_union_different_column_order by @mattseddon in #324
New Contributors
- @0x2b3bfa0 made their first contribution in #276
Full Changelog: 0.3.4...0.3.5
0.3.4
0.3.3
What's Changed
- Optimize table copy and save step by @dreadatour in #278
- add benchmark for running an actual DataChain query by @skshetry in #188
- Adding In-Memory DataChain Option by @dtulga in #283
- remove erroneous skip_if_not_sqlite calls by @mattseddon in #302
- Added generator function to create dataset out of bucket listing by @ilongin in #260
- Move `fashion_product_images` tutorial to `datachain-examples` by @mnrozhkov in #307
- Split Studio tests in CI by @dreadatour in #308
- Bump mypy from 1.10.1 to 1.11.1 by @dependabot in #239
- have file_stem accept a full path by @mattseddon in #284
Full Changelog: 0.3.2...0.3.3
0.3.2
What's Changed
- add example script smoke tests by @mattseddon in #199
- test huggingface pipeline example by @mattseddon in #264
- use D replace "DataChain" by @EdwardLi-coder in #235
- fix(to_pandas): handle empty datachain in to_pandas (and show) by @shcheklein in #241
- feat(column): add regexp match by @shcheklein in #224
- handle nan and inf float values by @dberenbaum in #249
- added Colab links to Getting Started by @volkfox in #275
- update readme about Mistral by @EdwardLi-coder in #270
- readme update 2 by @dmpetrov in #267
- fix unstructured-text example by @mattseddon in #277
- Bump pdfplumber from 0.11.1 to 0.11.3 by @dependabot in #282
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #279
- fixing an issue with JSON fields named as Python reserved words and updating README by @volkfox in #287
- readme - json-pairs by @dmpetrov in #288
- remove notebooks that have been moved to datachain-examples by @mattseddon in #295
- chore: Deserialize only file signals in get_file_signals by @amritghimire in #305
- Implement database default values by @dreadatour in #296
New Contributors
- @EdwardLi-coder made their first contribution in #235
- @dependabot made their first contribution in #282
Full Changelog: 0.3.1...0.3.2
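As background for the regexp match feature in #224: SQLite parses the `REGEXP` operator but ships no implementation for it, so an application has to register one. The sketch below shows that mechanism with the standard-library `sqlite3` module only. Whether DataChain wires it up this way is an assumption, and the `files` table and its rows are made up for illustration.

```python
import re
import sqlite3


def regexp(pattern, value):
    # SQLite evaluates `value REGEXP pattern` by calling regexp(pattern, value).
    return value is not None and re.search(pattern, value) is not None


conn = sqlite3.connect(":memory:")
# Register the Python function under the name SQLite looks up for REGEXP.
conn.create_function("regexp", 2, regexp)

conn.execute("CREATE TABLE files (name TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?)",
    [("cat.1.jpg",), ("dog.2.png",), ("notes.txt",)],
)

# Select only names ending in an image extension.
matches = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM files WHERE name REGEXP ? ORDER BY name",
        (r"\.(jpg|png)$",),
    )
]
```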
0.3.1
What's Changed
- Fix typo in `filter` method docstrings by @mnrozhkov in #250
- Skip if not SQLite Improvements by @dtulga in #254
- Autodetect Studio branch by @dreadatour in #253
- Autodetect Studio branch fix for 'main' branch by @dreadatour in #257
- Removing `metastore` argument from `Client.parse_url()` by @ilongin in #256
- Autodetect Studio branch fix for 'main' branch by @dreadatour in #258
- Parallel UDF optimizations by @dreadatour in #211
Full Changelog: 0.3.0...0.3.1
0.3.0
0.2.18
What's Changed
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #238
- Adding `DataChain.column(...)` and fixing functions and types by @ilongin in #226
Full Changelog: 0.2.17...0.2.18
0.2.17
What's Changed
- Update readme by @dmpetrov in #233
- Arrow nrows fix by @dberenbaum in #221
- only combine final step for limit by @mattseddon in #230
- Fix renaming object or normal signal with `.mutate()` by @ilongin in #217
- Fixing too many files open, and adding reconnect by @dtulga in #229
Full Changelog: 0.2.16...0.2.17
0.2.16
What's Changed
- improve efficiency of examples by @mattseddon in #214
- fix select then distinct chain by @mattseddon in #213
- rename DataChain's create_empty to from_records by @mattseddon in #215
- do not modify datachain max limit in show by @mattseddon in #225
- Rename cleanup_temp_tables to cleanup_tables in warehouse and catalog by @amritghimire in #218
Full Changelog: 0.2.15...0.2.16