Releases: iterative/datachain
0.2.13
What's Changed
- DataChain.from_storage: add last_modified and is_latest to the columns by @skshetry in #165
- fix for using new column from `.mutate()` in `.order_by()` by @ilongin in #171
- Renaming `File.write()` to `File.save()` by @ilongin in #172
- storage: index as a dir if no glob by @shcheklein in #108
Docs
- first shot at LLM eval tutorial by @volkfox in #145
- Update README.rst by @volkfox in #161
- docs: update README.rst by @eltociear in #168
Maintenance
- skip mypy hook on pre-commit.ci by @skshetry in #164
- ci: disable azure and gs remote tests on macOS by @skshetry in #174
- ci: run s3 tests on Windows, be more careful while skipping by @skshetry in #175
- fix test for ch wrt datetime precision by @skshetry in #169
- Adding tests for exporting image files and `File.write()` by @ilongin in #149
New Contributors
- @eltociear made their first contribution in #168
Full Changelog: 0.2.12...0.2.13
0.2.12
What's Changed
- Python API to manage the dataset registry by @dreadatour in #29
- cli: hide subcommands from the listing by @skshetry in #79
- datachain: rename include_sys kwarg to sys by @skshetry in #69
- Adding `DataChain.export_files(...)` by @ilongin in #30
- Update cv tutorial: `fashion_product_images` by @mnrozhkov in #62
- Add and clean up docstrings in datachain api by @dberenbaum in #63
- docs: fix invalid python code inside docstrings by @skshetry in #85
- Hide traceback for xfails in Studio test runs by @rlamy in #87
- Rename UDF to UDFStep for clarity, and remove from root namespace by @rlamy in #88
- Fix mutate() by @dmpetrov in #78
- update pytest-servers to 0.5.5 by @mattseddon in #94
- Remove vendored-code-specific folders by @dtulga in #95
- Rename repository references to datachain by @dtulga in #93
- do not overwrite version with None in DatasetQuery constructor by @mattseddon in #92
- always include sys signals by @skshetry in #81
- Add more UniqueId fields by @rlamy in #90
- Added more general `SignalsSchema.get_signals()` method instead of `get_file_signals(...)` by @ilongin in #86
- Added input params to `distinct()` by @ilongin in #96
- Fix for `order_by` with sub signals by @ilongin in #82
- Remove legacy signals in from_storage() by @rlamy in #72
- Updates to examples by @dberenbaum in #77
- More docs updates by @dberenbaum in #100
- Add 'update' param to DataChain.from_storage method by @dreadatour in #99
- Fix repository reference in Notebook by @dtulga in #105
- fix(ux): remove reference to DatasetQuery by @shcheklein in #104
- datachain: implement to_parquet by @skshetry in #97
- File refactor by @dberenbaum in #102
- fixing regressions from switching to ModelStore.add() by @volkfox in #109
- add ModelStore to top level imports by @dmpetrov in #112
- add truncate option to show and update default width of output by @mattseddon in #116
- merge/join: exclude sys signals by @skshetry in #120
- Added `descending` parameter to `DataChain.order_by(...)` by @ilongin in #122
- remove get_value() from DataModel by @dmpetrov in #119
- Add file modes for binary/text by @dberenbaum in #107
- remove docstring from `DataModel.__pydantic_init_subclass__` by @skshetry in #123
- Examples cleanup by @dberenbaum in #111
- rename ModelStore.add() to register() by @dmpetrov in #113
- datachain: generalize data access functions into collect() and collect_flatten() by @skshetry in #121
- Add nrows for partial parsing of csv/parquet by @dberenbaum in #124
- Update index.md by @volkfox in #128
- Picture for getting started by @volkfox in #127
- moving pic to the right place by @volkfox in #131
- cleanup signal refs in examples by @dberenbaum in #129
- cleanup api reference index by @dberenbaum in #130
- Fix for text and images files export by @ilongin in #135
- update computer vision quick start example by @mattseddon in #136
- update computer vision image example by @mattseddon in #139
- Huggingface test updates and bug fix by @dberenbaum in #140
- Readme update by @dmpetrov in #133
- readme: fix link to image by @dmpetrov in #143
- Update badge by @skshetry in #144
- don't depend on datachain from PATH to exec processes by @skshetry in #118
- dc: try to fix dataset_stats for DataChain.from_storage() generated dataset by @skshetry in #151
New Contributors
- @dreadatour made their first contribution in #29
- @mnrozhkov made their first contribution in #62
Full Changelog: 0.2.11...0.2.12
0.2.11
What's Changed
- cleanup model store/registry by @dberenbaum in #74
- slice nested signals by @dberenbaum in #75
- To pandas - hierarchical multi header by @dmpetrov in #22
- Use cloudpickle for parallel UDF processing by @dtulga in #65
Full Changelog: 0.2.10...0.2.11
0.2.10
What's Changed
- Support `object_name` in all `from_` formats by @dberenbaum in #14
- minor cleanup of readme and docs by @dberenbaum in #53
- run SAAS tests against datachain by @mattseddon in #47
- switch codecov badges to datachain repo by @mattseddon in #46
- update tests.yml by @mattseddon in #58
- Pydantic heaven by @dmpetrov in #45
- File() and UDF refactoring by @rlamy in #56
- Renaming (on top of Pydantic heaven) by @dmpetrov in #48
- fix imports and create datachain.torch by @dberenbaum in #60
- add missing torch dependencies test by @mattseddon in #59
- Implement sys feature, and rename id/random columns by @skshetry in #28
Full Changelog: 0.2.9...0.2.10
0.2.9
0.2.8
0.2.7
What's Changed
- Optimize clip tests by @dberenbaum in #15
- remove image top level import by default by @shcheklein in #31
New Contributors
- @dberenbaum made their first contribution in #15
- @shcheklein made their first contribution in #31
Full Changelog: 0.2.6...0.2.7
0.2.6
What's Changed
- Update README.rst by @volkfox in #8
- pydantic_to_feature(): support enum type & nested lists by @dmpetrov in #5
- Update README.rst by @volkfox in #17
- add top module exports by @mattseddon in #23
- Added fixture to clean session before each test by @ilongin in #25
Full Changelog: 0.2.5...0.2.6
0.2.5
What's Changed
- Initial DataChain Commit by @dtulga in #1
- Remove extra README.md by @dtulga in #3
- remove references to pandas optional dependencies (does not exist) by @mattseddon in #9
- Fixing tests to work for CH DB by @ilongin in #13
New Contributors
- @dtulga made their first contribution in #1
- @mattseddon made their first contribution in #9
- @ilongin made their first contribution in #13
Full Changelog: https://github.com/iterative/datachain/commits/0.2.5