v23.04.00
What’s Changed
⚠️ Breaking Changes
- Preserve original Dask partitions by default in `Dataset.to_parquet` @rjzamora (#254)
- Change the location and filename of schema.pbtxt to .merlin/schema.json @edknv (#249)
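
For code that reads the schema file directly from a dataset directory, the relocation above can be handled with a small lookup that prefers the new path and falls back to the old one. This is a minimal sketch, not part of the library: `find_schema_file` is a hypothetical helper, and it assumes the old `schema.pbtxt` sits at the root of the dataset directory.

```python
from pathlib import Path
from typing import Optional


def find_schema_file(dataset_dir: str) -> Optional[Path]:
    """Hypothetical helper: locate a saved schema file in a dataset directory.

    Prefers the new location (.merlin/schema.json) and falls back to the
    old one (schema.pbtxt, assumed to live at the directory root).
    Returns None when neither file exists.
    """
    root = Path(dataset_dir)
    candidates = [
        root / ".merlin" / "schema.json",  # new location named in #249
        root / "schema.pbtxt",             # old location (assumed at the root)
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Checking the new path first means a directory that contains both files (for example, one written with the backwards-compatible save from #267) resolves to the JSON schema.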
🐜 Bug Fixes
- Return a dataframe type that matches `reader` passed to `fetch_table_data` @oliverholworthy (#287)
- add hack to handle tf not recognizing bool dtype in dlpack @jperez999 (#276)
- update numpy version to handle dlpack @jperez999 (#275)
- fix cuda import logic from numba and device memsize @jperez999 (#274)
- change cpu conversion for tf to convert-to-tensor @jperez999 (#271)
- fix gpu numpy conversion offsets @jperez999 (#269)
- Disable strict dtype checking by default @karlhigley (#268)
- Propagate `_unsafe` flag through column constructors properly @karlhigley (#264)
- Propagate the `_unsafe` mode flag from `TensorTable` to `TensorColumn` @karlhigley (#260)
- add import pytest to file @jperez999 (#229)
🚀 Features
- Add `column_type` property to `TensorTable` @karlhigley (#283)
- Extend mapping of nullable types for pandas @oliverholworthy (#278)
- add 3d tensor support to creating tensor columns @jperez999 (#246)
- Run with import without gpu @jperez999 (#261)
- Check environment supports target device in Dataset constructor @oliverholworthy (#243)
- Support `Dataset` cpu-mode in environment with GPUs that have not been detected @oliverholworthy (#236)
- Allow casting a `Dimension` to an integer when min and max are the same @karlhigley (#252)
- Add predicate function argument to `select_by_tag` @oliverholworthy (#94)
- Add row_group_size argument to Dataset.to_parquet @rjzamora (#218)
- Enable Schema selection using `select_by_tag` with string representation of `Tags` enum. @oliverholworthy (#242)
- Add Schema `copy` method @oliverholworthy (#240)
🔧 Maintenance
- Update `pull_apart_list` to use `pd.concat` instead of deprecated `Series.append` @oliverholworthy (#291)
- Install protobuf version compatible with tensorflow 2.9 for Merlin Models tests @oliverholworthy (#289)
- Add support for from_dlpack with numpy 1.23.0 @oliverholworthy (#284)
- Save schema in old location for backwards compatibility @oliverholworthy (#267)
- Refactor `LocalExecutor` into more discrete steps that can be overridden @karlhigley (#279)
- Preserve type of shape dims as ints when re-loading schema from disk @oliverholworthy (#281)
- uses compat everywhere to allow container bypass when gpus not present @jperez999 (#277)
- update numpy version to handle dlpack @jperez999 (#275)
- fix cuda import logic from numba and device memsize @jperez999 (#274)
- migrate compat into a separate folder and separate tf and torch import @jperez999 (#272)
- change cpu conversion for tf to convert-to-tensor @jperez999 (#271)
- compat imports update @jperez999 (#270)
- fix gpu numpy conversion offsets @jperez999 (#269)
- fix configure tf function to id all gpus available @jperez999 (#266)
- migrate configure tensorflow to core, separate has_gpu from compat @jperez999 (#265)
- add 3d tensor support to creating tensor columns @jperez999 (#246)
- Revert #261 and #262 (`merlin.core.compat` changes) @karlhigley (#263)
- Run with import without gpu @jperez999 (#261)
- Update `merlin.core.compat` to use `HAS_GPU` and add add'l libraries @karlhigley (#262)
- Rework DLpack conversion dispatching to allow caching dispatched methods @karlhigley (#259)
- Add an `unsafe` mode to `TensorTable`/`TensorColumn` (for internal use) @karlhigley (#258)
- Make `TensorColumn` shape and dtype properties lazy but memoized @karlhigley (#257)
- Bump `dask`, `distributed`, `fsspec` versions @karlhigley (#201)
- Move common steps to run tox env into reusable workflow @oliverholworthy (#247)
- Improve check for array types in `is_list_dtype` @oliverholworthy (#253)
- Support cupy and numpy array types in `flatten_list_column_values` @oliverholworthy (#251)
- Update `is_list_dtype` to handle additional types @oliverholworthy (#250)
- Remove use of HAS_GPU from `dispatch` functions @oliverholworthy (#244)
- Change the location and filename of schema.pbtxt to .merlin/schema.json @edknv (#249)
- Add workflow for testing dataloader @oliverholworthy (#186)
- add import pytest to file @jperez999 (#229)
- Add correct job dependency for release in `cpu-packages` @oliverholworthy (#241)