Write dimensions as uint64 in Python #556
Merged
What
While testing the generation of IVF_PQ backwards compatibility data, I noticed that IVF_FLAT was failing.
This was after the numpy 2 update (#434), which changed how casts work. To fix it we specifically write `dimensions` as `np.uint64`.
PS. We would have caught this during the release, but I got lucky.
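A minimal sketch of the idea, not the actual change in this PR: keep `dimensions` as `np.int64` for Python-side math and cast to `np.uint64` only at the point of writing. The `metadata` dict and the `domain_upper_bound` value are hypothetical stand-ins for wherever the value is really stored and read from.

```python
import numpy as np

# Hypothetical stand-in for schema.domain.dim(0).domain[1].
domain_upper_bound = 127

# In-memory value stays int64, so later arithmetic does not promote to float64.
dimensions = np.int64(domain_upper_bound) + 1

# Hypothetical stand-in for the storage target; cast explicitly when writing.
metadata = {}
metadata["dimensions"] = np.uint64(dimensions)

print(type(dimensions).__name__, type(metadata["dimensions"]).__name__)
```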
Testing
Ran `generate_data.py foo` and then ran `unit_backwards_compatibility.cc` fine.
Note
If we wanted to change from `dimensions = np.int64(schema.domain.dim(0).domain[1]) + 1` to `dimensions = np.uint64(schema.domain.dim(0).domain[1]) + 1`, it would cause MANY errors. One example is this code:
Which we'd need to change to:
Because before this, `/opt/homebrew/anaconda3/envs/TileDB-Vector-Search-3/lib/python3.9/site-packages/tiledb/array.py` would promote the types to `np.float64`:
Which would result in:
We can fix with the manual cast.
Note that beyond this, any math we would do with `dimensions` would need to be cast. This is because `promoted_dtype = np.promote_types(type(start), type(stop))` with `uint64` and `int` results in `float64`. But with `int64` and `int`, it goes to `int64`. And `float64` makes many things related to TileDB arrays crash.
So this is kind of a hack to leave Python operating with `dimensions` as `int`/`int64`, but it is quick and all the existing tests continue to pass.
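The promotion behavior described in the note can be checked directly with NumPy. This is a small sketch: the `promoted` helper is hypothetical and only mirrors the quoted `np.promote_types(type(start), type(stop))` expression; it is not the TileDB-Py code itself.

```python
import numpy as np


def promoted(start, stop):
    # Hypothetical helper mirroring the expression quoted in the note.
    return np.promote_types(type(start), type(stop))


dims_unsigned = np.uint64(128)
dims_signed = np.int64(128)

# Python int with uint64: no integer type holds both ranges, so NumPy picks float64.
unsigned_case = promoted(0, dims_unsigned)

# Python int with int64: stays an integer type.
signed_case = promoted(0, dims_signed)

# Manual cast back to a Python int before doing math avoids the float64 promotion.
cast_case = promoted(0, int(dims_unsigned))

print(unsigned_case, signed_case, cast_case)
```

This is why leaving `dimensions` as `int`/`int64` in Python avoids touching every call site that does arithmetic with it.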