Releases: TileDB-Inc/TileDB-Vector-Search
0.11.0
What's Changed
- Add explicit markdown parser for DirectoryTextReader by @NikolaosPapailiou in #532
- Update to TileDB Core 2.26.2 by @jparismorgan in #535
- Disable consolidate_update_fragments for TileDB Cloud URIs by @NikolaosPapailiou in #536
- Improve ObjectAPI documentation by @NikolaosPapailiou in #538
- Decouple consolidate and vacuum operations by @NikolaosPapailiou in #539
- Parallelize asset creation and registration by @NikolaosPapailiou in #540
- Add query with driver mode for ObjectIndex by @NikolaosPapailiou in #541
- [automated] Update backwards-compatibility-data for release 0.9.0 by @github-actions in #527
- [automated] Update backwards-compatibility-data for release 0.10.0 by @github-actions in #530
- Improve timers by @jparismorgan in #496
- Add more instructions on building C++ by @jparismorgan in #523
- Support multiple embeddings per object in Object API by @NikolaosPapailiou in #542
- Add ColPali embedding and support for processing PDFs as images by @NikolaosPapailiou in #543
- Add a new read_vector() helper which does a multi-range query by @jparismorgan in #545
- Rename group_uri to index_uri by @jparismorgan in #544
- Refactor group version check by @jparismorgan in #547
- Cleanup unit_ivf_pq_index.cc by @jparismorgan in #548
- Update tdbMatrixMultiRange to support both slices and individual indices by @jparismorgan in #550
- Add start, stop, and temporal_policy to FeatureVector and FeatureVectorArray by @jparismorgan in #549
- Rename nlist to partitions by @jparismorgan in #551
- Fix types in qv_partition_with_scores() and train_no_init() by @jparismorgan in #546
- Various small code cleanups from IVF_PQ OOC work by @jparismorgan in #552
- Fix AVX2 distance function and build with AVX2 if available on host machine by @jparismorgan in #524
- Add numpy 2 support by @jparismorgan in #434
- Add bioimage search utils and example notebook by @NikolaosPapailiou in #553
- Support out-of-core and distributed IVF_PQ ingestion by @jparismorgan in #531
- Update IVF_PQ to use temp directory pattern to support parallel ingestion by @jparismorgan in #554
- Write dimensions as uint64 in Python by @jparismorgan in #556
- Add IVF_PQ backwards compatibility data generation and testing by @jparismorgan in #555
- Remove warning that IVF_PQ index is not supported by @jparismorgan in #557
Full Changelog: 0.10.0...0.11.0
0.10.3
What's Changed
- Disable consolidate_update_fragments for TileDB Cloud URIs by @NikolaosPapailiou in #536
- Improve ObjectAPI documentation by @NikolaosPapailiou in #538
- Decouple consolidate and vacuum operations by @NikolaosPapailiou in #539
- Parallelize asset creation and registration by @NikolaosPapailiou in #540
- Add query with driver mode for ObjectIndex by @NikolaosPapailiou in #541
- [automated] Update backwards-compatibility-data for release 0.9.0 by @github-actions in #527
- [automated] Update backwards-compatibility-data for release 0.10.0 by @github-actions in #530
- Improve timers by @jparismorgan in #496
- Add more instructions on building C++ by @jparismorgan in #523
- Support multiple embeddings per object in Object API by @NikolaosPapailiou in #542
- Add ColPali embedding and support for processing PDFs as images by @NikolaosPapailiou in #543
- Add a new read_vector() helper which does a multi-range query by @jparismorgan in #545
- Rename group_uri to index_uri by @jparismorgan in #544
- Refactor group version check by @jparismorgan in #547
- Cleanup unit_ivf_pq_index.cc by @jparismorgan in #548
- Update tdbMatrixMultiRange to support both slices and individual indices by @jparismorgan in #550
- Add start, stop, and temporal_policy to FeatureVector and FeatureVectorArray by @jparismorgan in #549
- Rename nlist to partitions by @jparismorgan in #551
- Fix types in qv_partition_with_scores() and train_no_init() by @jparismorgan in #546
- Various small code cleanups from IVF_PQ OOC work by @jparismorgan in #552
- Fix AVX2 distance function and build with AVX2 if available on host machine by @jparismorgan in #524
- Add numpy 2 support by @jparismorgan in #434
- Add bioimage search utils and example notebook by @NikolaosPapailiou in #553
- Support out-of-core and distributed IVF_PQ ingestion by @jparismorgan in #531
- Update IVF_PQ to use temp directory pattern to support parallel ingestion by @jparismorgan in #554
- Write dimensions as uint64 in Python by @jparismorgan in #556
- Add IVF_PQ backwards compatibility data generation and testing by @jparismorgan in #555
- Remove warning that IVF_PQ index is not supported by @jparismorgan in #557
Full Changelog: 0.10.2...0.10.3
0.10.2
What's Changed
- Update to TileDB Core 2.26.2 by @jparismorgan in #535
Full Changelog: 0.10.1...0.10.2
0.10.1
What's Changed
- Add explicit markdown parser for DirectoryTextReader by @NikolaosPapailiou in #532
Full Changelog: 0.10.0...0.10.1
0.10.0
0.9.0
What's Changed
- Fix bug for Index clear_history by @NikolaosPapailiou in #522
- Add open method for Index class by @NikolaosPapailiou in #503
- Fix small C++ warnings around type conversions by @jparismorgan in #520
- Update IVF_PQ to set memory_budget in constructor, support preload feature_vectors and metadata only modes by @jparismorgan in #518
- Update TileDB Core 2.26.0 by @NikolaosPapailiou in #521
Full Changelog: 0.8.1...0.9.0
0.8.1
What's Changed
- Update local-benchmarks script to save results to a new directory during each run by @jparismorgan in #464
- [automated] Update backwards-compatibility-data for release 0.8.0 by @github-actions in #467
- Fix C++ warnings across codebase by @jparismorgan in #461
- Enable more benchmarks in ann-benchmarks script by @jparismorgan in #470
- Optimize distance computations for IVFPQ by @NikolaosPapailiou in #468
- Fix C++ incorrect or ambiguous primitive types by @jparismorgan in #455
- Add C++ sanitizers by @jparismorgan in #471
- Use single shared seed for rng generation in C++ by @jparismorgan in #469
- Make logging utils thread-safe by @jparismorgan in #474
- Add nightly CI which runs sanitizers by @jparismorgan in #476
- Add new test and cleanup C++ code by @jparismorgan in #475
- Fix counting_sum_of_squares_distance thread sanitizer error by @jparismorgan in #472
- Parallelize Vamana query by @jparismorgan in #463
- Fix address sanitizer error when storing metadata in destructor by @jparismorgan in #473
- Fix mutex lock error in logging utils by @jparismorgan in #478
- Add option to ann-benchmarks.py to skip benchmarks and leave instance running by @jparismorgan in #477
- Add dump() methods to logging classes by @jparismorgan in #479
- Add more testing for Vamana helpers, add option to skip top_k to improve ingestion speed by @jparismorgan in #481
- Avoid copies in query code by @jparismorgan in #483
- Add Vamana storage spec by @jparismorgan in #484
- Add more comments about IVF parameters partitions and nprobe by @jparismorgan in #487
- IVF_FLAT Distance metric by @cainamisir in #451
- Update matrix to take in a std::vector instead of std::initializer_list by @jparismorgan in #492
- Fix flaky IVF_PQ Python test and add more C++ testing around pq-encoding by @jparismorgan in #494
- Fix infinite loop in kmeans by @jparismorgan in #491
- Distance metric integration for vamana, and refactoring of distance metrics by @cainamisir in #460
- Add out-of-core query() support to IVF PQ by @jparismorgan in #485
- IVF_PQ: Remove unused pq_ivf_centroids, remove extra call to train_ivf(), add comments, cleanup code by @jparismorgan in #489
- Refactor logging helpers by @jparismorgan in #497
- Name each benchmark within local-benchmarks.py by @jparismorgan in #498
- Add scGPT and scvi embeddings and improve SOMA reader by @NikolaosPapailiou in #501
- Vlad/l2 sumofsquares by @cainamisir in #486
- Update ann-benchmarks.py to connect to running instance by @jparismorgan in #500
- Fix some TODOs after the cloud release by @NikolaosPapailiou in #505
- Various small tdb matrix cleanups by @jparismorgan in #510
- Distance metric small fixes: uninitialized value in IVF_PQ and pass setting during consolidate_updates() by @jparismorgan in #508
- Add fixed_min_triplet_heap by @jparismorgan in #507
- Update IVF_PQ array names by @jparismorgan in #511
- Check finite and infinite IVF_PQ queries return the same ids and distances, fix count_intersections() to not modify inputs, update read_index_finite() to return data by @jparismorgan in #509
- Add new plot to local-benchmarks.py showing results from all indexes by @jparismorgan in #512
- DirectoryReader: set text/plain as default mime type if it is not found by @NikolaosPapailiou in #513
- Add option to run local-benchmarks.py with index at a tiledb uri by @jparismorgan in #514
- IVF_PQ re-ranking by @jparismorgan in #502
- Support AWS index URI in local-benchmarks.py by @jparismorgan in #516
- Add k_factor to local-benchmarks.py by @jparismorgan in #517
Full Changelog: 0.8.0...0.8.1
0.8.0
What's Changed
- Remove use of set_coords_filter_list from dense array creation by @jparismorgan in #439
- [automated] Update backwards-compatibility-data for release 0.7.0 by @github-actions in #438
- Tune default ingestion configuration to avoid OOM errors by @NikolaosPapailiou in #440
- Add ids to Python FeatureVectorArray by @jparismorgan in #442
- Add Optional to Python code that was missing it by @jparismorgan in #443
- For type-erased Python indexes, 1) Don't consolidate parts and ids Arrays 2) Avoid extra Schema open in constructor by @jparismorgan in #444
- Cleanups to ivf_index() C++ code, and small cleanups in Python by @jparismorgan in #430
- Support IVF PQ consolidation by storing raw feature vectors and external IDs by @jparismorgan in #447
- tdbPartitionedMatrix will automatically close Array's when done reading by @jparismorgan in #448
- Re-enable IVF PQ tests by @jparismorgan in #450
- Save kmeans settings to IVF PQ metadata by @jparismorgan in #452
- Allow setting IVF PQ partitions when re-ingesting, fix IVF PQ object index tests by @jparismorgan in #453
- Avoid creating one temp array for each ingestion work item by @NikolaosPapailiou in #449
- Add Vector Search storage format spec by @NikolaosPapailiou in #456
- Fix markdown format for storage spec by @NikolaosPapailiou in #457
- Update dimensions to be uint64_t in C++ by @jparismorgan in #454
- Configure memory budget for distributed OOC queries by @NikolaosPapailiou in #462
- Add local benchmarking script by @jparismorgan in #459
- Distance metrics integration by @cainamisir in #422
- Close tdbMatrix and tdbMatrixWithIds Array's when we have nothing left to read by @jparismorgan in #466
- Update to TileDB Core 2.25.0 by @jparismorgan in #465
Full Changelog: 0.7.0...0.8.0
0.7.0
What's Changed
- [automated] Update backwards-compatibility-data for release 0.6.0 by @github-actions in #432
- Remove apis/python/requirements-py.txt by @jparismorgan in #433
- Fix bug where we did not set compression filters when creating TileDB Array's in C++ by @jparismorgan in #436
- Update to TileDB Core 2.24.2 by @jparismorgan in #437
Full Changelog: 0.6.0...0.7.0
0.6.0
What's Changed
- Pin numpy to fix Python CI failures by @NikolaosPapailiou in #419
- Enable OOC processing for IVF_FLAT distributed query execution by @NikolaosPapailiou in #418
- Cleanup IVF PQ C++ index code by @jparismorgan in #421
- Add debug info and remove stripping by @dudoslav in #424
- Fix type-erased indexes writing fragments at timestamp=0, thus fixing IVF PQ time travel by @jparismorgan in #425
- Improve benchmark script - add other vector search libraries and download full results by @jparismorgan in #415
- Remove b_backtrack from Vamana index by @jparismorgan in #428
- Expose Vamana graph building params by @jparismorgan in #423
- Update to TileDB Core 2.24.1 by @jparismorgan in #431
Full Changelog: 0.5.1...0.6.0