Releases · alexklibisz/elastiknn · GitHub

05 Apr 02:33

0.1.0-PRE10

Introduced a cache for exact similarity queries that keeps deserialized vectors in memory instead of repeatedly
reading and deserializing them. By default, cache entries expire after 90 seconds.
Fixed a mapping issue that was causing warnings to be printed at runtime. Specifically, the term fields corresponding
to a vector should be given the same name as the field where the vector is stored. A bit confusing, but it works.
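The caching behavior described in this release can be illustrated with a minimal sketch. This is not the plugin's implementation: the class name `ExpiringCache`, the `loader` callback, and the choice to expire entries a fixed time after they are written are all assumptions made for illustration. Only the 90-second default comes from the release note.

```python
import time


class ExpiringCache:
    """Hypothetical sketch: keep loaded values in memory for ttl_seconds."""

    def __init__(self, loader, ttl_seconds=90.0, clock=time.monotonic):
        self._loader = loader    # called on a miss, e.g. read + deserialize a vector
        self._ttl = ttl_seconds  # 90 seconds mirrors the default in the release note
        self._clock = clock      # injectable so expiry can be tested without sleeping
        self._entries = {}       # key -> (expiry time, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # still fresh: skip the expensive load
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

Repeated calls to `get` with the same key inside the expiry window invoke `loader` only once, which is the saving the cache is meant to provide for exact queries that score the same stored vectors again and again.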

04 Apr 15:31

0.1.0-PRE9

Removed the usage of Protobufs at the API level and implemented a more idiomatic Elasticsearch API instead. The API now
uses custom case classes in Scala and data classes in Python, which is more tedious but worth it for a more intuitive API.
Remove the pipelines in favor of processing/indexing vectors in the custom mapping. The model parameters are defined in
the mapping and applied to any document field with type elastiknn_sparse_bool_vector or elastiknn_dense_float_vector.
This eliminates the need for a pipeline/processor and the need to maintain custom mappings for the indexed vectors.
Implement all queries using custom Lucene queries. This is tightly coupled to the custom mappings, since the mappings
determine how vector hashes are stored and can be queried. For now I've been able to use very simple Lucene Term and
Boolean queries.
Add a "sparse indexed" mapping for Jaccard and Hamming similarities. This stores the indices of sparse boolean vectors
as Lucene terms, allowing you to run a term query to get the intersection of the query vector against all stored vectors.
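To make the "sparse indexed" idea concrete, here is a small sketch in plain Python of how storing true indices as terms lets a term lookup produce intersection sizes, from which Jaccard similarity follows. It uses an in-memory dictionary in place of Lucene, and the function names `build_index` and `jaccard_scores` are hypothetical; the plugin's actual Lucene queries are not shown in these notes.

```python
from collections import defaultdict


def build_index(vectors):
    """vectors: dict of doc_id -> iterable of true indices of a sparse boolean vector."""
    postings = defaultdict(set)  # index (the "term") -> ids of docs containing it
    sizes = {}                   # doc_id -> number of true indices
    for doc_id, true_indices in vectors.items():
        unique = set(true_indices)
        sizes[doc_id] = len(unique)
        for i in unique:
            postings[i].add(doc_id)
    return postings, sizes


def jaccard_scores(postings, sizes, query_indices):
    """Score every doc that shares at least one true index with the query."""
    query = set(query_indices)
    intersection = defaultdict(int)
    for i in query:                         # one term lookup per query index
        for doc_id in postings.get(i, ()):
            intersection[doc_id] += 1       # count matching terms per doc
    # Jaccard = |A ∩ B| / (|A| + |B| - |A ∩ B|)
    return {
        doc_id: n / (len(query) + sizes[doc_id] - n)
        for doc_id, n in intersection.items()
    }
```

Documents with no index in common with the query never appear in the result, so only candidates with a non-empty intersection are scored. Hamming similarity can be derived from the same intersection counts and vector sizes once the total vector length is known.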

29 Feb 13:54

0.1.0-PRE8

Removed the num_tables argument from JaccardLshOptions as it's redundant to num_bands.
Profiled and refactored the JaccardLshModel using the ann-benchmarks Kosarak Jaccard dataset.
Added an example program that grid-searches JaccardLshOptions for best performance and plots the Pareto front.
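For readers unfamiliar with the term, a Pareto front over grid-search results keeps only the configurations that no other configuration beats on every metric. The sketch below assumes each result is a `(recall, queries_per_second)` pair where higher is better for both; those two metrics and the function name `pareto_front` are assumptions, since the release note does not say what the example program measures.

```python
def pareto_front(points):
    """Return the non-dominated points, sorted by the first metric.

    points: list of (recall, queries_per_second) tuples; higher is better on both.
    A point is dominated if a different point is at least as good on both metrics.
    """
    front = []
    for p in points:
        dominated = any(
            q != p and q[0] >= p[0] and q[1] >= p[1]
            for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(front)
```

This is the simple quadratic version, which is adequate for the few hundred configurations a typical grid search produces. Plotting the returned points shows the best achievable trade-off between the two metrics.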

15 Feb 19:32

0.1.0-PRE7

Got rid of base64 encoding/decoding in ElastiKnnVectorFieldMapper. This improves ann-benchmarks performance by about 20%.

15 Feb 16:41

0.1.0-PRE6

Improved exact Jaccard performance by implementing a critical path in Java so that it uses primitive int[] arrays instead of boxed integers in Scala.

14 Feb 05:09

0.1.0-PRE5

Fixed performance regression.

13 Feb 05:55

0.1.0-PRE4

Client and core library interface improvements.
Added use_cache parameter to KNearestNeighborsQuery which signals that the vectors should only be read once from Lucene and then cached in memory.

08 Feb 20:13

0.1.0-PRE3

Releasing versioned Python client library to PyPI.

08 Feb 16:35

0.1.0-PRE2

Releasing versioned elastiknn plugin zip file.
