Improve query performance #47

Merged: 12 commits, Mar 16, 2024

@philippgille (Owner) commented Mar 16, 2024

We added benchmarks in #46.

We then used them, together with CPU and memory profiles gathered from them, to improve performance.

⚠️ This PR only addresses the vector similarity search, not the metadata or full text filtering.

Each individual improvement is a separate commit. The overall improvement is a 60-80% reduction in query duration and a > 99% reduction in allocated memory (B/op):

goos: linux
goarch: amd64
pkg: github.com/philippgille/chromem-go
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
                                    │    before     │                after                │
                                    │    sec/op     │    sec/op     vs base               │
Collection_Query_NoContent_100-8       413.7µ ±  4%   109.9µ ±  1%  -73.44% (p=0.002 n=6)
Collection_Query_NoContent_1000-8     2759.4µ ±  0%   536.8µ ±  1%  -80.55% (p=0.002 n=6)
Collection_Query_NoContent_5000-8     12.980m ±  1%   4.985m ± 15%  -61.60% (p=0.002 n=6)
Collection_Query_NoContent_25000-8     66.56m ±  1%   14.97m ± 10%  -77.51% (p=0.002 n=6)
Collection_Query_NoContent_100000-8   282.41m ±  3%   56.50m ± 11%  -79.99% (p=0.002 n=6)
Collection_Query_100-8                 416.7µ ±  2%   110.0µ ±  0%  -73.61% (p=0.002 n=6)
Collection_Query_1000-8               2792.8µ ± 23%   536.8µ ±  0%  -80.78% (p=0.002 n=6)
Collection_Query_5000-8               15.643m ±  1%   4.869m ±  5%  -68.88% (p=0.002 n=6)
Collection_Query_25000-8               78.29m ±  1%   15.01m ±  3%  -80.82% (p=0.002 n=6)
Collection_Query_100000-8             338.54m ±  5%   56.48m ±  4%  -83.32% (p=0.002 n=6)
geomean                                12.97m         3.008m        -76.81%

                                    │     before      │                after                │
                                    │      B/op       │     B/op      vs base               │
Collection_Query_NoContent_100-8      1211.007Ki ± 0%   6.330Ki ± 0%  -99.48% (p=0.002 n=6)
Collection_Query_NoContent_1000-8     12082.16Ki ± 0%   34.83Ki ± 0%  -99.71% (p=0.002 n=6)
Collection_Query_NoContent_5000-8      60394.2Ki ± 0%   162.8Ki ± 0%  -99.73% (p=0.002 n=6)
Collection_Query_NoContent_25000-8    301962.1Ki ± 0%   794.8Ki ± 0%  -99.74% (p=0.002 n=6)
Collection_Query_NoContent_100000-8   1179.510Mi ± 0%   3.057Mi ± 0%  -99.74% (p=0.002 n=6)
Collection_Query_100-8                1211.006Ki ± 0%   6.329Ki ± 0%  -99.48% (p=0.002 n=6)
Collection_Query_1000-8               12082.11Ki ± 0%   34.83Ki ± 0%  -99.71% (p=0.002 n=6)
Collection_Query_5000-8                60394.1Ki ± 0%   162.8Ki ± 0%  -99.73% (p=0.002 n=6)
Collection_Query_25000-8              301962.1Ki ± 0%   794.8Ki ± 0%  -99.74% (p=0.002 n=6)
Collection_Query_100000-8             1179.510Mi ± 0%   3.057Mi ± 0%  -99.74% (p=0.002 n=6)
geomean                                  49.13Mi        155.0Ki       -99.69%

                                    │     before     │               after               │
                                    │   allocs/op    │ allocs/op   vs base               │
Collection_Query_NoContent_100-8         238.00 ± 0%   44.00 ± 0%  -81.51% (p=0.002 n=6)
Collection_Query_NoContent_1000-8       2038.50 ± 0%   44.00 ± 0%  -97.84% (p=0.002 n=6)
Collection_Query_NoContent_5000-8      10039.00 ± 0%   44.00 ± 0%  -99.56% (p=0.002 n=6)
Collection_Query_NoContent_25000-8     50038.00 ± 0%   44.00 ± 0%  -99.91% (p=0.002 n=6)
Collection_Query_NoContent_100000-8   200038.00 ± 0%   44.00 ± 0%  -99.98% (p=0.002 n=6)
Collection_Query_100-8                   238.00 ± 0%   44.00 ± 0%  -81.51% (p=0.002 n=6)
Collection_Query_1000-8                 2038.00 ± 0%   44.00 ± 0%  -97.84% (p=0.002 n=6)
Collection_Query_5000-8                10038.00 ± 0%   44.00 ± 0%  -99.56% (p=0.002 n=6)
Collection_Query_25000-8               50038.00 ± 0%   44.00 ± 0%  -99.91% (p=0.002 n=6)
Collection_Query_100000-8             200038.50 ± 0%   44.00 ± 0%  -99.98% (p=0.002 n=6)
geomean                                  8.661k        44.00       -99.49%

Benchmarked on Framework Laptop 13 (first generation).

Benchmarked before the first commit of this PR, and after.

Benchmarked with: go test -benchmem -run=^$ -count 6 -bench . (a count of 6 because benchstat, which was used to print the diff shown above, asks for it).

Not relevant for a single query, but relevant for concurrent ones.

For now we check this by computing the length of the vector. In the future we could pass a flag when it is already known whether a vector is normalized, which is the case for many embedding models.

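The note above does not spell out what the check looks like. As a minimal sketch, assuming the check compares the squared length of the vector against 1 within a small tolerance: the function names and the tolerance are hypothetical and not taken from chromem-go. The second function illustrates why the check is useful: for two unit-length vectors, cosine similarity reduces to a plain dot product.

```go
package main

import "math"

// isNormalized reports whether v has unit length.
// It compares the squared length against 1, which is equivalent for unit
// vectors and avoids a square root. The tolerance is an assumption.
func isNormalized(v []float32) bool {
	var sumSq float64
	for _, x := range v {
		sumSq += float64(x) * float64(x)
	}
	return math.Abs(sumSq-1) < 1e-6
}

// dotProduct returns the dot product of a and b.
// Both slices are assumed to have the same length. When both are
// unit-length vectors, the result equals their cosine similarity.
func dotProduct(a, b []float32) float32 {
	var sum float32
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}
```

A caller could run isNormalized once per vector and fall back to normalizing it only when the check fails.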
Greatly reduces the number of allocations. For a query over 5,000 documents, it goes from ~5,000 allocations to ~50. The number of allocations is now also constant, i.e. still ~50 when querying 100,000 documents.

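The note does not say how the allocation count was made independent of the number of documents. One common way to get that property, shown here purely as an assumption and not as the change made in this PR, is to allocate the result slice once at its final length instead of allocating per document. The type and function names are hypothetical.

```go
package main

// scored pairs a document index with its similarity to the query.
type scored struct {
	index      int
	similarity float32
}

// scoreAll computes the dot product of query with every embedding.
// Every embedding is assumed to have the same length as query.
// The result slice is allocated once with its final length, so the number
// of allocations in this function does not grow with len(embeddings).
func scoreAll(query []float32, embeddings [][]float32) []scored {
	results := make([]scored, len(embeddings))
	for i, e := range embeddings {
		var sum float32
		for j := range query {
			sum += query[j] * e[j]
		}
		results[i] = scored{index: i, similarity: sum}
	}
	return results
}
```

With this shape the bytes allocated still scale with the number of documents, but the allocation count stays flat.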
- Normalizes each embedding only once instead of on every query
- Embedding creation takes time anyway, while a query should be as fast as possible
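The two points above can be sketched as follows, assuming embeddings are normalized at the moment they are stored so that queries never have to do it. The store type and its add method are hypothetical stand-ins for illustration, not the chromem-go API.

```go
package main

import "math"

// normalize returns a unit-length copy of v.
// A zero vector has no direction, so it is returned as an unchanged copy.
func normalize(v []float32) []float32 {
	var sumSq float64
	for _, x := range v {
		sumSq += float64(x) * float64(x)
	}
	out := make([]float32, len(v))
	if sumSq == 0 {
		copy(out, v)
		return out
	}
	norm := float32(math.Sqrt(sumSq))
	for i, x := range v {
		out[i] = x / norm
	}
	return out
}

// store is a hypothetical in-memory collection of embeddings.
type store struct {
	embeddings [][]float32
}

// add normalizes the embedding once, at insertion time, so the query path
// can rely on every stored vector already having unit length.
func (s *store) add(v []float32) {
	s.embeddings = append(s.embeddings, normalize(v))
}
```

The cost moves to the write path, where creating the embedding already dominates, and the read path only has to compute similarities.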

@philippgille merged commit acb1e3f into main on Mar 16, 2024
2 checks passed
@philippgille deleted the query-perf branch on March 16, 2024 18:29