os error 22 when indexing #370

Open
mmisiewicz opened this issue Dec 12, 2024 · 1 comment
Comments

@mmisiewicz

On macOS 15.2, running Postgres 16 from Homebrew and building Lantern from master, the following inscrutable error is thrown when using an external index:

[+] [Lantern External Index] New connection: 127.0.0.1:63501
[*] [Lantern External Index] Number of available CPU cores: 20
[*] [Lantern External Index] Index Params - pq: false, metric_kind: Cos, quantization: I8, dim: 768, m: 34, ef_construction: 256, ef: 256, num_subvectors: 0, num_centroids: 0, element_bits: 32
[*] [Lantern External Index] Creating index with parameters dimensions=768 m=34 ef=256 ef_construction=256, hardware_acceleration=serial
[*] [Lantern External Index] Estimated capcity is 38979200
[+] [Lantern External Index] Indexed 3897920 tuples [speed 1110 tuples/s]...
[+] [Lantern External Index] Indexing took 6948s, indexed 7764322 items
[*] [Lantern External Index] Start streaming index
[+] [Lantern External Index] Writing index to file took 12s722ms
[+] [Lantern External Index] Reading index file took 1s662ms
[X] [Lantern External Index] Indexing error: Invalid argument (os error 22)

The corresponding command on the postgres side:

 create index on  embeddings_denormalized using lantern_hnsw (cast(vec as vector(768)) dist_vec_cos_ops) with (m = 34, ef_construction = 256, ef = 256, dim = 768, quant_bits = 8, external=true) where abs(ulid_hash(page_id) % 100) < 20;
INFO:  done init usearch index
INFO:  connecting to external indexing server on 127.0.0.1:8998
INFO:  successfully connected to external indexing server
ERROR:  external index error: Invalid argument (os error 22)
Time: 6963287.590 ms (01:56:03.288)

The partial index (the where clause) has no impact on the error; I only added it because the table is large and debugging this problem is a PITA.

Lantern was invoked using:

lantern-cli start-indexing-server --tmp-dir /opt/homebrew/var/

Thinking this might be Gatekeeper-related, I tried changing --tmp-dir, but it had no impact.
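
A note on the error text itself: "os error 22" appears to be a raw OS errno value that the indexing server passes through unchanged. A minimal Python sketch for decoding it, assuming a POSIX system such as macOS or Linux (this is not part of Lantern, and the variable names are illustrative only):

```python
import errno
import os

# Raw errno value taken from the log line
# "Indexing error: Invalid argument (os error 22)".
code = 22

# Symbolic name of the errno value (assumes a POSIX errno table).
name = errno.errorcode.get(code, "UNKNOWN")

# Human-readable message the OS associates with that errno value.
message = os.strerror(code)

print(f"os error {code}: {name} ({message})")
```

On POSIX systems the value 22 maps to EINVAL, a generic code that many system calls can return, so on its own it does not identify which call failed during the index streaming step.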

@var77
Collaborator

var77 commented Dec 12, 2024

Hi @mmisiewicz, thanks for reporting the issue. Can you try the following cases and let us know which of them succeed, so we can narrow down where the issue is coming from?

  1. Index less data (e.g. 10k items) with the same parameters. You can create a smaller table from your original one with CREATE TABLE embeddings_test AS SELECT * FROM embeddings_denormalized LIMIT 10000; and then run the indexing on the embeddings_test table.
  2. If that fails again, try indexing the embeddings_test table without scalar quantization: create index on embeddings_test using lantern_hnsw (cast(vec as vector(768)) dist_vec_cos_ops) with (m = 34, ef_construction = 256, ef = 256, dim = 768, external=true);

Also, can you share the data type of the vec column? If it is REAL[], you can avoid the cast and use dist_cos_ops directly.
