[Bug]: [ milvus-2.4.4 ] error="incomplete query result, missing id xxxxxxx, len(searchIDs) = 40, len(queryIDs) = 20" #34021
Comments
@laozhu1900 Hello, could you please provide the collection schema?
PrimaryKey VarChar(218),
Hi, @laozhu1900, I am unable to reproduce this issue. Could you please confirm whether my script differs from yours? If it does not, could you try running it to see if the issue persists?
import time
import string
import random
import numpy as np
from pymilvus import (
    connections,
    utility,
    FieldSchema, CollectionSchema, DataType,
    Collection,
)
fmt = "\n=== {:30} ===\n"
search_latency_fmt = "search latency = {:.4f}s"
num_entities, dim = 1218, 256
str_length = 218
#################################################################################
# connect to Milvus
print(fmt.format("start connecting to Milvus"))
connections.connect("default", host="localhost", port="19530")
has = utility.has_collection("hello_milvus")
print(f"Does collection hello_milvus exist in Milvus: {has}")
###############################################################################
print(fmt.format("Drop collection `hello_milvus`"))
utility.drop_collection("hello_milvus")
#################################################################################
# create collection
fields = [
    FieldSchema(name="pk", dtype=DataType.VARCHAR, is_primary=True, auto_id=False, max_length=str_length),
    FieldSchema(name="s0", dtype=DataType.VARCHAR, max_length=str_length),
    FieldSchema(name="s1", dtype=DataType.VARCHAR, max_length=str_length),
    FieldSchema(name="s2", dtype=DataType.VARCHAR, max_length=str_length),
    FieldSchema(name="s3", dtype=DataType.VARCHAR, max_length=str_length),
    FieldSchema(name="s4", dtype=DataType.VARCHAR, max_length=str_length),
    FieldSchema(name="s5", dtype=DataType.VARCHAR, max_length=str_length),
    FieldSchema(name="s6", dtype=DataType.VARCHAR, max_length=str_length),
    FieldSchema(name="s7", dtype=DataType.VARCHAR, max_length=str_length),
    FieldSchema(name="s8", dtype=DataType.VARCHAR, max_length=str_length),
    FieldSchema(name="s9", dtype=DataType.VARCHAR, max_length=str_length),
    FieldSchema(name="random0", dtype=DataType.DOUBLE),
    FieldSchema(name="random1", dtype=DataType.FLOAT),
    FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim),
]
schema = CollectionSchema(fields, "hello_milvus is the simplest demo to introduce the APIs")
print(fmt.format("Create collection `hello_milvus`"))
hello_milvus = Collection("hello_milvus", schema, consistency_level="Strong")
################################################################################
# create index
print(fmt.format("Start Creating index IVF_SQ8"))
index = {
    "index_type": "IVF_SQ8",
    "metric_type": "L2",
    "params": {"nlist": 128},
}
hello_milvus.create_index("embeddings", index)
################################################################################
# load
print(fmt.format("Start loading"))
hello_milvus.load()
################################################################################
# insert data
def randomstr(length):
    # Build a random lowercase ASCII string of the requested length.
    letters = string.ascii_lowercase
    return ''.join(random.choice(letters) for _ in range(length))
print(fmt.format("Start inserting entities"))
rng = np.random.default_rng(seed=19530)
entities = [
    # provide the pk field because `auto_id` is set to False
    [str(f'primary_key_{i}') for i in range(num_entities)],
    [randomstr(str_length) for i in range(num_entities)],
    [randomstr(str_length) for i in range(num_entities)],
    [randomstr(str_length) for i in range(num_entities)],
    [randomstr(str_length) for i in range(num_entities)],
    [randomstr(str_length) for i in range(num_entities)],
    [randomstr(str_length) for i in range(num_entities)],
    [randomstr(str_length) for i in range(num_entities)],
    [randomstr(str_length) for i in range(num_entities)],
    [randomstr(str_length) for i in range(num_entities)],
    [randomstr(str_length) for i in range(num_entities)],
    rng.random(num_entities).tolist(),  # field random0, only supports list
    rng.random(num_entities).tolist(),  # field random1, only supports list
    rng.random((num_entities, dim), np.float32),  # field embeddings, supports numpy.ndarray and list
]
insert_result = hello_milvus.upsert(entities)
# hello_milvus.flush()
# print(f"Number of entities in Milvus: {hello_milvus.num_entities}") # check the num_entities
# -----------------------------------------------------------------------------
# search based on vector similarity
print(fmt.format("Start searching based on vector similarity"))
vectors_to_search = entities[-1][-1:]
search_params = {
    "metric_type": "L2",
    "params": {"nprobe": 10},
}
for i in range(10):
    # Repeat the same search and print the latency of each run.
    start_time = time.time()
    result = hello_milvus.search(vectors_to_search, "embeddings", search_params, limit=1024, output_fields=["*"])
    end_time = time.time()
    print(search_latency_fmt.format(end_time - start_time))
OK, I'll try it.
/assign @laozhu1900
Using this demo, I can't reproduce it. In my demo, if the output fields include embeddings, it reports an error ...
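To narrow down which combinations fail, one option is to run the same search with and without the vector field in `output_fields`, at several `limit` values, and record which calls raise. The sketch below is only an illustration: `probe_search` and its parameters are hypothetical names, and `collection` is assumed to be a loaded pymilvus `Collection` (any object with a compatible `search` method works).

```python
def probe_search(collection, vectors, search_params, limits, field_sets,
                 anns_field="embeddings"):
    """Run one search per (limit, output_fields) pair and record the outcome.

    Returns a list of (limit, output_fields, error) tuples, where error is
    None on success or the exception message on failure.
    """
    outcomes = []
    for limit in limits:
        for fields in field_sets:
            try:
                collection.search(
                    vectors, anns_field, search_params,
                    limit=limit, output_fields=list(fields),
                )
                outcomes.append((limit, tuple(fields), None))
            except Exception as exc:  # broad on purpose: the failure is only logged
                outcomes.append((limit, tuple(fields), str(exc)))
    return outcomes
```

With the script above it could be called as `probe_search(hello_milvus, vectors_to_search, search_params, [500, 592, 593, 1024], [["pk"], ["pk", "embeddings"], ["*"]])`; the limit values are just examples taken from the numbers mentioned in this thread.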
@laozhu1900 Do you happen to have a reproducible code snippet you could share? We can try to reproduce it in-house.
@laozhu1900 Have you manually set the timeout for the search request?
@laozhu1900 Additionally, could you dump all the upsert data and provide it to us?
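If it helps with the dump, here is a minimal sketch that writes column-based upsert data (the layout used by `entities` in the script above) as JSON Lines, one record per row. `dump_entities`, `field_names`, and the output path are hypothetical names chosen for illustration; adapt them to the real schema.

```python
import json


def dump_entities(columns, field_names, path):
    """Write column-based insert data as one JSON object per line.

    Returns the number of rows written.
    """
    if len(columns) != len(field_names):
        raise ValueError("one column is required per field name")
    rows = 0
    with open(path, "w", encoding="utf-8") as fh:
        for values in zip(*columns):
            # Convert NumPy rows/scalars to plain Python so json can encode them.
            record = {
                name: value.tolist() if hasattr(value, "tolist") else value
                for name, value in zip(field_names, values)
            }
            fh.write(json.dumps(record) + "\n")
            rows += 1
    return rows
```

For the script above, `field_names` would be the schema's field names in insertion order, for example `[f.name for f in schema.fields]`.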
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Email received, thank you.
Is there an existing issue for this?
Environment
Current Behavior
Index type: IVF_SQ8, metric_type: IP.
The primary key type is VarChar, generated from a UUID.
The collection has 15 columns, including VarChar and Int64.
I upserted 1218 records into Milvus two days ago.
When I search with a TOPK less than 592, the search succeeds; with a TOPK greater than 592, it reports the error.
If I call flush() first, searches with a TOPK greater than 592 also succeed.
Is this caused by an inconsistency between the index and the data?
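The flush() workaround described above can be sketched as follows. This only mirrors the observed behavior and says nothing about the root cause. `upsert_flush_search` is a hypothetical helper, and `collection` is assumed to be a pymilvus `Collection`, whose `upsert`, `flush`, and `search` methods are used here.

```python
def upsert_flush_search(collection, entities, vectors, search_params, limit,
                        anns_field="embeddings", output_fields=None):
    """Upsert, seal the data with flush(), then search.

    Mirrors the workaround reported in this issue; not a fix for the bug.
    """
    collection.upsert(entities)
    # flush() seals the growing segments, so the search below reads sealed data.
    collection.flush()
    return collection.search(
        vectors, anns_field, search_params,
        limit=limit, output_fields=output_fields or ["*"],
    )
```

Flushing after every write is generally discouraged in Milvus, so this is better suited to confirming the behavior than to production use.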
Expected Behavior
No response
Steps To Reproduce
No response
Milvus Log
No response
Anything else?
No response