Slow writer performance with the current default heap size #118
Comments
The tantivy docs for the writer settings don't describe the consequences of setting the heap larger or smaller. I'd be happy to make improvements to those docs once I understand those consequences myself ;)
Based on some threads I read on Discord: is this the same setting as the one in Quickwit, which currently defaults to 2GB? https://quickwit.io/docs/configuration/index-config#indexer-memory-usage
Please have a look at the thread over at quickwit-oss/tantivy#2156 (comment). The main point is that the memory accounting got more accurate, meaning the indexer used to use more memory than the configured buffer limit allowed. Now it stays much closer to that limit, but this also means that the same nominal limit implies less buffering and more commits, which is what you are experiencing. I think the main thing here is that the Rust bindings force one to make a choice via the mandatory (Additionally, I think the actual memory consumption has somewhat increased due to the new columnar fast field storage. But whether this really affects a given use case also depends on the schema and data in question.)
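To make the effect of tighter accounting concrete, here is a toy model in plain Python. It is only a sketch and does not use tantivy: `count_flushes`, the 1 kB document size, and the 2x overshoot factor for the older accounting are all made-up assumptions for illustration, not measured values or tantivy internals.

```python
def count_flushes(doc_sizes, budget_bytes):
    """Count how often a buffer with the given byte budget fills up and is flushed."""
    flushes = 0
    buffered = 0
    for size in doc_sizes:
        buffered += size
        if buffered >= budget_bytes:
            # Budget reached: write out the buffer and start over.
            flushes += 1
            buffered = 0
    return flushes


# Hypothetical workload: 100,000 documents taking ~1 kB of buffer space each.
docs = [1_000] * 100_000

nominal = 3_000_000  # the limit mentioned in this issue

# ASSUMPTION: the older, less accurate accounting let real usage overshoot
# the nominal limit by 2x. The factor is invented for this sketch.
old_effective = nominal * 2

print("old accounting, nominal 3 MB:", count_flushes(docs, old_effective))
print("accurate accounting, 3 MB:   ", count_flushes(docs, nominal))
print("accurate accounting, 30 MB:  ", count_flushes(docs, 30_000_000))
```

In this model the same nominal limit roughly doubles the number of flushes once the accounting is accurate, and a ten times larger limit cuts them by about a factor of ten. Real behaviour will also depend on the schema, the data, and the number of indexing threads.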
@adamreichold Circling back to this discussion.
While upgrading another application to use current head tantivy-py, I am finding that the default heap limit of 3000000 seems to cause very frequent commits while adding documents. It just doesn't seem large enough. I can improve performance by increasing the heap size, but I'm thinking the current default is going to cause surprisingly poor performance for a lot of people once they upgrade.
What are your thoughts on this? Is there a more typical "good" value to use as a default? I am not familiar with the tantivy work between 0.19.2 and 0.20.1 that led to this apparent change in behaviour.