Fix ByteBlockPool integer overflow by implementing buffer limit detection #15330
base: main
This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR. |
But this is not supported: the limits on IndexWriter are 2GB.
Maybe AI-generated? The bullet-point formatting looks characteristic. Not that that is banned or anything, but it might need additional scrutiny.
Hi @rmuir @msokolov, for context: I originally created issue #15152, where an OpenSearch user encountered the ByteBlockPool overflow during recovery.
I think the check for
Problem
ByteBlockPool uses 32KB buffers with an int offset tracker (byteOffset). When more than 65,535 buffers are allocated, the byteOffset calculation (byteOffset = bufferUpto * BYTE_BLOCK_SIZE) overflows, since 65,536 × 32,768 = 2^31 exceeds Integer.MAX_VALUE (2^31 − 1). The result is an ArithmeticException while indexing documents with very large numbers of tokens.
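The arithmetic can be checked in isolation. The following is a minimal standalone sketch, not Lucene code: the class name and the `byteOffset(int)` helper are hypothetical, and `Math.multiplyExact` is used here only as a stand-in that throws ArithmeticException on int overflow.

```java
// Standalone illustration of the overflow; this class is hypothetical and not part of Lucene.
final class ByteOffsetOverflowDemo {
  // 32 KB blocks, matching the buffer size described in the PR.
  static final int BYTE_BLOCK_SIZE = 1 << 15;

  // Largest buffer index whose start offset still fits in an int.
  static final int MAX_BUFFER_INDEX = Integer.MAX_VALUE / BYTE_BLOCK_SIZE;

  // Start offset of the buffer at the given index.
  // Math.multiplyExact throws ArithmeticException if the product overflows an int.
  static int byteOffset(int bufferUpto) {
    return Math.multiplyExact(bufferUpto, BYTE_BLOCK_SIZE);
  }

  public static void main(String[] args) {
    System.out.println(MAX_BUFFER_INDEX);             // 65535
    System.out.println(byteOffset(MAX_BUFFER_INDEX)); // 2147450880
    try {
      byteOffset(MAX_BUFFER_INDEX + 1);               // 65536 * 32768 = 2^31
    } catch (ArithmeticException e) {
      System.out.println("overflow at buffer index " + (MAX_BUFFER_INDEX + 1));
    }
  }
}
```

The last representable offset, 2,147,450,880, sits exactly one block below 2^31, which is why the next buffer cannot be addressed with an int.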
Root Cause
Solution
Implement proactive DWPT flushing when buffer count approaches the limit:
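As a rough illustration of the idea only, limit detection could look something like the sketch below. Every name in it (`BufferLimitSketch`, `FLUSH_HEADROOM`, `isApproachingLimit`) is hypothetical and the headroom value is an arbitrary assumption; it is not the code from this PR and not an existing Lucene API.

```java
// Hypothetical sketch of buffer-limit detection; not the PR's code and not a Lucene API.
final class BufferLimitSketch {
  static final int BYTE_BLOCK_SIZE = 1 << 15;

  // Number of 32 KB buffers whose start offsets fit in an int (indices 0..65535).
  static final int MAX_BUFFERS = Integer.MAX_VALUE / BYTE_BLOCK_SIZE + 1;

  // Assumed safety margin: signal a flush this many buffers before the hard limit.
  static final int FLUSH_HEADROOM = 1024;

  private int bufferCount;

  // Records one more allocated buffer, refusing to go past the addressable limit.
  void allocateBuffer() {
    if (bufferCount >= MAX_BUFFERS) {
      throw new IllegalStateException("buffer limit reached: " + MAX_BUFFERS);
    }
    bufferCount++;
  }

  // True once the pool is within FLUSH_HEADROOM buffers of the limit;
  // a caller could use this as the trigger for a proactive flush.
  boolean isApproachingLimit() {
    return bufferCount >= MAX_BUFFERS - FLUSH_HEADROOM;
  }

  // Models the pool being emptied after a flush.
  void reset() {
    bufferCount = 0;
  }
}
```

The point of the headroom is that the flush decision is made while there is still room to finish the current document, rather than at the moment the next allocation would overflow.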
Key Changes