fix: reduce index load memory spike #204
Closed
Summary
Mitigates the transient ~3× RSS spike observed when loading large SVS indexes by deferring page population and avoiding oversized hugepage attempts. Memory mappings no longer eagerly fault in pages unless explicitly requested.
Root Cause
Version 0.0.8 introduced iterative hugepage allocation plus unconditional use of MAP_POPULATE (and, for file mappings, MAP_POPULATE + MAP_NORESERVE). When loading an index:
- Mappings were created with MAP_POPULATE, immediately touching pages.
Key Changes
File: ScalableVectorSearch/include/svs/core/allocator.h
- Added opt-in environment variable SVS_ENABLE_MAP_POPULATE.
  - By default, mappings no longer use MAP_POPULATE.
  - If SVS_ENABLE_MAP_POPULATE is set (non-empty), hugepage and file-backed mappings will use MAP_POPULATE (Linux only).
- Added heuristic to skip hugepage options whose rounded allocation would exceed 2 * requested_bytes.
- Updated hugepage allocation loop to apply the heuristic before calling mmap.
- Updated MemoryMapper to respect SVS_ENABLE_MAP_POPULATE (removed unconditional population).
- Added documentation comment explaining the new environment variable.
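The hugepage skip heuristic can be sketched roughly as follows. This is a minimal illustration, not the code in allocator.h: the helper names `round_up`, `should_skip_page_size`, and `usable_page_sizes` are hypothetical, as are the candidate page sizes used in the usage note.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Round `bytes` up to the next multiple of `page_size`.
// Overflow for extremely large requests is ignored in this sketch.
inline std::size_t round_up(std::size_t bytes, std::size_t page_size) {
    assert(page_size > 0);
    return ((bytes + page_size - 1) / page_size) * page_size;
}

// Hypothetical helper: true when rounding the request up to `page_size`
// would allocate more than twice what was asked for.
inline bool should_skip_page_size(std::size_t requested_bytes, std::size_t page_size) {
    return round_up(requested_bytes, page_size) > 2 * requested_bytes;
}

// Hypothetical helper: filter the candidate page sizes before any mmap attempt,
// preserving the order in which they were given.
inline std::vector<std::size_t> usable_page_sizes(
    std::size_t requested_bytes, const std::vector<std::size_t>& candidates
) {
    std::vector<std::size_t> usable;
    for (std::size_t page_size : candidates) {
        if (!should_skip_page_size(requested_bytes, page_size)) {
            usable.push_back(page_size);
        }
    }
    return usable;
}
```

For example, with assumed candidates of 1 GiB and 2 MiB, a 3 MiB request skips the 1 GiB option (it would round up to 1 GiB, far more than 6 MiB) but keeps the 2 MiB option (it rounds up to 4 MiB).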
No public API signatures changed; behavior is controlled purely by an environment knob.
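As an illustration of the environment knob, the opt-in check might look something like the sketch below. The function names `map_populate_enabled` and `with_optional_populate` are hypothetical and are not taken from allocator.h; only the variable name SVS_ENABLE_MAP_POPULATE and the "set and non-empty" rule come from this PR.

```cpp
#include <cassert>
#include <cstdlib>
#include <sys/mman.h>

// Hypothetical helper: true when SVS_ENABLE_MAP_POPULATE is set to a
// non-empty value; unset or empty means "do not populate eagerly".
inline bool map_populate_enabled() {
    const char* value = std::getenv("SVS_ENABLE_MAP_POPULATE");
    return value != nullptr && value[0] != '\0';
}

// Hypothetical helper: add MAP_POPULATE to the mmap flags only when the
// user opted in. On platforms without MAP_POPULATE the flags are unchanged.
inline int with_optional_populate(int base_flags) {
#if defined(__linux__) && defined(MAP_POPULATE)
    if (map_populate_enabled()) {
        return base_flags | MAP_POPULATE;
    }
#endif
    return base_flags;
}
```

A caller would pass the result as the flags argument to mmap, for example `with_optional_populate(MAP_PRIVATE | MAP_ANONYMOUS)`. This sketch reads the variable on every call; caching the result once at startup would be a reasonable alternative.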
Behavior Before vs After
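SVS_ENABLE_MAP_POPULATE=1
Backwards Compatibility
- The previous eager-population behavior remains available by setting SVS_ENABLE_MAP_POPULATE.
Performance Impact
- Without MAP_POPULATE, pages are faulted in on first access rather than at mapping time. Typically amortized and negligible for sustained search workloads.
- Setting SVS_ENABLE_MAP_POPULATE restores prior eager population.
How to Test / Verify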
Risks & Mitigations
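- Workloads that relied on eager population can restore it with SVS_ENABLE_MAP_POPULATE.
- The hugepage heuristic (skip when the rounded size is > 2× requested) might skip large hugepages that could be efficient.
Follow-Up Opportunities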
Documentation
- A documentation comment was added in allocator.h. A short note should be added to HISTORY.md/README.md describing the new env var (pending in this PR if not yet added).
Checklist