fix: reduce index load memory spike #204

ahuber21 · 2025-10-28T16:46:44Z

NOTE: This comes from copilot and it's simply supposed to kick off the discussion. Take the explanation with a grain of salt.

Summary

Mitigates the transient ~3× RSS spike observed when loading large SVS indexes by deferring page population and avoiding oversized hugepage attempts. Memory mappings no longer eagerly fault in pages unless explicitly requested.

Root Cause

Version 0.0.8 introduced iterative hugepage allocation plus unconditional use of MAP_POPULATE (and for file mappings also MAP_POPULATE + MAP_NORESERVE). When loading an index:

Multiple anonymous mappings were attempted (1 GiB, 2 MiB, then 4 KiB).
Each attempt used MAP_POPULATE, immediately touching pages.
Fallbacks left earlier mappings briefly resident before they were unmapped, producing a short-lived peak of ~3× the steady-state index size.

Key Changes

File: ScalableVectorSearch/include/svs/core/allocator.h

Added opt-in environment variable SVS_ENABLE_MAP_POPULATE.
- Default behavior: Do NOT add MAP_POPULATE.
- If SVS_ENABLE_MAP_POPULATE is set (non-empty), hugepage and file-backed mappings will use MAP_POPULATE (Linux only).
Added heuristic to skip hugepage options whose rounded allocation would exceed 2 * requested_bytes.
- Prevents allocating vastly larger temporary anonymous mappings prior to fallback.
Updated hugepage allocation loop to apply the heuristic before calling mmap.
Updated MemoryMapper to respect SVS_ENABLE_MAP_POPULATE (removed unconditional population).
Added documentation comment explaining the new environment variable.

No public API signatures changed; behavior is controlled purely by an environment knob.
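
The two rules above can be sketched in a few lines. This is an illustrative Python model, not the code in allocator.h: the helper names (`populate_enabled`, `round_up`, `candidate_page_sizes`) and the `max_overshoot` parameter are hypothetical, and the page-size order is taken from the Root Cause section.

```python
import os

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# Page sizes tried in order, largest first (as listed under Root Cause).
PAGE_SIZES = (1 * GIB, 2 * MIB, 4 * KIB)


def populate_enabled(env=None):
    """True when SVS_ENABLE_MAP_POPULATE is set to a non-empty value."""
    env = os.environ if env is None else env
    return bool(env.get("SVS_ENABLE_MAP_POPULATE", ""))


def round_up(nbytes, page_size):
    """Round nbytes up to the next multiple of page_size."""
    return -(-nbytes // page_size) * page_size


def candidate_page_sizes(requested_bytes, max_overshoot=2):
    """Yield (page_size, rounded_bytes) pairs that pass the size heuristic."""
    for page_size in PAGE_SIZES:
        rounded = round_up(requested_bytes, page_size)
        # Skip options whose rounded size would exceed 2 * requested_bytes.
        if rounded > max_overshoot * requested_bytes:
            continue
        yield page_size, rounded
```

Under this model, a 300 MiB request skips the 1 GiB option (it would round up to 1024 MiB, more than 600 MiB) and starts with 2 MiB pages, while a 600 MiB request still tries 1 GiB pages first.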

Behavior Before vs After

| Scenario | RSS right after load | Steady-state RSS | Notes |
| --- | --- | --- | --- |
| Previous (always populated) | ~2–3× index size | ~1× index size | Multiple populated mappings overlap briefly |
| New default (no populate) | ~1–1.2× index size | ~1× index size | Minor transient overhead only for construction buffers |
| With `SVS_ENABLE_MAP_POPULATE=1` | Similar to previous | ~1× | Opt-in if early page faults desired for latency benchmarking |

Backwards Compatibility

Default runtime memory behavior is improved (lower spike).
Users needing deterministic page residency for benchmarks can set SVS_ENABLE_MAP_POPULATE.
No API or ABI changes; only allocator internals.

Performance Impact

Without MAP_POPULATE, the first access to each page may incur slightly more page faults. This cost is typically amortized and negligible for sustained search workloads.
For latency-sensitive cold-start benchmarking, enabling SVS_ENABLE_MAP_POPULATE restores prior eager population.

How to Test / Verify

Build SVS as usual.

Run an index load workload twice:

Baseline (default):

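/usr/bin/time -v ./your_index_loader_binary
# From a second shell, while the loader is still running:
grep -E '^VmRSS|^VmHWM' /proc/$(pgrep -n your_index_loader_binary)/status

With population:

SVS_ENABLE_MAP_POPULATE=1 /usr/bin/time -v ./your_index_loader_binary

Compare peak RSS: the "Maximum resident set size" line printed by /usr/bin/time -v, or VmHWM in /proc/<pid>/status. (VmPeak reports peak virtual size, not RSS.) Expect a pronounced reduction in the default run.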

Optional fine-grained script:

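python - <<'PY'
import subprocess
import time

import psutil

# Launch the loader and sample its RSS every 50 ms until it exits.
p = subprocess.Popen(["./your_index_loader_binary"])
proc = psutil.Process(p.pid)
rss_log = []
while p.poll() is None:
    try:
        rss_log.append(proc.memory_info().rss)
    except psutil.Error:
        break  # process exited between poll() and the sample
    time.sleep(0.05)
print("Peak RSS MiB:", max(rss_log, default=0) / 1024 / 1024)
PY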
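
To compare the two runs numerically, the peak can also be read from the `/usr/bin/time -v` reports. The sketch below assumes GNU time's verbose format, which prints a "Maximum resident set size (kbytes)" line, and that each report was saved to a file (for example with GNU time's `-o` option). The function names are hypothetical.

```python
import re

_MAX_RSS = re.compile(r"Maximum resident set size \(kbytes\):\s*(\d+)")


def peak_rss_mib(time_output):
    """Extract peak RSS in MiB from `/usr/bin/time -v` output text."""
    match = _MAX_RSS.search(time_output)
    if match is None:
        raise ValueError("no 'Maximum resident set size' line found")
    return int(match.group(1)) / 1024


def spike_ratio(default_output, populate_output):
    """Ratio of peak RSS with population to peak RSS without it."""
    return peak_rss_mib(populate_output) / peak_rss_mib(default_output)
```

A ratio well above 1 would indicate that the default run has a lower peak than the run with `SVS_ENABLE_MAP_POPULATE=1`.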

Risks & Mitigations

Risk: Some users may have silently relied on eager population for consistent first-query latency.
- Mitigation: Clear opt-in via SVS_ENABLE_MAP_POPULATE.
Risk: Heuristic (> 2× requested) might skip large hugepages that could be efficient.
- Mitigation: Value is conservative; follow-up can expose a tunable threshold or dynamic estimation.

Follow-Up Opportunities

Add an index-level configuration knob rather than an environment variable (e.g., an advanced build parameter).
Track allocation attempts with optional logging flag for diagnostics.
Consider adaptive page population: populate only final mapping after successful allocation.
Provide a micro-benchmark measuring cold-start query latency under both modes.

Documentation

Added inline comment in allocator.h. A short note should be added to HISTORY.md / README.md describing the new env var (pending in this PR if not yet added).

Checklist

Internal allocator changes applied
Resolve build warnings/exception macro issues if still present (ensure CI passes)
Add release note entry
(Optional) Add memory regression test script

fix: reduce index load memory spike; MAP_POPULATE requires SVS_ENABLE…

05b49f0

…_MAP_POPULATE

ahuber21 closed this Oct 29, 2025
