
@ahuber21
Contributor

NOTE: This comes from copilot and it's simply supposed to kick off the discussion. Take the explanation with a grain of salt.

Summary

Mitigates the transient ~3× RSS spike observed when loading large SVS indexes by deferring page population and avoiding oversized hugepage attempts. Memory mappings no longer eagerly fault in pages unless explicitly requested.

Root Cause

Version 0.0.8 introduced iterative hugepage allocation plus unconditional use of MAP_POPULATE (and for file mappings also MAP_POPULATE + MAP_NORESERVE). When loading an index:

  1. Multiple anonymous mappings were attempted (1 GiB, 2 MiB, then 4 KiB).
  2. Each attempt used MAP_POPULATE, immediately touching pages.
  3. Fallbacks left earlier mappings briefly resident before they were unmapped, producing a short-lived peak of ~3× the steady-state index size.

Key Changes

File: ScalableVectorSearch/include/svs/core/allocator.h

  1. Added opt-in environment variable SVS_ENABLE_MAP_POPULATE.

    • Default behavior: Do NOT add MAP_POPULATE.
    • If SVS_ENABLE_MAP_POPULATE is set (non-empty), hugepage and file-backed mappings will use MAP_POPULATE (Linux only).
  2. Added heuristic to skip hugepage options whose rounded allocation would exceed 2 * requested_bytes.

    • Prevents allocating vastly larger temporary anonymous mappings prior to fallback.
  3. Updated hugepage allocation loop to apply the heuristic before calling mmap.

  4. Updated MemoryMapper to respect SVS_ENABLE_MAP_POPULATE (removed unconditional population).

  5. Added documentation comment explaining the new environment variable.

No public API signatures changed; behavior is controlled purely by an environment knob.
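
The size heuristic from item 2 can be illustrated with a small sketch. This is not the code in allocator.h; it only reproduces the arithmetic described above. The names `round_up` and `candidate_page_sizes` are hypothetical, and treating the regular 4 KiB page as an always-available final fallback is an assumption based on the heuristic being described as applying to hugepage options.

```python
# Hypothetical illustration of the hugepage-skip heuristic; not the SVS implementation.
KIB = 1 << 10
MIB = 1 << 20
GIB = 1 << 30

# Hugepage sizes tried largest first, as listed in the root-cause section.
HUGEPAGE_SIZES = [1 * GIB, 2 * MIB]
# Assumption: the regular page size is always kept as the final fallback.
REGULAR_PAGE_SIZE = 4 * KIB


def round_up(nbytes, page_size):
    """Round nbytes up to the next multiple of page_size."""
    return -(-nbytes // page_size) * page_size


def candidate_page_sizes(requested_bytes):
    """Return the page sizes worth attempting for this request.

    A hugepage size is skipped when rounding the request up to it
    would allocate more than 2 * requested_bytes.
    """
    kept = [
        page
        for page in HUGEPAGE_SIZES
        if round_up(requested_bytes, page) <= 2 * requested_bytes
    ]
    return kept + [REGULAR_PAGE_SIZE]
```

For example, a 300 MiB request rounds up to 1 GiB with 1 GiB pages, which exceeds 600 MiB, so under this sketch only the 2 MiB and 4 KiB options would be attempted.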

Behavior Before vs After

Scenario | RSS right after load | Steady-state RSS | Notes
--- | --- | --- | ---
Previous (always populated) | ~2–3× index size | ~1× index size | Multiple populated mappings overlap briefly
New default (no populate) | ~1–1.2× index size | ~1× index size | Minor transient overhead only for construction buffers
With SVS_ENABLE_MAP_POPULATE=1 | Similar to previous | ~1× index size | Opt-in if early page faults are desired for latency benchmarking

Backwards Compatibility

  • Default runtime memory behavior is improved (lower spike).
  • Users needing deterministic page residency for benchmarks can set SVS_ENABLE_MAP_POPULATE.
  • No API or ABI changes; only allocator internals.

Performance Impact

  • Without MAP_POPULATE, the first access to each page incurs a page fault. This cost is typically amortized and negligible for sustained search workloads.
  • For latency-sensitive cold-start benchmarking, enabling SVS_ENABLE_MAP_POPULATE restores prior eager population.
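
As a rough illustration of the opt-in semantics, the sketch below gates MAP_POPULATE on the environment variable when creating an anonymous mapping. It uses Python's standard mmap module rather than the C++ allocator, so it shows the idea only. The helper names are hypothetical, "set and non-empty" follows the description above (so a value of "0" would also enable it), and mmap.MAP_POPULATE is only available on Linux with Python 3.10 or newer.

```python
import mmap
import os

ENV_VAR = "SVS_ENABLE_MAP_POPULATE"  # name taken from the PR description


def populate_enabled(environ=os.environ):
    """True when the variable is set to any non-empty value."""
    return bool(environ.get(ENV_VAR, ""))


def anonymous_mmap_flags(environ=os.environ):
    """Build mmap flags, adding MAP_POPULATE only on explicit opt-in."""
    flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS
    if populate_enabled(environ):
        # Falls back to 0 where the constant is missing (non-Linux, Python < 3.10).
        flags |= getattr(mmap, "MAP_POPULATE", 0)
    return flags


def map_anonymous(nbytes, environ=os.environ):
    """Create an anonymous mapping; pages are faulted in lazily by default."""
    return mmap.mmap(-1, nbytes, flags=anonymous_mmap_flags(environ))
```

Without the flag, resident memory grows only as pages are first touched; with it, the kernel attempts to fault the whole range in during the mmap call.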

How to Test / Verify

  1. Build SVS as usual.
  2. Run an index load workload twice:
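    • Baseline (default):
      /usr/bin/time -v ./your_index_loader_binary
      # While the loader is still running, from a second shell:
      grep -E '^VmRSS|^VmHWM' /proc/$(pgrep -n your_index_loader_binary)/status
    • With population:
      SVS_ENABLE_MAP_POPULATE=1 /usr/bin/time -v ./your_index_loader_binary
  3. Compare peak RSS ("Maximum resident set size" from /usr/bin/time -v, or VmHWM in /proc). Expect a pronounced reduction in the default run. Note that VmPeak reports peak virtual memory size, not peak RSS.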
  4. Optional fine-grained script:
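    python - <<'PY'
    import subprocess
    import time
    import psutil  # third-party: pip install psutil

    p = subprocess.Popen(["./your_index_loader_binary"])
    proc = psutil.Process(p.pid)
    rss_log = []
    # Sample RSS every 50 ms until the loader exits.
    while p.poll() is None:
        try:
            rss_log.append(proc.memory_info().rss)
        except psutil.Error:
            break  # process exited between poll() and the sample
        time.sleep(0.05)
    print("Peak RSS MiB:", max(rss_log, default=0) / 1024 / 1024)
    PY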
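
If psutil is not available, the same figures can be read directly from /proc. The sketch below is a minimal, Linux-only helper (the function names are hypothetical) that extracts VmRSS (current resident size) and VmHWM (peak resident size) from the status file, both of which the kernel reports in kB.

```python
def parse_status_kib(status_text, fields=("VmRSS", "VmHWM")):
    """Extract selected fields (reported in kB) from /proc/<pid>/status text."""
    values = {}
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        if name in fields:
            # The value looks like "   51200 kB"; keep only the number.
            values[name] = int(rest.split()[0])
    return values


def read_status_kib(pid="self"):
    """Read VmRSS and VmHWM for a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_status_kib(handle.read())
```

Calling `read_status_kib(pid)` shortly before the loader exits gives its peak resident size in the VmHWM entry.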

Risks & Mitigations

  • Risk: Some users may have silently relied on eager population for consistent first-query latency.
    • Mitigation: Clear opt-in via SVS_ENABLE_MAP_POPULATE.
  • Risk: Heuristic (> 2× requested) might skip large hugepages that could be efficient.
    • Mitigation: Value is conservative; follow-up can expose a tunable threshold or dynamic estimation.

Follow-Up Opportunities

  • Add an indexed configuration knob rather than an environment variable (e.g., advanced build parameter).
  • Track allocation attempts with optional logging flag for diagnostics.
  • Consider adaptive page population: populate only final mapping after successful allocation.
  • Provide a micro-benchmark measuring cold-start query latency under both modes.

Documentation

  • Added inline comment in allocator.h. A short note should be added to HISTORY.md / README.md describing the new env var (pending in this PR if not yet added).

Checklist

  • Internal allocator changes applied
  • Resolve build warnings/exception macro issues if still present (ensure CI passes)
  • Add release note entry
  • (Optional) Add memory regression test script

@ahuber21 ahuber21 closed this Oct 29, 2025