@hodgesds hodgesds commented Oct 24, 2025

Add a new configuration option llc_sticky_runs that controls how many times a task must run on its current LLC before being allowed to migrate to a different LLC. This provides finer-grained control over LLC locality and complements the existing xllc_mig_min_us option.

The implementation tracks the number of consecutive runs on the current LLC in the task context (taskc->llc_runs). When the scheduler attempts to find an idle CPU and no idle CPUs are available on the current LLC, it checks if the task has run fewer times than the configured threshold. If so, cross-LLC migration is prevented, keeping the task sticky to its current LLC to preserve cache locality.
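The decision described above can be sketched as a small predicate. This is a minimal illustration under assumptions, not the code from this PR: the helper name `llc_sticky_blocks_xllc`, its parameter types, and the `idle_on_llc` flag are hypothetical, and the real check runs as BPF code inside `pick_idle_cpu()`.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the PR): returns true when a cross-LLC
 * migration should be refused.
 *
 * llc_runs        - consecutive runs on the current LLC (taskc->llc_runs)
 * llc_sticky_runs - per-layer threshold; 0 means the feature is disabled
 * idle_on_llc     - whether an idle CPU was found on the current LLC
 */
static bool llc_sticky_blocks_xllc(uint32_t llc_runs, uint32_t llc_sticky_runs,
				   bool idle_on_llc)
{
	/* An idle CPU on the current LLC means no migration is considered. */
	if (idle_on_llc)
		return false;

	/* Stickiness disabled: never block migration. */
	if (llc_sticky_runs == 0)
		return false;

	/* Block until the task has accumulated enough runs on this LLC. */
	return llc_runs < llc_sticky_runs;
}
```

With a threshold of 20, a task with 5 runs on its LLC would be held back, while a task with 20 or more runs would be free to migrate.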

Key changes:

  • Add llc_runs counter to task_ctx to track consecutive runs on current LLC
  • Add llc_sticky_runs configuration field to layer struct and LayerConfig in config.rs
  • Increment llc_runs in layered_running() when stickiness is enabled
  • Reset llc_runs to 0 in maybe_update_task_llc() on LLC migration
  • Add stickiness check in pick_idle_cpu() to prevent cross-LLC migration when task hasn't run enough times on current LLC
  • Add LSTAT_LLC_STICKY_SKIP statistic to track prevented migrations
  • Initialize llc_runs to 0 in layered_init_task()
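The counter lifecycle in the list above (initialize, increment, reset) could look roughly like the following. This is a userspace sketch with simplified stand-in structs; the `sketch_*` function names, the `llc_id` field, and the field types are assumptions made for illustration and do not reproduce the BPF code in the PR.

```c
#include <stdint.h>

/* Simplified stand-ins for the scheduler's structures (assumed shapes). */
struct task_ctx {
	uint32_t llc_id;   /* LLC the task last ran on (assumed field) */
	uint32_t llc_runs; /* consecutive runs on that LLC */
};

struct layer {
	uint32_t llc_sticky_runs; /* 0 = stickiness disabled */
};

/* Corresponds to the initialization step in layered_init_task(). */
static void sketch_init_task(struct task_ctx *taskc, uint32_t llc_id)
{
	taskc->llc_id = llc_id;
	taskc->llc_runs = 0;
}

/* Corresponds to layered_running(): count only when the feature is enabled. */
static void sketch_running(struct task_ctx *taskc, const struct layer *layer)
{
	if (layer->llc_sticky_runs)
		taskc->llc_runs++;
}

/* Corresponds to maybe_update_task_llc(): a new LLC restarts the count. */
static void sketch_update_task_llc(struct task_ctx *taskc, uint32_t new_llc_id)
{
	if (taskc->llc_id == new_llc_id)
		return;

	taskc->llc_id = new_llc_id;
	taskc->llc_runs = 0;
}
```

Skipping the increment when `llc_sticky_runs` is 0 keeps the disabled case free of extra bookkeeping, which matches the "when stickiness is enabled" wording above.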

The feature is disabled by default (llc_sticky_runs = 0) and can be configured per-layer. Example configuration:

  {
    "name": "batch",
    "kind": {
      "Confined": {
        "llc_sticky_runs": 20
      }
    }
  }

This would keep tasks on their current LLC for 20 runs before allowing cross-LLC migration, providing stronger LLC affinity for workloads that benefit from cache locality.

The new statistic appears in the output as:
xllc_mig/skip/sticky_skip={xllc_migration%}/{xllc_skip%}/{llc_sticky_skip%}
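As a rough illustration of how the three fields in that line relate to raw counters, the sketch below turns event counts into percentages and prints them in the same `a/b/c` layout. The function names, the choice of denominator (`total`), and the `%5.2f` field width are assumptions; the scheduler's actual stats code is not shown in this PR description.

```c
#include <stdint.h>
#include <stdio.h>

/* Percentage of `count` out of `total`; 0 when there were no events. */
static double pct(uint64_t count, uint64_t total)
{
	return total ? 100.0 * (double)count / (double)total : 0.0;
}

/*
 * Hypothetical formatter: writes the three cross-LLC statistics into buf.
 * `total` is assumed to be the number of events the percentages are
 * relative to. Returns the snprintf() result.
 */
static int format_xllc_stats(char *buf, size_t len, uint64_t xllc_mig,
			     uint64_t xllc_skip, uint64_t sticky_skip,
			     uint64_t total)
{
	return snprintf(buf, len, "xllc_mig/skip/sticky_skip=%5.2f/%5.2f/%5.2f",
			pct(xllc_mig, total), pct(xllc_skip, total),
			pct(sticky_skip, total));
}
```

The third field is the share of events where `LSTAT_LLC_STICKY_SKIP` was incremented, that is, where a cross-LLC migration was prevented by the stickiness check.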

hodgesds commented Oct 24, 2025

This doesn't work 100% of the time because tasks can be forcefully migrated off their LLC DSQs during dispatch, but that may be acceptable. Added support at dispatch time using iterators.

Add a new configuration option `llc_sticky_runs` that controls how many
times a task must run on its current LLC before being allowed to migrate
to a different LLC. This provides finer-grained control over LLC locality
and complements the existing `xllc_mig_min_us` option.

The implementation tracks the number of consecutive runs on the current
LLC in the task context (`taskc->llc_runs`). When the scheduler attempts
to find an idle CPU and no idle CPUs are available on the current LLC,
it checks if the task has run fewer times than the configured threshold.
If so, cross-LLC migration is prevented, keeping the task sticky to its
current LLC to preserve cache locality.

Key changes:

- Add `llc_runs` counter to `task_ctx` to track consecutive runs on
  current LLC
- Add `llc_sticky_runs` configuration field to `layer` struct and
  `LayerConfig` in config.rs
- Increment `llc_runs` in `layered_running()` when stickiness is enabled
- Reset `llc_runs` to 0 in `maybe_update_task_llc()` on LLC migration
- Add stickiness check in `pick_idle_cpu()` to prevent cross-LLC
  migration when task hasn't run enough times on current LLC
- Add `LSTAT_LLC_STICKY_SKIP` statistic to track prevented migrations
- Initialize `llc_runs` to 0 in `layered_init_task()`

The feature is disabled by default (llc_sticky_runs = 0) and can be
configured per-layer. Example configuration:

  {
    "name": "batch",
    "kind": {
      "Confined": {
        "llc_sticky_runs": 20,
        "xllc_mig_min_us": 1000.0
      }
    }
  }

This would keep tasks on their current LLC for 20 runs before allowing
cross-LLC migration, providing stronger LLC affinity for workloads that
benefit from cache locality.

The new statistic appears in the output as:
  xllc_mig/skip/sticky_skip={xllc_migration%}/{xllc_skip%}/{llc_sticky_skip%}

Signed-off-by: Daniel Hodges <[email protected]>
@hodgesds

Added the dispatch side. Running stress-ng with --run-example shows sticky skips:

open_idle= 0.00 mig= 1.38 xnuma_mig= 0.10 xllc_mig/skip/sticky_skip= 0.36/ 0.00/ 0.01 affn_viol= 0.00
