Commit 9d2bf93

Merge pull request #2963 from hodgesds/scxtop-mcp
scxtop: Add Model Context Protocol (MCP) server
2 parents bd6de22 + eb134a9 commit 9d2bf93

26 files changed: +7644 -40 lines changed

tools/scxtop/CLAUDE_INTEGRATION.md

Lines changed: 441 additions & 0 deletions
Large diffs are not rendered by default.

tools/scxtop/MCP_INTEGRATIONS.md

Lines changed: 382 additions & 0 deletions
@@ -0,0 +1,382 @@
# scxtop MCP Server - Complete Integration Reference

This document provides a comprehensive overview of all MCP (Model Context Protocol) integrations available in the scxtop MCP server.

## Overview

The scxtop MCP server provides AI assistants with programmatic access to Linux scheduler metrics, BPF events, hardware topology, and performance profiling capabilities. It implements the MCP specification (protocol version 2024-11-05) and supports both one-shot queries and daemon mode with real-time event streaming.

## Server Information

- **Server Name**: `scxtop-mcp`
- **Version**: (from Cargo.toml)
- **Protocol Version**: `2024-11-05`
- **Modes**:
  - One-shot mode (stdio)
  - Daemon mode (continuous with event streaming)
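
An MCP session starts with a JSON-RPC `initialize` request that carries the protocol version listed above. The following is a minimal sketch using only the Rust standard library. The helper name `initialize_request` and the `clientInfo` values are hypothetical, and a real client would normally build the message with a JSON library instead of `format!`.

```rust
/// Build a JSON-RPC 2.0 `initialize` request for an MCP server.
/// `client_name` and `client_version` are illustrative placeholders;
/// they are inserted verbatim, so they must not contain quotes or backslashes.
fn initialize_request(id: u64, client_name: &str, client_version: &str) -> String {
    let client_info = format!(
        r#"{{"name":"{}","version":"{}"}}"#,
        client_name, client_version
    );
    // The protocol version matches the one stated in "Server Information".
    let params = format!(
        r#"{{"protocolVersion":"2024-11-05","capabilities":{{}},"clientInfo":{}}}"#,
        client_info
    );
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"initialize","params":{}}}"#,
        id, params
    )
}
```

The resulting string would be written as one line to the server's stdin, assuming the newline-delimited stdio transport.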

## Capabilities

### 1. Resources (17 URIs)

Resources are read-only data endpoints that provide access to system metrics and configuration.

#### Scheduler Resources

| URI | Description | Data Type |
|-----|-------------|-----------|
| `scheduler://current` | Currently active scheduler name, class (sched_ext or other), and state | JSON |
| `stats://scheduler/raw` | Raw JSON statistics from the scheduler's scx_stats framework | JSON |
| `stats://scheduler/scx` | Kernel-level sched_ext statistics and counters | JSON |

#### Topology Resources

| URI | Description | Data Type |
|-----|-------------|-----------|
| `topology://info` | Hardware topology including CPUs, cores, LLCs, NUMA nodes with IDs and mappings | JSON |

#### Aggregated Statistics

| URI | Description | Data Type |
|-----|-------------|-----------|
| `stats://aggregated/cpu` | Per-CPU statistics including utilization, frequency, scheduling metrics | JSON |
| `stats://aggregated/llc` | Statistics aggregated by last-level cache domain | JSON |
| `stats://aggregated/node` | Statistics aggregated by NUMA node | JSON |
| `stats://aggregated/dsq` | Dispatch queue statistics for sched_ext schedulers (latencies, depths, vtime) | JSON |
| `stats://aggregated/process` | Per-process scheduler statistics including runtime, vtime, layer info | JSON |

#### System-Wide Statistics

| URI | Description | Data Type |
|-----|-------------|-----------|
| `stats://system/cpu` | System-wide CPU utilization statistics and context switch rates | JSON |
| `stats://system/memory` | System memory statistics (total, free, cached, etc.) | JSON |
| `stats://system/network` | Network interface statistics | JSON |

#### Profiling and Events

| URI | Description | Data Type |
|-----|-------------|-----------|
| `events://perf` | List of all available perf tracepoint events organized by subsystem | JSON |
| `events://kprobe` | List of all available kernel functions for kprobe profiling | JSON |
| `bpf://programs` | Currently loaded BPF programs with runtime statistics | JSON |
| `profiling://perf/status` | Current perf profiling status (running/stopped, sample count, duration) | JSON |
| `profiling://perf/results` | Symbolized stack traces from perf profiling (kernel and userspace, top 50 symbols) | JSON |
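
Each resource in the tables above is addressed by a `scheme://path` URI and fetched with the MCP `resources/read` method. The sketch below shows one way a client might split such a URI and build the request. Both helper names are hypothetical, and the string formatting stands in for a proper JSON library.

```rust
/// Split a resource URI such as `stats://aggregated/cpu` into its scheme
/// (`stats`) and path (`aggregated/cpu`). Returns `None` if `://` is missing
/// or either part is empty.
fn split_resource_uri(uri: &str) -> Option<(&str, &str)> {
    let (scheme, path) = uri.split_once("://")?;
    if scheme.is_empty() || path.is_empty() {
        return None;
    }
    Some((scheme, path))
}

/// Build a JSON-RPC `resources/read` request for one resource URI.
/// The URI is inserted verbatim, so it must not contain quotes.
fn read_resource_request(id: u64, uri: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"resources/read","params":{{"uri":"{}"}}}}"#,
        id, uri
    )
}
```

Splitting on the scheme lets a client group resources (for example, everything under `stats`) without hard-coding the full list.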

#### Event Streaming (Daemon Mode Only)

| URI | Description | Data Type |
|-----|-------------|-----------|
| `events://stream` | Real-time stream of BPF scheduler events (requires daemon mode and subscription) | NDJSON |

**Supported Event Types** (when subscribed to `events://stream`):
- Scheduling: `sched_switch`, `sched_wakeup`, `sched_waking`, `sched_wakeup_new`, `sched_migrate_task`
- Process lifecycle: `fork`, `exec`, `exit`, `wait`
- System events: `softirq`, `ipi`, `cpuhp_enter`, `cpuhp_exit`, `hw_pressure`
- Profiling: `kprobe`, `perf_sample`
- Scheduler-specific: `sched_hang`, `sched_cpu_perf_set`
- Special: `mango_app`, `trace_started`, `trace_stopped`, `system_stat`, `sched_stats`
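
A stream consumer may want to route events by the categories listed above. The following sketch maps an event type name to its category. The `EventCategory` enum and `categorize` function are hypothetical client-side helpers and are not part of scxtop.

```rust
/// Categories taken from the "Supported Event Types" list.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum EventCategory {
    Scheduling,
    ProcessLifecycle,
    System,
    Profiling,
    SchedulerSpecific,
    Special,
}

/// Map an event type name to its category, or `None` if it is not listed.
fn categorize(event_type: &str) -> Option<EventCategory> {
    use EventCategory::*;
    Some(match event_type {
        "sched_switch" | "sched_wakeup" | "sched_waking" | "sched_wakeup_new"
        | "sched_migrate_task" => Scheduling,
        "fork" | "exec" | "exit" | "wait" => ProcessLifecycle,
        "softirq" | "ipi" | "cpuhp_enter" | "cpuhp_exit" | "hw_pressure" => System,
        "kprobe" | "perf_sample" => Profiling,
        "sched_hang" | "sched_cpu_perf_set" => SchedulerSpecific,
        "mango_app" | "trace_started" | "trace_stopped" | "system_stat"
        | "sched_stats" => Special,
        _ => return None,
    })
}
```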

### 2. Tools (6 Interactive Functions)

Tools are callable functions that perform queries or actions.

#### `query_stats`

Discover available statistics resources and how to query them.

**Parameters**:
- `stat_type` (optional): Filter by type: `cpu`, `llc`, `node`, `dsq`, `process`, `scheduler`, `system`

**Returns**: List of available resource URIs with descriptions and usage examples.

#### `get_topology`

Get detailed hardware topology with core/LLC/node mappings.

**Parameters**:
- `detail_level` (optional, default: `summary`): `summary` or `full`
  - `summary`: High-level counts and SMT status
  - `full`: Complete per-CPU, per-core, per-LLC, per-node details with frequencies and capacities
- `include_offline` (optional, default: `false`): Include offline CPUs

**Returns**: JSON object with topology information based on detail level.

#### `list_events`

List available profiling events filtered by subsystem.

**Parameters**:
- `subsystem` (**required**): Filter perf events by subsystem (e.g., `sched`, `irq`, `power`, `block`, `net`)
- `event_type` (optional, default: `perf`): `kprobe`, `perf`, or `all`

**Returns**: JSON object with filtered events, count, and subsystem information. On error, lists available subsystems.

**Example**:
```json
{
  "subsystem": "sched",
  "event_type": "perf"
}
```
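
The JSON objects shown in the tool examples are the `arguments` of an MCP `tools/call` request. As a rough sketch, a client could wrap them as follows. The helper name `call_tool_request` is hypothetical, and `arguments_json` is assumed to already be a valid JSON object.

```rust
/// Wrap tool arguments in a JSON-RPC `tools/call` request.
/// `arguments_json` must already be a serialized JSON object; it is
/// embedded as-is without validation.
fn call_tool_request(id: u64, tool: &str, arguments_json: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"tools/call","params":{{"name":"{}","arguments":{}}}}}"#,
        id, tool, arguments_json
    )
}
```

For instance, passing the `list_events` example above as `arguments_json` with `tool` set to `list_events` yields a complete request that can be sent over stdio.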

#### `start_perf_profiling`

Start perf profiling with stack trace collection and symbolization.

**Parameters**:
- `event` (optional, default: `hw:cpu-clock`): Event to profile
  - Hardware events: `hw:cpu-clock`
  - Software events: `sw:task-clock`
  - Tracepoints: `tracepoint:subsystem:event` (e.g., `tracepoint:sched:sched_switch`)
- `freq` (optional, default: `99`): Sampling frequency in Hz
- `cpu` (optional, default: `-1`): CPU to profile (-1 for all CPUs, specific CPU ID otherwise)
- `pid` (optional, default: `-1`): Process ID to profile (-1 for system-wide)
- `max_samples` (optional, default: `10000`): Maximum samples to collect (0 for unlimited)
- `duration_secs` (optional, default: `0`): Duration in seconds (0 for manual stop)

**Returns**: Confirmation with profiling configuration.
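
The documented defaults can be captured in a small client-side type so that callers only override what they need. This is a sketch: the struct `PerfProfilingParams` and its `to_json` method are hypothetical, and only the field names and default values come from the parameter list above.

```rust
/// Client-side mirror of the `start_perf_profiling` parameters.
#[derive(Debug, Clone, PartialEq)]
struct PerfProfilingParams {
    event: String,
    freq: u64,
    cpu: i32,           // -1 means all CPUs
    pid: i32,           // -1 means system-wide
    max_samples: u64,   // 0 means unlimited
    duration_secs: u64, // 0 means stop manually
}

impl Default for PerfProfilingParams {
    /// Defaults as documented in the parameter list.
    fn default() -> Self {
        Self {
            event: "hw:cpu-clock".to_string(),
            freq: 99,
            cpu: -1,
            pid: -1,
            max_samples: 10_000,
            duration_secs: 0,
        }
    }
}

impl PerfProfilingParams {
    /// Serialize to the JSON object expected as tool arguments.
    fn to_json(&self) -> String {
        format!(
            r#"{{"event":"{}","freq":{},"cpu":{},"pid":{},"max_samples":{},"duration_secs":{}}}"#,
            self.event, self.freq, self.cpu, self.pid, self.max_samples, self.duration_secs
        )
    }
}
```

A ten-second unlimited-sample run could then be written as `PerfProfilingParams { duration_secs: 10, max_samples: 0, ..Default::default() }`.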

**Example**:
```json
{
  "event": "hw:cpu-clock",
  "freq": 99,
  "duration_secs": 10,
  "max_samples": 0
}
```
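
The `event` string follows a colon-separated form (`hw:<name>`, `sw:<name>`, or `tracepoint:<subsystem>:<event>`). The sketch below shows how a client might validate such a string before sending it. It is an illustration of the documented syntax, not scxtop's own parser, and the `PerfEvent` type and `parse_event` function are hypothetical.

```rust
/// Parsed form of an `event` specification.
#[derive(Debug, PartialEq, Eq)]
enum PerfEvent {
    Hardware(String),
    Software(String),
    Tracepoint { subsystem: String, event: String },
}

/// Parse `hw:<name>`, `sw:<name>`, or `tracepoint:<subsystem>:<event>`.
fn parse_event(spec: &str) -> Result<PerfEvent, String> {
    let parts: Vec<&str> = spec.split(':').collect();
    match parts.as_slice() {
        ["hw", name] if !name.is_empty() => Ok(PerfEvent::Hardware(name.to_string())),
        ["sw", name] if !name.is_empty() => Ok(PerfEvent::Software(name.to_string())),
        ["tracepoint", subsystem, event] if !subsystem.is_empty() && !event.is_empty() => {
            Ok(PerfEvent::Tracepoint {
                subsystem: subsystem.to_string(),
                event: event.to_string(),
            })
        }
        // Anything else does not match the documented forms.
        _ => Err(format!("unrecognized event spec: {}", spec)),
    }
}
```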

#### `stop_perf_profiling`

Stop perf profiling and prepare results for retrieval.

**Parameters**: None

**Returns**: Status object with sample count, duration, and profiling state.

#### `get_perf_results`

Retrieve symbolized stack traces and top functions from perf profiling.

**Parameters**:
- `limit` (optional, default: `50`): Number of top symbols to return
- `include_stacks` (optional, default: `true`): Include full symbolized stack traces

**Returns**: JSON object with:
- Top symbols ranked by sample count with percentages
- Symbolized stack traces (if `include_stacks` is true)
- Kernel and userspace function names
- Sample statistics

**Example**:
```json
{
  "limit": 20,
  "include_stacks": true
}
```
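
To make "top symbols ranked by sample count with percentages" concrete, the sketch below ranks a list of sampled symbol names and applies a `limit`. It illustrates the general technique only. The function `top_symbols` and its input format are assumptions and do not reflect how scxtop stores samples internally.

```rust
use std::collections::HashMap;

/// Rank symbols by how often they were sampled.
/// Returns `(symbol, sample_count, percent_of_all_samples)`, at most `limit` entries.
fn top_symbols(samples: &[&str], limit: usize) -> Vec<(String, usize, f64)> {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for sym in samples {
        *counts.entry(*sym).or_insert(0) += 1;
    }
    let total = samples.len() as f64;
    let mut ranked: Vec<(String, usize, f64)> = counts
        .into_iter()
        .map(|(sym, n)| (sym.to_string(), n, 100.0 * n as f64 / total))
        .collect();
    // Highest count first; ties are broken by name for a deterministic order.
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked.truncate(limit);
    ranked
}
```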

### 3. Prompts (5 Guided Workflows)

Prompts are pre-defined analysis workflows that guide the AI through complex investigations.

#### `analyze_scheduler_performance`

Comprehensive scheduler performance analysis workflow.

**Arguments**:
- `focus_area` (optional): `latency`, `throughput`, `balance`, or `general`
  - `latency`: Focus on dispatch queue latencies, wakeup delays
  - `throughput`: Focus on context switch rates, CPU utilization, migration patterns
  - `balance`: Focus on load distribution across CPUs, LLCs, NUMA nodes
  - `general`: Comprehensive overview of all aspects

**Returns**: Detailed workflow instructions for analyzing scheduler performance based on focus area.

#### `debug_high_latency`

Debug high scheduling latency issues with step-by-step investigation.

**Arguments**:
- `pid` (optional): Process ID to investigate (if not specified, system-wide analysis)

**Returns**: Workflow for identifying latency bottlenecks, analyzing wakeup patterns, checking hardware factors, and suggesting remediation.

#### `analyze_cpu_imbalance`

Analyze CPU load imbalance and migration patterns.

**Arguments**: None

**Returns**: Workflow for measuring imbalance severity, understanding topology, identifying migration patterns, analyzing task characteristics, and determining root causes.

#### `investigate_scheduler_behavior`

Deep dive into scheduler behavior and policies.

**Arguments**:
- `scheduler_name` (optional): Specific scheduler to analyze (e.g., `scx_rusty`, `scx_lavd`)

**Returns**: Workflow for examining dispatch queue behavior, monitoring scheduling decisions, analyzing task placement patterns, and comparing against expected behavior.

#### `summarize_system`

Comprehensive system and scheduler summary.

**Arguments**: None

**Returns**: Workflow for gathering complete system overview including hardware topology, active scheduler, system-wide statistics, resource distribution, top processes, and available monitoring capabilities.
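
Prompts are retrieved with the MCP `prompts/get` method, whose `arguments` are string key/value pairs. A minimal sketch of building such a request follows. The helper name `get_prompt_request` is hypothetical, and keys and values are inserted verbatim without JSON escaping.

```rust
/// Build a JSON-RPC `prompts/get` request.
/// `args` holds prompt arguments as string pairs, e.g. `("focus_area", "latency")`.
fn get_prompt_request(id: u64, prompt: &str, args: &[(&str, &str)]) -> String {
    let arguments: Vec<String> = args
        .iter()
        .map(|(k, v)| format!(r#""{}":"{}""#, k, v))
        .collect();
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"prompts/get","params":{{"name":"{}","arguments":{{{}}}}}}}"#,
        id,
        prompt,
        arguments.join(",")
    )
}
```

Prompts that take no arguments, such as `summarize_system`, can pass an empty slice, which produces an empty `arguments` object.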

## Usage Examples

### Claude Desktop Configuration

Add to `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "scxtop": {
      "command": "/path/to/scxtop",
      "args": ["--mcp"]
    }
  }
}
```

### Query Examples

**Basic resource read**:
```
"What scheduler is currently running?"
→ Claude reads: scheduler://current
```

**Using tools**:
```
"Show me the hardware topology"
→ Claude calls: get_topology with detail_level="full"
```

**Using prompts**:
```
"Analyze scheduler latency issues"
→ Claude invokes: analyze_scheduler_performance prompt with focus_area="latency"
```

**Profiling workflow**:
```
"Profile the system and show me the hottest kernel functions"
→ Claude calls: start_perf_profiling with event="hw:cpu-clock", freq=99
→ (waits or sets duration)
→ Claude calls: stop_perf_profiling
→ Claude calls: get_perf_results with limit=20, include_stacks=true
→ Claude analyzes and presents results
```

**Event monitoring (daemon mode)**:
```
"Monitor scheduling events and alert me if you see high latency"
→ Claude subscribes to: events://stream
→ Claude filters sched_switch events for dsq_lat_us > 1000
→ Claude reports anomalies in real-time
```
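
The event-monitoring example filters `sched_switch` events on `dsq_lat_us > 1000`. A consumer that has already decoded the stream could apply that filter as sketched below. The `SchedSwitch` struct is hypothetical: only the `dsq_lat_us` field name appears in this document, and the other fields are placeholders.

```rust
/// Decoded `sched_switch` event. Only `dsq_lat_us` is named in the text above;
/// `cpu` and `next_pid` are illustrative placeholders.
#[derive(Debug, Clone, PartialEq)]
struct SchedSwitch {
    cpu: u32,
    next_pid: u32,
    dsq_lat_us: u64,
}

/// Return the events whose dispatch queue latency exceeds `threshold_us`.
fn high_latency(events: &[SchedSwitch], threshold_us: u64) -> Vec<&SchedSwitch> {
    events
        .iter()
        .filter(|e| e.dsq_lat_us > threshold_us)
        .collect()
}
```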

## Implementation Details

### File Organization

- `src/mcp/server.rs` - Main MCP server implementation and request handling
- `src/mcp/resources.rs` - Resource registration and data retrieval
- `src/mcp/tools.rs` - Tool implementations
- `src/mcp/prompts.rs` - Workflow prompt definitions
- `src/mcp/protocol.rs` - MCP protocol types and structures
- `src/mcp/events.rs` - BPF event to MCP event conversion
- `src/mcp/bpf_stats.rs` - BPF program statistics collector
- `src/mcp/perf_profiling.rs` - Perf profiling engine with symbolization

### Resource Handler Registration

Resources are registered with closures that capture necessary state (e.g., topology, BPF stats collector):

```rust
self.resources.register_handler("topology://info".to_string(), move || {
    Ok(serde_json::json!({
        "nr_cpus": topo.all_cpus.len(),
        // ... topology data
    }))
});
```
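
The excerpt above depends on types defined elsewhere in the commit. To show the pattern in isolation, here is a self-contained sketch of a URI-keyed handler registry using only the standard library. The `Resources` struct, its `read` method, and the use of `String` in place of `serde_json::Value` are simplifying assumptions, not the actual implementation.

```rust
use std::collections::HashMap;

/// A handler produces the resource body (here a JSON string) or an error message.
type Handler = Box<dyn Fn() -> Result<String, String>>;

/// Minimal registry mapping resource URIs to handler closures.
#[derive(Default)]
struct Resources {
    handlers: HashMap<String, Handler>,
}

impl Resources {
    /// Store a closure under a URI; the closure may capture state by move.
    fn register_handler<F>(&mut self, uri: String, handler: F)
    where
        F: Fn() -> Result<String, String> + 'static,
    {
        self.handlers.insert(uri, Box::new(handler));
    }

    /// Look up the handler for `uri` and invoke it.
    fn read(&self, uri: &str) -> Result<String, String> {
        match self.handlers.get(uri) {
            Some(handler) => handler(),
            None => Err(format!("unknown resource: {}", uri)),
        }
    }
}
```

Boxing the closures lets handlers with different captured state share one map, at the cost of a dynamic call per read.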

### Event Streaming Architecture

1. MCP server creates an unbounded channel
2. Resources module holds the sender
3. Main scxtop BPF event loop pushes events via `push_event()`
4. Events are converted to JSON and streamed to the client
5. Client subscribes to `events://stream` resource
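
The sender/receiver split described in the steps above can be sketched with the standard library's unbounded `mpsc` channel. This is only an illustration of the pattern: the `EventStream` type, the `push_event` signature, and the JSON fields are assumptions, and the real server may use a different channel implementation.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Holds the sending half of the channel, as the resources module does in step 2.
struct EventStream {
    tx: Sender<String>,
}

impl EventStream {
    /// Create the unbounded channel (step 1) and hand back the receiving half.
    fn new() -> (Self, Receiver<String>) {
        let (tx, rx) = channel();
        (Self { tx }, rx)
    }

    /// Convert an event to one JSON line and queue it (steps 3 and 4).
    /// Returns `false` if the receiver has been dropped.
    fn push_event(&self, event_type: &str, cpu: u32) -> bool {
        let line = format!(r#"{{"type":"{}","cpu":{}}}"#, event_type, cpu);
        self.tx.send(line).is_ok()
    }
}
```

Because the channel is unbounded, `push_event` never blocks the producer; a slow consumer instead causes the queue to grow.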

## Daemon Mode vs One-Shot Mode

### One-Shot Mode
```bash
scxtop --mcp
```
- Processes one MCP request cycle
- No event streaming
- Exits after `initialize` and the first request
- Suitable for CLI usage with Claude Code

### Daemon Mode
```bash
scxtop --mcp-daemon
```
- Runs continuously
- Enables the `events://stream` resource
- Streams BPF events in real time
- Suitable for long-running Claude Desktop sessions
- Allows monitoring and proactive analysis
340+
341+
## Statistics Collection
342+
343+
The MCP server integrates with scxtop's existing statistics infrastructure:
344+
345+
- **scx_stats framework**: Reads raw scheduler statistics from sched_ext schedulers
346+
- **BPF programs**: Collects per-CPU, per-process, per-DSQ metrics
347+
- **System stats**: Reads `/proc` and `/sys` for system-wide metrics
348+
- **BPF program stats**: Monitors loaded BPF programs via bpffs
349+
- **Perf profiling**: Uses perf_event_open() for stack trace collection
350+
- **Symbolization**: Resolves kernel and userspace addresses to function names
351+
352+
## Security Considerations
353+
354+
- Requires root or CAP_BPF/CAP_PERFMON capabilities
355+
- Accesses /sys/kernel/debug/tracing (tracefs)
356+
- Reads /proc filesystem
357+
- Attaches BPF programs to tracepoints
358+
- Can profile system-wide or specific processes
359+
- No authentication mechanism (local stdio only)
360+
361+
## Performance Impact
362+
363+
- **Resource reads**: Minimal impact, reads from in-memory stats
364+
- **Event streaming**: Low overhead, BPF programs filter in kernel
365+
- **Perf profiling**: Configurable sampling rate (default 99 Hz)
366+
- **Symbolization**: Performed in userspace, cached for efficiency
367+
368+
## Future Enhancements
369+
370+
Potential areas for expansion:
371+
- Additional resource types (disk I/O, interrupts)
372+
- More granular event filtering
373+
- Historical data retention and querying
374+
- Flamegraph generation
375+
- Scheduler parameter tuning via tools
376+
- Integration with additional profiling tools (eBPF-based)
377+
378+
## References
379+
380+
- MCP Specification: https://spec.modelcontextprotocol.io/
381+
- scxtop documentation: README.md, CLAUDE_INTEGRATION.md
382+
- sched_ext documentation: https://github.com/sched-ext/scx
