
System tracing provider for tracing via native probes (BPF) #1288

Draft
wants to merge 11 commits into master

Conversation

@lc525 (Member) commented Jul 7, 2023

This PR introduces a new tracing provider, enabling dynamic attachment of native (BPF, Systemtap) probes at runtime.
The goal of the provider is to allow correlation between MLServer-specific events and operating system behaviour (system load, performance, resource usage, etc).

Approach

The provider exposes a number of tracepoints (hook functions without any code attached), which can be triggered (fired)
when particular application-level events happen (e.g. a model gets loaded/unloaded, an inference request joins a queue, etc).

At runtime, external probes (BPF programs, Systemtap scripts) can be attached to those tracepoints to perform tracing actions (measurements, in-kernel data aggregation, correlating MLServer context with OS context, etc.).

The underlying implementation creates and dynamically links a native shared library where the tracepoint hooks exist as functions containing just a couple of nop instructions. When external probes are attached, the code of those tracepoints is modified at runtime to jump into the tracing code.

Features

  • Completely optional feature, with the ability to enable it via the tracepoints extra
  • The exact tracepoints exposed for external probing are configurable via settings
  • Near-zero overhead when not in use (no external probes attached). Because tracepoints look like normal functions from the perspective of Python code and receive arguments that may be expensive to compute, the provider offers a way to compute tracepoint arguments conditionally, based on whether an external probe is currently attached to that tracepoint
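A minimal pure-Python sketch of that conditional-argument idea (class and method names here are illustrative, not the provider's real API):

```python
# Sketch: defer argument computation until we know a probe is attached.
class Tracepoint:
    def __init__(self, name):
        self.name = name
        self.enabled = False  # flipped when an external probe attaches

    def fire(self, *args):
        if self.enabled:
            pass  # hand args to the native probe here


class Provider:
    def __init__(self):
        self.tracepoints = {}

    def register(self, name):
        tp = self.tracepoints[name] = Tracepoint(name)
        return tp

    def fire_lazy(self, name, make_args):
        """Only call make_args() (which may be expensive) if a probe is attached."""
        tp = self.tracepoints[name]
        if tp.enabled:
            tp.fire(*make_args())


provider = Provider()
provider.register("inference_begin")
# No probe attached: the lambda is never evaluated, so the hypothetical
# compute_request_summary() is never called and overhead stays near zero.
provider.fire_lazy("inference_begin", lambda: (compute_request_summary(),))
```

Passing a callable instead of precomputed values is what keeps the disabled path cheap: the cost of the check is a dictionary lookup and a branch.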

Introduced dependencies

  • When the tracepoints extra is enabled, MLServer requires additional dependencies: linux-usdt/python-stapsdt and the native linux-usdt/libstapsdt library
  • MLServer will continue to work as normal even if the extra is enabled but those dependencies are not met
  • The Dockerfile has been updated to allow building container images with the system tracing dependencies installed

TODOs

  • Add instrumentation to the inference request path (including for batched requests)
  • Stabilise tracepoint arguments
  • Write documentation example for simple usage via bpftrace
  • Move to "off-by-default" settings in settings/Dockerfile (currently left enabled for testing)

@adriangonz (Contributor) left a comment

This looks amazing @lc525! 🚀

Conscious this is still a WIP draft, but I've added a few comments and some questions below. Happy to chat about how you foresee this being used!

mlserver/types/__init__.py (resolved)
Comment on lines 177 to 186
ArgTypeMap._prototypes = {
    Tracepoint.model_load_begin: ArgTypeMap.model_args,
    Tracepoint.model_load_end: ArgTypeMap.model_args,
    Tracepoint.model_reload_begin: ArgTypeMap.model_args_reload,
    Tracepoint.model_reload_end: ArgTypeMap.model_args,
    Tracepoint.model_unload: ArgTypeMap.model_args,
    Tracepoint.inference_enqueue_req: ArgTypeMap.queue_args,
    Tracepoint.inference_begin: ArgTypeMap.infer_args,
    Tracepoint.inference_end: ArgTypeMap.infer_args,
}
Contributor:

Thinking out loud here, but the tracepoints and args seem very coupled with the implementation - would it be possible to make those more generic?

My main worry would be whether a small change into any of those implementations (e.g. changing the args for inference) could then cause any side effects around system tracing.

Member Author (lc525):

In a way, they do end up being quite coupled. However, the prototypes essentially become the public interface that external probes use to run their code. Any change to the arguments, beyond simply adding new ones at the end of the list, would require changes to whatever external BPF probes people have written against the interface. So we will need to stabilise them at some point, after we write more actual probing code.

The idea is to pass to probes the minimal information that is required for meaningful tracing applications. Hopefully, this information remains relatively stable even as MLServer evolves. For example, the actual inference functions might change without us changing the data passed for tracing via the same Tracepoints.

That being said, the _prototypes defined here do not end up being "enforced". Say we add a new argument to the inference_begin tracepoint without changing any of the tp_* functions in the provider. All existing code will continue to work, because the function for triggering the tracepoint (Probe.fire) receives a variable number of arguments. We can then selectively call the tracepoint with additional arguments by directly using SystemTracingProvider.__call__(tracepoint, *args) (I don't necessarily advocate that this is a sane thing to do, as the code might become confusing in the long term with respect to which arguments should be passed to which tracepoints).

We can discuss further, perhaps there is a better design to be found here.
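The forward-compatibility point above can be shown with a toy example (class name illustrative, standing in for the provider's Probe.fire behaviour): because firing takes *args, adding a trailing argument at a call site does not break consumers that only read the arguments they know about.

```python
# Toy illustration: variable-arity firing tolerates prototype growth.
class Probe:
    def __init__(self):
        self.seen = []

    def fire(self, *args):
        # the firing path forwards however many arguments were supplied
        self.seen.append(args)


probe = Probe()
probe.fire("model-a", 1)           # original prototype, e.g. (name, version)
probe.fire("model-a", 1, "extra")  # extended prototype: old consumers ignore the tail
```

The flip side, as noted, is that nothing checks a call site against the declared prototype, so mismatched arguments fail silently rather than loudly.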

Contributor:

Thanks for adding that extra context.

Do you have an example of how it looks on the "other end", i.e. where those probes are used?

tests/testdata/settings-tracing.json (resolved)
mlserver/sys_tracing/stapsdt_stub.py (resolved)
mlserver/registry.py (resolved)
Dockerfile (resolved)
Dockerfile (outdated):
@@ -107,7 +169,12 @@ RUN . $CONDA_PATH/etc/profile.d/conda.sh && \
pip install $_wheel --constraint ./dist/constraints.txt; \
done \
fi && \
pip install $(ls "./dist/mlserver-"*.whl) --constraint ./dist/constraints.txt && \
if [[ ${#_extras[@]} -gt 0 ]]; then \
extras_list=$(IFS=, ; echo "${_extras[@]}") && \
Contributor:
Following the same reasoning as the other comment, unless we have some reason to avoid it, I'd install all the extras in the Dockerfile - that way the seldonio/mlserver image is always ready as-is to enable tracepoints.

Member Author (lc525):

Happy to do that if the decision (given the above comments) is to install all extras. I don't know if pip has a simple way of doing that here; I might still have to build a list.

Contributor:

We can also see if Poetry supports an all-extras target? Otherwise we can always add one.

Contributor:

(and / or just hardcode a pip install mlserver[extras1, extras2, etc.])

Member Author (lc525):

Poetry currently supports --all-extras on install but not export (this is tracked in python-poetry/poetry-plugin-export#45). I've added an everything extra. I've intentionally avoided the name all as that might be reserved in the future by poetry/pip, but we can move to it once there is enough support.
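For illustration only, such an aggregate extra might look roughly like this in pyproject.toml (the dependency names here are hypothetical placeholders, not MLServer's actual extras):

```toml
# Hypothetical pyproject.toml fragment: an `everything` extra that lists every
# optional dependency explicitly, pending Poetry export support for --all-extras.
[tool.poetry.extras]
tracepoints = ["stapsdt"]
everything = ["stapsdt", "..."]
```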

Contributor:

TBH I would just use all as it's more standard. If poetry decides to always add one automatically we can just remove it and use Poetry's built-in target.

Dockerfile (resolved)
lc525 added 10 commits July 12, 2023 11:09
- Allows MLServer to expose native tracepoints that fire on
  application events; BPF/Systemtap probes can be attached at
  runtime to those tracepoints, for performing measurements or
  for linking application activity to OS behaviour (resource
  consumption and contention, performance variations)
- Near-zero overhead when not in use or external probes are not
  attached
- The exposed tracepoints are configured via MLServer settings
- Adds (optional) dependencies on
   * linux-usdt/python-stapsdt
   * linux-usdt/libstapsdt (native library)
Also update license files to contain dependencies for the `tracepoints`
extra
- Only allow tracepoints to be enabled/disabled as a whole rather than
  having per-tracepoint options;
- Refactor code so that all tracepoint/provider code resides in the same
  module (this has been facilitated by the simplified settings)
- Update code for Python 3.8+ (use explicit Union types rather than |)
@CLAassistant commented May 22, 2024

CLA assistant check
All committers have signed the CLA.

3 participants