Python Check Support #277

scottopell · 2024-10-21T21:29:00Z

Continuation of #48

To get started:

# Generate sample Python checks if you don't have any defined on your local filesystem
# This also sets up a minimal virtualenv with the check dependencies
make gen-testing-pychecks
# Activate the venv
source ./checks_venv/bin/activate
# Run ADP
DD_API_KEY="foo00001" make run-adp-standalone

Another option is to use the 'converged' image, which runs ADP inside the standard Docker image. It attempts to run all default checks out of the box, which is useful for testing.

make build-datadog-agent-image
docker run -d -it -e DD_USE_DOGSTATSD=false -e DD_HOSTNAME=adp -e DD_API_KEY="foo00001" -e DD_LOG_LEVEL=debug --name adp --rm saluki-images/datadog-agent:latest

# To get a structured view of which checks are running and which are not
docker exec adp curl localhost:9999 | jq

pr-commenter · 2024-10-21T22:03:29Z

Regression Detector (DogStatsD)

Regression Detector Results

Run ID: 93705546-53b9-4fc8-aefb-5d50856b32bd

Baseline: 7.55.2
Comparison: 7.55.3

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

No significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI	trials
➖	dsd_uds_1mb_50k_contexts_memlimit	ingress throughput	+0.05	[+0.01, +0.09]	1
➖	dsd_uds_1mb_3k_contexts	ingress throughput	+0.02	[-0.01, +0.04]	1
➖	dsd_uds_512kb_3k_contexts	ingress throughput	+0.02	[-0.03, +0.07]	1
➖	dsd_uds_10mb_3k_contexts	ingress throughput	+0.01	[-0.01, +0.03]	1
➖	dsd_uds_100mb_3k_contexts	ingress throughput	+0.00	[-0.02, +0.02]	1
➖	dsd_uds_1mb_50k_contexts	ingress throughput	-0.00	[-0.00, +0.00]	1
➖	dsd_uds_100mb_250k_contexts	ingress throughput	-0.00	[-0.04, +0.03]	1
➖	dsd_uds_500mb_3k_contexts	ingress throughput	-0.00	[-0.01, +0.00]	1
➖	dsd_uds_100mb_3k_contexts_distributions_only	memory utilization	-0.78	[-0.94, -0.61]	1

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we flag a change in performance as a "regression" (a change worth investigating further) if all of the following criteria are true:

Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
Its configuration does not mark it "erratic".
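
The three criteria above can be sketched as a small predicate. This is only an illustration of the rule as stated in this report, not the regression detector's actual code: the `ExperimentResult` type, the `is_regression` function, and the `erratic` field are hypothetical names, and the sample rows are copied from the tables in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentResult:
    """One row of the report table (hypothetical type, for illustration only)."""
    name: str
    delta_mean_pct: float   # the "Δ mean %" column
    ci_low: float           # lower bound of "Δ mean % CI"
    ci_high: float          # upper bound of "Δ mean % CI"
    erratic: bool = False   # assumption: set from the experiment's configuration


def is_regression(result: ExperimentResult, effect_size_tolerance: float = 5.0) -> bool:
    """Apply the three criteria from the explanation above."""
    big_enough = abs(result.delta_mean_pct) >= effect_size_tolerance
    ci_excludes_zero = result.ci_low > 0.0 or result.ci_high < 0.0
    return big_enough and ci_excludes_zero and not result.erratic


rows = [
    ExperimentResult("dsd_uds_1mb_3k_contexts", 0.02, -0.01, 0.04),
    ExperimentResult("dsd_uds_100mb_3k_contexts_distributions_only", -0.78, -0.94, -0.61),
    ExperimentResult("pycheck_lots_of_tags", 4674.28, 4508.71, 4839.85),
]

for row in rows:
    print(row.name, is_regression(row))
```

The second row shows why both numeric criteria are needed: its confidence interval excludes zero, but the effect is well under the 5.00% tolerance, so it is not flagged.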


datadog-datadog-prod-us1 · 2024-10-25T16:52:05Z

test/smp/regression/saluki/cases/pycheck_lots_of_tags/datadog-agent/checks.d/my-check.py

+
+    individual_tags = []
+    def generate_tag(tag_length):
+        if rng.random() >= unique_tagset_ratio and len(individual_tags) != 0:


🔴 Code Vulnerability

Do not use random

Make sure to use values that are actually random. The random module in Python should generally be avoided and replaced with the secrets module, as noted in the official Python documentation.

Learn More

CWE-330

Python random module documentation

Python secrets module documentation
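
For context on this finding: the Python documentation warns against using `random` for security purposes and points to `secrets` for those cases, while a seeded `random.Random` is the usual choice when a workload must be reproducible. The sketch below contrasts the two. It assumes, without confirmation from the diff, that `rng` in the check is a `random.Random` instance; `make_workload_tags` and `make_api_token` are hypothetical helpers that do not appear in the PR. Whether the finding applies to a benchmark fixture is a call for the reviewers.

```python
import random
import secrets
import string


def make_workload_tags(seed: int, count: int, tag_length: int) -> list[str]:
    """Generate synthetic tags for a benchmark; the same seed yields the same tags."""
    rng = random.Random(seed)  # private generator, does not touch global state
    alphabet = string.ascii_lowercase
    return [
        "".join(rng.choice(alphabet) for _ in range(tag_length))
        for _ in range(count)
    ]


def make_api_token(num_bytes: int = 16) -> str:
    """Generate a value that must be unpredictable, such as a token."""
    return secrets.token_hex(num_bytes)  # 2 hex characters per byte


first_run = make_workload_tags(seed=42, count=3, tag_length=8)
second_run = make_workload_tags(seed=42, count=3, tag_length=8)
print(first_run == second_run)  # reproducible across runs
print(len(make_api_token()))
```

Reproducibility matters for a regression experiment because both the baseline and the comparison variant should see the same generated load.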

datadog-datadog-prod-us1 · 2024-10-25T16:52:05Z

test/smp/regression/saluki/cases/pycheck_lots_of_tags/datadog-agent/checks.d/my-check.py

+            # For each metric that gets submitted, choose a tagset at random
+            # This will average out to
+            # contexts = len(tag_sets) as long as num_metrics is greater than num_tagsets
+            self.gauge('hello.world', rng.random() * 1000, tags=rng.choice(tag_sets))


🔴 Code Vulnerability

Do not use random

Make sure to use values that are actually random. The random module in Python should generally be avoided and replaced with the secrets module, as noted in the official Python documentation.

Learn More

CWE-330

Python random module documentation

Python secrets module documentation
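
The code comment in the snippet above says the number of contexts averages out to `len(tag_sets)` when `num_metrics` exceeds the number of tagsets. A quick way to check how close that is for a single check run: when n tagsets are drawn uniformly with replacement from k, the expected number of distinct tagsets is k * (1 - (1 - 1/k)^n). The sketch below assumes uniform selection, as `rng.choice` provides; `expected_distinct` and `simulate_distinct` are hypothetical helpers, not part of the PR.

```python
import random


def expected_distinct(num_tagsets: int, num_metrics: int) -> float:
    """Expected distinct tagsets after num_metrics uniform draws with replacement."""
    miss_probability = (1.0 - 1.0 / num_tagsets) ** num_metrics
    return num_tagsets * (1.0 - miss_probability)


def simulate_distinct(num_tagsets: int, num_metrics: int, seed: int = 0) -> int:
    """Empirical count for one simulated run, using integers as stand-in tagsets."""
    rng = random.Random(seed)
    tag_sets = list(range(num_tagsets))
    return len({rng.choice(tag_sets) for _ in range(num_metrics)})


for num_metrics in (100, 500, 1000):
    print(num_metrics, round(expected_distinct(100, num_metrics), 2))
```

Under these assumptions, 100 metrics over 100 tagsets cover only about 63 distinct tagsets in one run, so a single run approaches `len(tag_sets)` contexts only when `num_metrics` is several times larger than the number of tagsets. Repeated runs that draw fresh choices each time would still cover the full set over time.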

pr-commenter · 2024-10-25T17:13:17Z

Regression Detector (Saluki)

Regression Detector Results

Run ID: cb467c25-16cf-447c-b61d-9fb9b58a954e

Baseline: 95d7b85
Comparison: aaea47e

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

Significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

perf	experiment	goal	Δ mean %	Δ mean % CI	trials	links
❌	pycheck_lots_of_tags	% cpu utilization	+4674.28	[+4508.71, +4839.85]	1

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI	trials	links
❌	pycheck_lots_of_tags	% cpu utilization	+4674.28	[+4508.71, +4839.85]	1


GeorgeHahn · 2024-10-25T21:01:46Z

lib/saluki-components/src/sources/checks/python_exposed_modules.rs

+    match module.getattr("SUBMISSION_QUEUE") {
+        Ok(py_item) => match py_item.extract::<Py<python_scheduler::PythonSenderHolder>>() {
+            Ok(q) => {
+                let res = pyo3::Python::with_gil(|py| q.bind_borrowed(py).borrow_mut().sender.clone());


Nit: I don't think the GIL needs to be acquired here. See https://pyo3.rs/main/performance#access-to-bound-implies-access-to-gil-token

scottopell · Oct 28, 2024

Ah, you're right. That lets me get rid of this sender.clone as well. Good spot!

pr-commenter · 2024-10-28T21:04:16Z

Regression Detector Links

Experiment Result Links

experiment	link(s)
pycheck_lots_of_tags (ADP only)	[Profiling (ADP)] [SMP Dashboard]
dsd_uds_100mb_250k_contexts (DSD only)	[Profiling (DSD)] [SMP Dashboard]
dsd_uds_100mb_3k_contexts (DSD only)	[Profiling (DSD)] [SMP Dashboard]
dsd_uds_100mb_3k_contexts_distributions_only (DSD only)	[Profiling (DSD)] [SMP Dashboard]
dsd_uds_10mb_3k_contexts (DSD only)	[Profiling (DSD)] [SMP Dashboard]
dsd_uds_1mb_3k_contexts (DSD only)	[Profiling (DSD)] [SMP Dashboard]
dsd_uds_1mb_50k_contexts (DSD only)	[Profiling (DSD)] [SMP Dashboard]
dsd_uds_1mb_50k_contexts_memlimit (DSD only)	[Profiling (DSD)] [SMP Dashboard]
dsd_uds_500mb_3k_contexts (DSD only)	[Profiling (DSD)] [SMP Dashboard]
dsd_uds_512kb_3k_contexts (DSD only)	[Profiling (DSD)] [SMP Dashboard]

github-actions bot added the area/core (Core functionality, event model, etc.) and area/components (Sources, transforms, and destinations.) labels Oct 21, 2024

scottopell mentioned this pull request Oct 21, 2024

Support checks #48

Closed

6 tasks

github-actions bot added the area/ci (CI/CD, automated testing, etc.) label Oct 21, 2024

scottopell added 8 commits October 22, 2024 15:14

Squashed previous work for simpler rebase

0a1c735

Addresses remaining build errors

e89505a

Adds getting-started make action and readme notes for python checks

8565559

An attempt to get modern cpython installed in build-image

0024157

Updates (all?) images to copy in libs from dd-agent image

84e6bb4

Add embedded files into adp image and move embedded-copy later in the build-image steps

4c54c58

Use upgraded base images that provide agent 'embedded' libs

5cc8bf9

run of make fmt

f5b5ce3

scottopell force-pushed the sopell/checks-cont branch from c0c490e to f5b5ce3 Compare October 22, 2024 15:14

scottopell and others added 4 commits October 22, 2024 15:16

Allow wasmtime deps with modified apache2 license

ae46f9c

minor improvements to the make gen-testing-pychecks target

3c4e0f3

fmt and remove wasmtime (unused)

a1d43b0

Adds correct libraries, adds s6 service for ADP

103e467

github-actions bot added the area/config (Configuration.) label Oct 22, 2024

scottopell added 11 commits October 23, 2024 17:26

Add support for building agent-data-plane docker image with 'dev' build profile

db0706a

Refactors the way checks are initialized, new behavior should be more correct

9e4cfca

Sets check_id upon successful init

87c7781

Extends liveness api to support arbitrary json details, records loaded state of each incoming check

172c3cf

move check dispatcher into its own file

5116590

Removes un-needed 'Runnable' abstraction only used for local python code

f30ed0b

Adds more detailed check-instance output

f5e1fed

Log errors during check code import

7e20535

Ignores all auto_conf yamls and lowers verbosity of check-metric log

c11072c

Adds required check libraries and defaults to adp image

c5770ad

Adds missing s6 service defs referenced in converged-agent image

a640691

scottopell added 13 commits October 24, 2024 20:58

Updates pyo3 to latest public release

ac87cfa

Fix noop-health

2624bee

Debug why this is failing in CI

45d66c5

Updates license metadata

e10623e

Newer build/smp image

51dbdbe

Fix curl

bd25d7c

use CI config from branch to build baseline image, required for unclear reasons related to new build-image

7f64c1c

Fixes formatting

baefc10

Fix checkout ref for baseline build

c752bb7

Fix quoting in gitlab benchmark def

4eb5781

avoid globs, focus on dirs

0c6d273

Support custom check codes in checks.d dir

9f0c00f

Adds pycheck focused smp experiment

90d8891

datadog-datadog-prod-us1 bot reviewed Oct 25, 2024


scottopell added 2 commits October 25, 2024 20:44

Remove dsd-experiments

8136eb4

Enables integration-profiling

6ef64d8

GeorgeHahn reviewed Oct 25, 2024


Use existing borrowed py rather than re-acquiring the gil

aaea47e
