-
-
Notifications
You must be signed in to change notification settings - Fork 247
pmdaopenmetrics metric removal qa bug fix #2242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
pmdaopenmetrics metric removal qa bug fix #2242
Conversation
eebf0b8
to
7267dfe
Compare
split_metric = name.split(".") | ||
stripped_metric = name.replace("openmetrics.%s." % self.name, "") | ||
self.pmda.debug("stripped metric --> %s" % stripped_metric) if self.pmda.dbg else None | ||
if stripped_metric == "metric2": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two lines for checking for "metric2" look to be development code that should be removed now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lmchilton the debug lines are still there BTW, marking 'unresolved' for now.
fname = os.path.join(path, f) | ||
with open(fname, 'r') as name: | ||
f_path = name.readline().strip() | ||
if f_path.startswith("file:///"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lmchilton @wcohen I think this approach may be incorrect - why keep mtime on these files? We cannot treat files differently to a curl response from an endpoint. The important part is whether the metrics are still there in the response (whether a file or http endpoint was used) - there should be no need to check filesystem state like this, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should suffice for QA (patch below). It doesn't seem to work correctly yet - I think we still have some issues in the PMDA code to resolve - but this removes metric2 (without changing config file timestamps) and reinstates that metric2. Once this works as shown in the .out change, we can have confidence in this PMDA code change.
$ git diff
qa/1976
--- /tmp/git-blob-7FfOb9/1976 2025-08-11 13:57:40.423237960 +1000
+++ qa/1976 2025-08-11 13:57:39.206463438 +1000
@@ -77,6 +77,16 @@ $sudo touch -t 197001010000 $PCP_PMDAS_D
pminfo openmetrics.simple_metric
echo
+echo "-- remove a metric from an existing source --"
+sed -e '/metric2/d' -i $tmp.simple_metric.txt
+pminfo openmetrics.simple_metric
+echo
+
+echo "-- reinstate metric for an existing source --"
+cp $here/openmetrics/samples/simple_metric.txt $tmp.simple_metric.txt
+pminfo openmetrics.simple_metric
+echo
+
_pmdaopenmetrics_remove >/dev/null 2>&1
# success, all done
qa/1976.out
--- /tmp/git-blob-B2e7Ur/1976.out 2025-08-11 14:03:04.574975077 +1000
+++ qa/1976.out 2025-08-11 14:03:00.664966764 +1000
@@ -15,3 +15,10 @@ Error: openmetrics.simple_metric: Unknow
openmetrics.simple_metric.metric2
openmetrics.simple_metric.metric1
+-- remove a metric from an existing source --
+openmetrics.simple_metric.metric1
+
+-- reinstate metric for an existing source --
+openmetrics.simple_metric.metric2
+openmetrics.simple_metric.metric1
+
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@natoscott @wcohen I just pushed an update with a working / simplifed qa #1976 test.
Regarding the added code in function [traverse] keeping track of files mtime, the reasoning behind this is to signal a change in configuration when actual text files are modified and not just the URL holding the path to the file in the config directory.
old traverse function code picks up a change when $PCP_PMDAS_DIR/openmetrics/config.d/metric.url is modified
new traverse function code picks up a change when $PCP_PMDAS_DIR/openmetrics/config.d/metric.url is modified AND picks up a change when the contents under metric.url (say /some/path/metric.txt) so the contents in metric.txt is modified
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see how we can't treat files differently from curl responses, but it seems imperative that the code picks up modifications in the text files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lmchilton hi Lauren, yes that's right, we certainly need to detect modifications (in both the text files and network stream mode). But, looking at the file modification time is not the way to go here - there is no modification time for network traffic - we need a mechanism that can tell from the data itself whether a metric has been removed.
The approach we use in other places (such as the pmdaCache(3) interfaces), is an algorithm more like this:
- initially mark all metrics as deleted for the source we're currently updating
- as we scan through and encounter each refreshed metric in the current update, mark that metric as present
- at the end of the scan, any metrics that remain not yet marked as present are removed from the namespace and no updated values returned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@natoscott OK got it- thank you for clarifying
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a bit more love, as per comments.
addresses multiple qa test failures. all qa tests now pass, and updated test performancecopilot#1976 checks for metric removal from a modified source that still persists in the config directory.
7267dfe
to
7b201d7
Compare
addresses multiple qa test failures. all qa tests
now pass, and updated test #1976 checks for metric removal from a modified source that still persists in the config directory.