-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NETOBSERV-1911 CLI metrics #106
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #106 +/- ##
=======================================
Coverage 22.66% 22.66%
=======================================
Files 10 10
Lines 1337 1337
=======================================
Hits 303 303
Misses 1015 1015
Partials 19 19
Flags with carried forward coverage won't be shown. Click here to find out more. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of comments, no big blocker (just worrying in case things go wrong when stopping capture, trying to imagine some worst case scenario, what would happen)
Dockerfile
Outdated
@@ -24,6 +24,12 @@ COPY .mk/ .mk/ | |||
# Build collector | |||
RUN GOARCH=$TARGETARCH make compile | |||
|
|||
# Install oc to allow collector to run commands |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess these changes now need to be mirrored into Dockerfile.downstream
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm .. am I missing something? I don't see it in Dockerfile.downstream :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in #139
cmd/root.go
Outdated
resetTerminal() | ||
out, err := exec.Command("/oc-netobserv", "stop").Output() | ||
if err != nil { | ||
log.Fatal(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if there's more we could do here if the command failed. I'm afraid not stopping the capture may have quite bad consequences. Could we find another approach to kill everything? Or do some retry? (perhaps by setting captureEnded back to false?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can put a loop and retry every X seconds indeed.
The only alternative way I have in mind would be to create a Job outside of this collector Pod to invoke the oc delete daemonset command... However, the job will need to get from somewhere the current namespace + the capture ended trigger so I'm not sure it's better somehow.
/ok-to-test |
New image: It will expire after two weeks. To use this build, update your commands using: USER=netobserv VERSION=0c2012f make commands |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jpinsonneau I tested this, I think labelling the CLI namespace to have openshift.io/cluster-monitoring: "true" is missing, I could only get metrics to be scraped after I added that label. Not sure if that's going to be only downstream thing.
Co-authored-by: Joel Takvorian <[email protected]>
Co-authored-by: Joel Takvorian <[email protected]>
New changes are detected. LGTM label has been removed. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
That's right @memodi. I have rebased and forced the label in the namespace creation. Thanks ! |
/label qe-approved |
Description
Add metrics capture to CLI. Use:
to run a capture and generate a dashboard.
Dependencies
Based on #103
Checklist
If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.