TRT-2254: extract Operator Progressing / Degraded Counts and Timing #30449
|
@jianlinliu: This pull request references TRT-2254 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. In response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: jianlinliu The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Job Failure Risk Analysis for sha: f01fda0
|
|
Risk analysis has seen new tests most likely introduced by this PR. New tests seen in this PR at sha: 11e2b1c
|
|
/payload-job periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade |
|
@jianlinliu: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/fe5001a0-b94a-11f0-82fa-e1d6c7ee712e-0 |
|
/test unit |
|
/test e2e-aws-ovn-microshift-serial |
|
/payload-job periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade |
|
@jianlinliu: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/3a563760-b9ea-11f0-8835-258702823538-0 |
|
/test e2e-gcp-ovn |
|
@jianlinliu: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
The metrics autodl JSON file was generated as expected. |
if len(metrics) > 0 {
	rows := generateRowsFromMetrics(metrics)
	dataFile := dataloader.DataFile{
		TableName: "operator_state_metrics",
I'm wondering if instead of the generic "Metric" we should have defined "Count", "TotalSeconds" and maybe "MinSeconds" and "MaxSeconds" instead of "IndividualDurationSeconds". Will see if others have thoughts on this.
I think the consensus was to make this a single row per operator/condition tracking
"Count", "TotalSeconds" and "MaxIndividualDurationSeconds"
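For reference, the single-row-per-operator/condition shape discussed above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `ConditionInterval` and `ConditionSummary` types and the `Summarize` function are hypothetical names, and the input is assumed to already be reduced to one duration in seconds per interval.

```go
package main

import "sort"

// ConditionInterval is a hypothetical stand-in for one interval during which
// an operator reported a condition (for example Progressing or Degraded).
type ConditionInterval struct {
	Operator  string
	Condition string
	Seconds   float64
}

// ConditionSummary is one row per operator/condition pair, carrying the three
// values named in the review: Count, TotalSeconds, MaxIndividualDurationSeconds.
type ConditionSummary struct {
	Operator                     string
	Condition                    string
	Count                        int
	TotalSeconds                 float64
	MaxIndividualDurationSeconds float64
}

// Summarize folds the intervals into one summary per operator/condition pair.
// Rows are sorted by operator, then condition, so the output is deterministic.
func Summarize(intervals []ConditionInterval) []ConditionSummary {
	type key struct{ operator, condition string }
	byKey := map[key]*ConditionSummary{}

	for _, iv := range intervals {
		k := key{iv.Operator, iv.Condition}
		s, ok := byKey[k]
		if !ok {
			s = &ConditionSummary{Operator: iv.Operator, Condition: iv.Condition}
			byKey[k] = s
		}
		s.Count++
		s.TotalSeconds += iv.Seconds
		if iv.Seconds > s.MaxIndividualDurationSeconds {
			s.MaxIndividualDurationSeconds = iv.Seconds
		}
	}

	out := make([]ConditionSummary, 0, len(byKey))
	for _, s := range byKey {
		out = append(out, *s)
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Operator != out[j].Operator {
			return out[i].Operator < out[j].Operator
		}
		return out[i].Condition < out[j].Condition
	})
	return out
}
```

Each returned `ConditionSummary` would then map to one row in the data file, in place of separate generic "Metric" rows.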
if err := dataloader.WriteDataFile(fileName, dataFile); err != nil {
	return fmt.Errorf("failed to write operator state metrics: %w", err)
}
fmt.Printf("--->Write operator state metrics to %s successfully.\n", fileName)
I like to encourage the use of logrus.Infof for this: the syntax is clean and it ensures we get timestamps for debugging purposes.
Ah, that line was actually added for debugging. Sure, I will update it to use logrus.Infof.
Extract operator Progressing / Degraded counts and timing from intervals, collect them, and save them to an auto data loader (autodl) JSON file for historical analysis.