Conversation

@jianlinliu
Contributor

@jianlinliu jianlinliu commented Nov 3, 2025

Extract operator Progressing/Degraded counts and timing from intervals, collect them, and save them to an auto data loader (autodl) JSON file for historical analysis.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference label (indicates that this PR references a valid Jira ticket of any type) Nov 3, 2025
@openshift-ci-robot

openshift-ci-robot commented Nov 3, 2025

@jianlinliu: This pull request references TRT-2254 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from p0lyn0mial and sjenning November 3, 2025 02:54
@openshift-ci
Contributor

openshift-ci bot commented Nov 3, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jianlinliu
Once this PR has been reviewed and has the lgtm label, please assign smg247 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-trt

openshift-trt bot commented Nov 3, 2025

Job Failure Risk Analysis for sha: f01fda0

Job Name | Failure Risk
pull-ci-openshift-origin-main-e2e-gcp-csi | IncompleteTests: tests for this run (106) are below the historical average (1798)
pull-ci-openshift-origin-main-e2e-gcp-ovn | IncompleteTests: tests for this run (105) are below the historical average (3244)
pull-ci-openshift-origin-main-e2e-gcp-ovn-upgrade | IncompleteTests: tests for this run (106) are below the historical average (1801)
pull-ci-openshift-origin-main-e2e-metal-ipi-ovn-ipv6 | IncompleteTests: tests for this run (101) are below the historical average (3006)
pull-ci-openshift-origin-main-e2e-vsphere-ovn | IncompleteTests: tests for this run (103) are below the historical average (3313)
pull-ci-openshift-origin-main-e2e-vsphere-ovn-upi | IncompleteTests: tests for this run (103) are below the historical average (3351)

IncompleteTests means not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems.

@openshift-trt

openshift-trt bot commented Nov 3, 2025

Risk analysis has seen new tests most likely introduced by this PR.
Please ensure that new tests meet guidelines for naming and stability.

New tests seen in this PR at sha: 11e2b1c

  • "[Monitor:operator-state-metrics-analyzer][Jira:"Test Framework"] monitor test operator-state-metrics-analyzer cleanup" [Total: 12, Pass: 12, Fail: 0, Flake: 0]
  • "[Monitor:operator-state-metrics-analyzer][Jira:"Test Framework"] monitor test operator-state-metrics-analyzer collection" [Total: 12, Pass: 12, Fail: 0, Flake: 0]
  • "[Monitor:operator-state-metrics-analyzer][Jira:"Test Framework"] monitor test operator-state-metrics-analyzer interval construction" [Total: 12, Pass: 12, Fail: 0, Flake: 0]
  • "[Monitor:operator-state-metrics-analyzer][Jira:"Test Framework"] monitor test operator-state-metrics-analyzer preparation" [Total: 12, Pass: 12, Fail: 0, Flake: 0]
  • "[Monitor:operator-state-metrics-analyzer][Jira:"Test Framework"] monitor test operator-state-metrics-analyzer setup" [Total: 12, Pass: 12, Fail: 0, Flake: 0]
  • "[Monitor:operator-state-metrics-analyzer][Jira:"Test Framework"] monitor test operator-state-metrics-analyzer test evaluation" [Total: 12, Pass: 12, Fail: 0, Flake: 0]
  • "[Monitor:operator-state-metrics-analyzer][Jira:"Test Framework"] monitor test operator-state-metrics-analyzer writing to storage" [Total: 12, Pass: 12, Fail: 0, Flake: 0]

@jianlinliu
Contributor Author

/payload-job periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade

@openshift-ci
Contributor

openshift-ci bot commented Nov 4, 2025

@jianlinliu: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/fe5001a0-b94a-11f0-82fa-e1d6c7ee712e-0

@openshift-ci-robot

openshift-ci-robot commented Nov 5, 2025

@jianlinliu: This pull request references TRT-2254 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

In response to this:

Extract operator Progressing/Degraded counts and timing from intervals, collect them, and save them to an auto data loader (autodl) JSON file for historical analysis.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@jianlinliu
Contributor Author

/test unit

@jianlinliu
Contributor Author

/test e2e-aws-ovn-microshift-serial

@jianlinliu
Contributor Author

/payload-job periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade

@openshift-ci
Contributor

openshift-ci bot commented Nov 5, 2025

@jianlinliu: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/3a563760-b9ea-11f0-8835-258702823538-0

@jianlinliu
Contributor Author

/test e2e-gcp-ovn

@openshift-ci
Contributor

openshift-ci bot commented Nov 5, 2025

@jianlinliu: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
ci/prow/e2e-gcp-ovn | c5b8028 | link | true | /test e2e-gcp-ovn
ci/prow/e2e-aws-ovn-serial-2of2 | c5b8028 | link | true | /test e2e-aws-ovn-serial-2of2

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@jianlinliu
Contributor Author

jianlinliu commented Nov 5, 2025

The metrics autodl JSON file was generated as expected.

if len(metrics) > 0 {
	rows := generateRowsFromMetrics(metrics)
	dataFile := dataloader.DataFile{
		TableName: "operator_state_metrics",
Contributor

I'm wondering whether, rather than the generic "Metric", we should have defined "Count", "TotalSeconds", and maybe "MinSeconds" and "MaxSeconds" in place of "IndividualDurationSeconds". I'll see if others have thoughts on this.

Contributor

I think the consensus was to make this a single row per operator/condition, tracking "Count", "TotalSeconds", and "MaxIndividualDurationSeconds".

if err := dataloader.WriteDataFile(fileName, dataFile); err != nil {
	return fmt.Errorf("failed to write operator state metrics: %w", err)
}
fmt.Printf("--->Write operator state metrics to %s successfully.\n", fileName)
Contributor

I'd encourage using logrus.Infof for this: the syntax is clean, and it ensures we get timestamps for debugging purposes.

Contributor Author

Ah, that line was actually added for debugging. Sure, I will update it to use logrus.Infof.

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

4 participants