
Discrepancy in data on Grafana vs result-summary for stable-rt #745

Closed
musamaanjum opened this issue Aug 7, 2024 · 11 comments
Comments

@musamaanjum (Contributor)

Maintainers are already using the Grafana dashboard. There was a report that preempt_rt config builds are missing (kernelci/kernelci-core#2397 (comment)). I've investigated and found that the build data is visible in the results obtained from result-summary.

https://grafana.kernelci.org/d/OKXc44EIz/home?orgId=1&var-origin=maestro&var-tree=stable-rt&var-branch=All&var-test_path_regex=%25&var-platform=%25&var-config=%25&var-datasource=cdmoe4lcafu2od

I'll attach the results file from result-summary below in the comments as it isn't attached here.

The discrepancies are as follows:

  1. The date is different on both. Grafana shows 2024-08-07 while result-summary shows 2024-08-06.
  2. The preempt_rt jobs aren't present on Grafana and preempt_rt doesn't appear in the config column. The preempt_rt jobs are probably missing.
  3. Grafana only shows results for the v6.6.44-rt39 branch; the other 2-3 branches are missing.

cc: @nuclearcat @padovan

@musamaanjum (Contributor, Author)

stable-rt.html.log

@helen-fornazier I'm unable to assign this issue to you. Please have a look at what is causing the discrepancy.

@helen-fornazier (Contributor)

I see all these branches for stable-rt:

[screenshot of the stable-rt branch list]

What is missing?

But indeed, I wasn't able to find node 66abc518e49a7366b292a076 in KCIDB, for instance (which is present in the report you sent). @JenySadadia, could you check please?

Also, shouldn't these node_timeouts be a MISS?

[screenshot of node_timeout results]

@helen-fornazier (Contributor)

About the MISS: I just noticed these are build errors; we need kernelci/kcidb-io#82.

@musamaanjum (Contributor, Author)

But indeed, I wasn't able to find node 66abc518e49a7366b292a076 in KCIDB, for instance (which is present in the report you sent). @JenySadadia, could you check please?

@helen-fornazier @JenySadadia This is my only concern at this time. The data should have been the same at both places.

@JenySadadia (Collaborator)

But indeed, I wasn't able to find node 66abc518e49a7366b292a076 in KCIDB, for instance (which is present in the report you sent). @JenySadadia, could you check please?

Yes, I am unable to find https://staging.kernelci.org:9000/viewer?node_id=66abc518e49a7366b292a076 on the KCIDB dashboard. But it is present in the new Grafana dashboard, right?
If so, maestro did send the data and the KCIDB dashboard is somehow not showing it.
Could you please check? @spbnick

@JenySadadia (Collaborator)

I checked the staging logs.
Maestro didn't submit this entry. Then how did it reach the new dashboard?
Is there any other source submitting maestro data to it? @helen-fornazier

@helen-fornazier (Contributor)

helen-fornazier commented Aug 13, 2024

Let me clarify things.

About 66abc518e49a7366b292a076: why is it not in KCIDB? Why didn't maestro submit it? Why do we have this inconsistency? (cc @JenySadadia)

[two screenshots]

@JenySadadia (Collaborator)

Hello @helen-fornazier @musamaanjum

I analyzed the staging logs and found the root cause.
From the logs, kcidb bridge service crashed on 08/01/2024 06:17:29 PM UTC and restarted on 08/02/2024 12:16:55 AM UTC.

The node https://staging.kernelci.org:9000/viewer?node_id=66abc518e49a7366b292a076 was updated at 2024-08-01 08:08:57 PM UTC.
That's why we lost the node-updated event from the API, as the bridge service was not running at that time. Hence, the KCIDB submission is missing for the node.

This issue has been partially taken care of by a patch that auto-restarts all the pipeline services after a crash.
The patch has been merged and deployed on 2nd Aug.
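
For context, this kind of fix typically amounts to a restart policy on the service containers. The snippet below is only a sketch, assuming the pipeline services are managed with docker-compose; the service and image names are illustrative, not taken from the actual patch:

```yaml
# Illustrative only: a docker-compose restart policy like this brings a
# crashed service (e.g. the kcidb bridge) back up automatically instead of
# leaving it down until someone notices. Names here are assumptions.
services:
  kcidb-bridge:                      # hypothetical service name
    image: kernelci/pipeline:latest  # hypothetical image name
    restart: on-failure              # restart the container automatically after a crash
```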

@musamaanjum (Contributor, Author)

musamaanjum commented Aug 15, 2024

I've checked stable-rt. There hasn't been any update for 8 days. Let's wait to see if we get correct and coherent results on Grafana on the next run.

@crazoes

crazoes commented Oct 1, 2024

@musamaanjum @helen-fornazier @JenySadadia can we close this task if it has been resolved?

@musamaanjum (Contributor, Author)

Grafana has been working fine for quite some time. Closing the ticket.
