The Grafana dashboards here are outdated. It seems like they were last updated 2 years ago and that a lot of metrics referenced there have been renamed or are not being produced anymore.
I tried running make stats to generate an updated version, but I'm getting the following error:
Traceback (most recent call last):
File "/home/victor.delepine/.local/bin/generate-dashboard", line 8, in <module>
sys.exit(generate_dashboard_script())
File "/home/victor.delepine/.local/lib/python3.8/site-packages/grafanalib/_gen.py", line 242, in generate_dashboard_script
run_script(generate_dashboard)
File "/home/victor.delepine/.local/lib/python3.8/site-packages/grafanalib/_gen.py", line 80, in run_script
sys.exit(f(sys.argv[1:]))
File "/home/victor.delepine/.local/lib/python3.8/site-packages/grafanalib/_gen.py", line 223, in generate_dashboard
dashboard = loader(opts.dashboard)
File "/home/victor.delepine/.local/lib/python3.8/site-packages/grafanalib/_gen.py", line 74, in loader
raise DefinitionError(
grafanalib._gen.DefinitionError: Definition /tmp/flyte/stats/flytepropeller_dashboard.py does not define a variable '/tmp/flyte/stats/flytepropeller_dashboard'
make: *** [Makefile:62: stats] Error 1
(Please disregard the /tmp/ path, I was trying to check something else)
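One possible reading of the error above, offered as a guess from the message text rather than something confirmed against the grafanalib source: the variable name in the message is exactly the file path with its final `.py` removed. That would be consistent with a loader that takes the second-to-last dot-separated component of the path as the variable to look up, so that a file named `something.dashboard.py` is expected to define `dashboard`. The helper name below is hypothetical and only illustrates that assumed derivation:

```python
def expected_variable_name(path: str) -> str:
    """Hypothetical helper: mimic the ASSUMED lookup rule, where the variable
    name is the second-to-last dot-separated component of the file path."""
    return path.split(".")[-2]


# File name from the traceback: yields the odd variable name seen in the error.
print(expected_variable_name("/tmp/flyte/stats/flytepropeller_dashboard.py"))

# Same file renamed with a ".dashboard.py" suffix: yields "dashboard".
print(expected_variable_name("/tmp/flyte/stats/flytepropeller.dashboard.py"))
```

If this reading is right, renaming the definition files to end in `.dashboard.py`, or pinning an older grafanalib release, might let `make stats` run again. That would need to be checked against the installed grafanalib version.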
Examples of outdated metrics: all flyte:admin:database:postgres:repositories:* metrics are now under flyte:admin:admin:database:*
Most duration metrics, such as flyte:propeller:all:workflow:failure_duration_ms, now need "unlabeled" inserted before the unit suffix: flyte:propeller:all:workflow:failure_duration_unlabeled_ms.
Many Flyte Admin metrics need a second "admin" segment: flyte:admin:list_launch_plan:codes:OK becomes flyte:admin:admin:list_launch_plan:codes:OK
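The three renames described above can be sketched as a small helper for rewriting metric names in dashboard queries. This is a minimal sketch based only on the examples in this issue: the function name is hypothetical, and since the renames apply to "most" or "a lot of" metrics rather than all of them, each result would still need to be verified against the metrics Prometheus actually exposes.

```python
OLD_REPO_PREFIX = "flyte:admin:database:postgres:repositories:"
NEW_REPO_PREFIX = "flyte:admin:admin:database:"
ADMIN_PREFIX = "flyte:admin:"
DOUBLE_ADMIN_PREFIX = "flyte:admin:admin:"


def update_metric_name(name: str) -> str:
    """Hypothetical helper: apply the rename patterns reported in this issue."""
    # Repository metrics moved under flyte:admin:admin:database:*
    if name.startswith(OLD_REPO_PREFIX):
        name = NEW_REPO_PREFIX + name[len(OLD_REPO_PREFIX):]
    # Other Flyte Admin metrics gained a second "admin" segment.
    elif name.startswith(ADMIN_PREFIX) and not name.startswith(DOUBLE_ADMIN_PREFIX):
        name = DOUBLE_ADMIN_PREFIX + name[len(ADMIN_PREFIX):]
    # Duration metrics gained "unlabeled" before the "_ms" unit suffix.
    if name.endswith("_duration_ms"):
        name = name[: -len("_ms")] + "_unlabeled_ms"
    return name


for old in (
    "flyte:admin:database:postgres:repositories:*",
    "flyte:propeller:all:workflow:failure_duration_ms",
    "flyte:admin:list_launch_plan:codes:OK",
):
    print(old, "->", update_metric_name(old))
```

The helper is idempotent on the examples given, so running it over a dashboard definition that has already been partly updated should not double-apply a rename.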
Expected behavior
The Grafana dashboards should be in sync with the Prometheus metrics currently produced by the code in the repo.
Additional context to reproduce
No response
Screenshots
No response
Are you sure this issue hasn't been raised already?
Yes
Have you read the Code of Conduct?
Yes
eapolinario added the backlogged label and removed the untriaged label on Nov 2, 2023.