[Monitoring] Adding a metric for task outcome #4458

vitorguidi · 2024-11-27T18:10:18Z

Motivation

We currently have no metric that tracks the error rate of each task. This PR adds one: the error rate for a task is the sum of the metric with outcome=failure divided by the sum of the metric over all outcomes.
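As a rough illustration of that calculation, here is a minimal sketch. It assumes the metric can be read back as counts keyed by (task, outcome); the `error_rate` helper, the `counts` data, and the `success` outcome label are hypothetical and not part of this PR.

```python
from collections import Counter


def error_rate(counts, task):
    """Return the fraction of recorded runs of `task` whose outcome is 'failure'.

    `counts` maps (task, outcome) pairs to the number of recorded runs.
    This shape is an assumption made for illustration only.
    """
    total = sum(n for (t, _), n in counts.items() if t == task)
    failures = counts.get((task, 'failure'), 0)
    # Report 0.0 when the task has no recorded runs, to avoid dividing by zero.
    return failures / total if total else 0.0


# Made-up sample data: 90 successes and 10 failures for the 'fuzz' task.
counts = Counter({
    ('fuzz', 'success'): 90,
    ('fuzz', 'failure'): 10,
    ('minimize', 'success'): 5,
})
print(error_rate(counts, 'fuzz'))  # 0.1
```

In a real alerting setup the same ratio would be computed by the monitoring backend over a time window rather than in application code.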

This is useful for SLI alerting.

Part of #4271

jonathanmetzman

utask_main is run with utasks.uworker_bot_main. You might want to catch this too.

jonathanmetzman · 2024-11-27T20:36:02Z

src/clusterfuzz/_internal/bot/tasks/commands.py

  except BaseException:
    # On any other exceptions, update state to reflect error and re-raise.
    rate_limiter.record_task(success=False)
+    _emit_task_outcome_metric(task_name, job_name, 'failure')


This will record preprocess and postprocess as tasks if that's what we want.
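To make the placement under discussion concrete, here is a minimal sketch of emitting an outcome on both the success and the failure path of a task run. Everything in it is a stand-in: `TASK_OUTCOMES`, `emit_task_outcome`, and `run_with_outcome` are hypothetical names, the `success` label is assumed, and the actual `_emit_task_outcome_metric` in this PR may look different.

```python
from collections import Counter

# Stand-in for a real monitoring counter; purely illustrative.
TASK_OUTCOMES = Counter()


def emit_task_outcome(task_name, job_name, outcome):
    """Record one task run under the given outcome label."""
    TASK_OUTCOMES[(task_name, job_name, outcome)] += 1


def run_with_outcome(task_name, job_name, run):
    """Call `run()` and record 'success' or 'failure' exactly once."""
    try:
        result = run()
    except BaseException:
        # Mirror the diff above: record the failure, then re-raise.
        emit_task_outcome(task_name, job_name, 'failure')
        raise
    emit_task_outcome(task_name, job_name, 'success')
    return result
```

Because the outcome is recorded wherever this wrapper is applied, wrapping each stage separately would count preprocess and postprocess as their own runs, which is the trade-off raised in the comment above.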

vitorguidi added 2 commits November 27, 2024 18:08

Adding task outcome metric definition

19d045b

Emiting metric outcome metric

3db8f71

vitorguidi requested review from alhijazi, jonathanmetzman and oliverchang November 27, 2024 18:10

vitorguidi added 2 commits November 27, 2024 18:11

Fix lint

350a49b

Moving task outcome to correct position

872273a

jonathanmetzman reviewed Nov 27, 2024
