@vojtad vojtad commented Aug 18, 2025

  • Job latency is now measured as the difference between the time a job was scheduled to run and the time it actually started running.
  • Tests are included.

Fix #21

lewispb commented Oct 17, 2025

How far in advance do you schedule jobs? Given our largest histogram buckets are 30 minutes, 1 hour and 6 hours, how useful would the metric here really be?

I can see the utility in general, but we'd need a new metric with a new histogram bucket definition, perhaps measured in hours / days?

vojtad commented Oct 17, 2025

The jobs we use a wait time for are usually scheduled 5 to 30 minutes in the future to spread the load on resources.

However, for us this metric is about how long a job waits between when it was supposed to be executed and when it was actually executed. For this, Yabeda::ActiveJob::LONG_RUNNING_JOB_RUNTIME_BUCKETS seems just fine to me. If I understand this correctly, it doesn't depend on the wait time the job was scheduled with, does it?

lewispb commented Oct 17, 2025

Sorry, I misunderstood and thought you were looking to monitor the time between when a job was originally scheduled and when it was executed.

In that case it makes sense. Although I'm finding the Active Job notifications for scheduled jobs a bit confusing. Rhetorical question: why aren't they scheduled -> enqueued -> performed?

vojtad commented Oct 17, 2025

To me it makes sense. However, I am not a native English speaker and I could be missing something.

  1. enqueued_at is the time when the job was put in the queue
  2. scheduled_at is the time when the job is scheduled to run
  3. event.end for the perform_start.active_job event is the time when the queue adapter started processing the job
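
A minimal sketch of how these three timestamps could combine into the latency discussed here. This is plain Ruby for illustration only, not the code from this PR: the helper name start_latency, the fallback to enqueued_at for jobs without a scheduled time, and the clamp at zero are all assumptions.

```ruby
# Hypothetical helper, not the implementation from this PR: seconds between
# the moment a job was due to run and the moment the adapter started it.
def start_latency(enqueued_at:, started_at:, scheduled_at: nil)
  # Assumption: a job enqueued without a wait has no scheduled_at,
  # so it is treated as due at enqueue time.
  due_at = scheduled_at || enqueued_at
  # Assumption: clamp at zero so clock skew between hosts never
  # produces a negative latency.
  [started_at - due_at, 0.0].max
end

enqueued = Time.utc(2025, 10, 17, 12, 0, 0)

# Scheduled 5 minutes ahead, picked up 2 seconds after it was due.
short_wait = start_latency(enqueued_at: enqueued,
                           scheduled_at: enqueued + 300,
                           started_at: enqueued + 302)

# Scheduled 30 minutes ahead, also picked up 2 seconds after it was due.
long_wait = start_latency(enqueued_at: enqueued,
                          scheduled_at: enqueued + 1800,
                          started_at: enqueued + 1802)

puts short_wait
puts long_wait
```

Both calls yield the same latency, which is why the length of the scheduled wait should not affect which histogram buckets are needed.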

Another potentially interesting metric would be the latency from the job's scheduled_at to when the job finished. The perform.active_job event could be used for that.
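
As a rough illustration of that second metric, the sketch below measures from the due time to the time the job finished. It is plain Ruby with a hypothetical helper name (completion_latency); it assumes finished_at would be taken from the end of the perform.active_job event and that jobs without a scheduled time are due at enqueued_at.

```ruby
# Hypothetical helper: seconds from when a job was due until it finished.
# Assumption: finished_at corresponds to the end of the perform.active_job event.
def completion_latency(enqueued_at:, finished_at:, scheduled_at: nil)
  # Assumption: jobs without a scheduled time are due when they are enqueued.
  due_at = scheduled_at || enqueued_at
  finished_at - due_at
end

enqueued  = Time.utc(2025, 10, 17, 12, 0, 0)
scheduled = enqueued + 300   # due 5 minutes after enqueueing
started   = scheduled + 2    # picked up 2 seconds late
finished  = started + 10     # the job itself ran for 10 seconds

total = completion_latency(enqueued_at: enqueued,
                           scheduled_at: scheduled,
                           finished_at: finished)
puts total
```

Unlike the start latency, this value also includes the job's own runtime, so it would grow with slow jobs as well as with queue delays.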

Development

Successfully merging this pull request may close these issues.

activejob_latency metric measuring wrong latency
