Feat - cron queue for master commits #2163
Conversation
Thank you! I haven't tested it locally yet because it didn't compile (some things need to be shuffled around/fixed, I think), but I left a couple of comments. I'll test it then. Never mind, that was some Cargo bug.

Yup, guessed it right 😆
Force-pushed from c112c30 to 4b373d2
Force-pushed from fe73bd9 to 4fca54f
Force-pushed from 4fca54f to cac2e28
site/src/job_queue.rs (outdated)

```rust
let ctxt = site_ctxt.clone();
let mut interval = time::interval(Duration::from_secs(seconds));

if let Some(ctxt_clone) = {
```
This condition needs to be inside the loop. First we should tick the interval, then check if the ctxt is filled, and then run the `cron_enqueue_jobs` function. Both because the context is missing when `cron_main` is called (so the cron doesn't run at all), and also because if we took the context here, we would be using an old context for the whole duration of the cron job.
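As a hedged sketch of the shape being suggested, assuming a tokio runtime and that the shared context lives in something like an `Arc<RwLock<Option<Arc<SiteCtxt>>>>` (the PR's exact types may differ; `SiteCtxt` and `cron_enqueue_jobs` are stubbed here):

```rust
use std::{sync::Arc, time::Duration};
use tokio::{sync::RwLock, time};

struct SiteCtxt; // stub for the PR's site context type

async fn cron_enqueue_jobs(_ctxt: &SiteCtxt) -> Result<(), String> {
    Ok(()) // stub for the PR's enqueue function
}

async fn cron_main(site_ctxt: Arc<RwLock<Option<Arc<SiteCtxt>>>>, seconds: u64) {
    let mut interval = time::interval(Duration::from_secs(seconds));
    loop {
        // Tick first, so the loop keeps waiting even while the context
        // has not been filled in yet.
        interval.tick().await;
        // Re-read the context on every iteration; taking it once before
        // the loop would pin an old context for the cron job's lifetime.
        let ctxt = site_ctxt.read().await.clone();
        if let Some(ctxt) = ctxt {
            if let Err(e) = cron_enqueue_jobs(&ctxt).await {
                eprintln!("cron job failed: {e}");
            }
        }
    }
}
```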
Ah! I think this commit should handle that: a145daa. I `.clone()` outside of the loop, which I think makes sense to increment the reference count. Then, as you say, inside the loop I moved `interval.tick()` to be the first call invoked.
The clone is not necessary, as you already hold a value of the `Arc` from the parameter, but it doesn't really matter.
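In other words (an illustrative sketch, not the PR's actual signature): a function that receives the `Arc` by value already owns a handle, so an extra clone only bumps the reference count.

```rust
use std::sync::Arc;

fn consume(ctxt: Arc<Vec<u32>>) {
    // `ctxt` is already an owned Arc handle; no clone is needed to use it.
    println!("{} entries", ctxt.len());
}

fn main() {
    let shared = Arc::new(vec![1, 2, 3]);
    consume(Arc::clone(&shared)); // the caller clones when handing one over
}
```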
Note: the `created_at` column is currently set to the date of the master commit. I guess it doesn't matter that much, as in most cases the row should be inserted into the table relatively soon after the commit is merged. But it's a bit weird, since we probably treat `created_at` as the time when the actual entry in the table was created.

That also reminded me that we will want to show the total duration of the benchmark run on the website. This duration was simple before, but it gets more complicated with the job queue and backfilling. Possibly later we might want to add a column to `benchmark_request` that will store the duration of the whole benchmark? Essentially the time from the start of the first job to the completion of the last job, I suppose. But maybe we can reuse `artifact_collection_duration` for that. Anyway, we can deal with that later.
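As a hedged sketch of the "first job start to last job completion" idea, assuming each request's jobs carry hypothetical `started_at`/`completed_at` timestamps (illustrative names, not the PR's schema):

```rust
use std::time::{Duration, SystemTime};

// Hypothetical job record standing in for the PR's queued-job rows.
struct Job {
    started_at: SystemTime,
    completed_at: SystemTime,
}

/// Duration of a whole benchmark request: from the start of its first
/// job to the completion of its last job. Returns `None` for no jobs.
fn total_duration(jobs: &[Job]) -> Option<Duration> {
    let start = jobs.iter().map(|j| j.started_at).min()?;
    let end = jobs.iter().map(|j| j.completed_at).max()?;
    end.duration_since(start).ok()
}
```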
Force-pushed from 8930a3b to a145daa
I'm fine with merging the current state, although I wouldn't enable it yet on production (it seems to work fine locally), because currently we take all master commits from June onwards and try to insert them all into the DB. This in fact issues 100+ insert queries (with 5 more each day) to the DB every 30 seconds, which seems a bit wasteful for the production server for now. I think that before enabling it on production, we should at least check that the inserted SHA is not in …
We should be shielded from anything happening by a check like:

```rust
let commits: Vec<Commits> = ctxt.index.commits();
if commits.iter().find(|c| ...).is_none() {
    // insert master commit
}
```

I'm happy with either. I think that, since we have a test showing items can get inserted into the database and at present we only have a simple insert operation, waiting to enable it in production seems OK?
Yeah, it shouldn't run by default. I just wanted to mention that I'll only enable the environment variable (or make the cron job run by default) once we can filter the existing commits quickly, to avoid the useless queries.
Thank you!
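A hedged sketch of the kind of fast filtering being discussed, assuming the set of already-known SHAs can be collected up front (types and names here are illustrative, not the PR's):

```rust
use std::collections::HashSet;

// Hypothetical stand-in for the PR's master-commit data.
struct MasterCommit {
    sha: String,
}

/// Keep only commits whose SHA is not already known, so the cron job
/// issues inserts only for genuinely new master commits instead of
/// re-inserting the whole backlog every interval.
fn new_commits(all: Vec<MasterCommit>, known_shas: &HashSet<String>) -> Vec<MasterCommit> {
    all.into_iter()
        .filter(|c| !known_shas.contains(&c.sha))
        .collect()
}
```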
With the `RUN_CRON` environment variable set, a cron job running every `QUEUE_UPDATE_INTERVAL_SECONDS` (30 seconds by default) would populate the `benchmark_requests` table with master commits.
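For illustration, a hedged sketch of how those variables might be read at startup; the variable names come from the PR, but the parsing code here is an assumption, not the PR's implementation:

```rust
// Assumed behavior: cron disabled unless RUN_CRON is set, and a
// 30-second interval unless QUEUE_UPDATE_INTERVAL_SECONDS overrides it.
fn cron_config() -> (bool, u64) {
    let run_cron = std::env::var("RUN_CRON").is_ok();
    let seconds = std::env::var("QUEUE_UPDATE_INTERVAL_SECONDS")
        .ok()
        .and_then(|s| s.parse::<u64>().ok())
        .unwrap_or(30);
    (run_cron, seconds)
}
```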