Rewrite score calculation in Rust #25

Open
nicoburns opened this issue Jul 4, 2024 · 5 comments
Labels: enhancement (New feature or request), good first issue (Good for newcomers)

@nicoburns (Contributor)

Score recalc is currently pretty slow: slow enough that I left it running, came back to it several times, and it still wasn't done. We should consider rewriting it in Rust for better performance. Such an implementation could:

  • Use streaming parsing for the JSON (and LZMA decompression) via a custom serde Deserialize implementation, avoiding allocating huge arrays of test results. This could be done both for top-level tests and sub-tests (see the sketch after this list).
  • Avoid allocating test name strings entirely by matching directly against the values in the raw file (which will never include escaped characters).
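
A minimal sketch of the streaming idea, assuming the serde (with derive) and serde_json crates and a simplified wptreport-style layout; the field names, status strings, and the aggregate computed here are illustrative rather than the real schema:

```rust
use serde::de::{Deserializer, SeqAccess, Visitor};
use serde::Deserialize;
use std::fmt;

#[derive(Deserialize)]
struct TestResult {
    status: String,
    // Other fields (test name, subtests, ...) are ignored by serde here; in a
    // real implementation they would be folded into the aggregate the same way.
}

/// Aggregate built while streaming the "results" array, instead of
/// materialising a Vec<TestResult> for the whole run.
struct PassCounts {
    passed: u64,
    total: u64,
}

struct PassCountsVisitor;

impl<'de> Visitor<'de> for PassCountsVisitor {
    type Value = PassCounts;

    fn expecting(&self, f: &mut fmt::Formatter) -> fmt::Result {
        f.write_str("an array of test results")
    }

    fn visit_seq<A: SeqAccess<'de>>(self, mut seq: A) -> Result<Self::Value, A::Error> {
        let mut counts = PassCounts { passed: 0, total: 0 };
        // Each element is deserialized, counted, and dropped; the whole array
        // is never held in memory at once.
        while let Some(result) = seq.next_element::<TestResult>()? {
            counts.total += 1;
            // "PASS"/"OK" here are placeholder status values.
            if result.status == "PASS" || result.status == "OK" {
                counts.passed += 1;
            }
        }
        Ok(counts)
    }
}

impl<'de> Deserialize<'de> for PassCounts {
    fn deserialize<D: Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {
        deserializer.deserialize_seq(PassCountsVisitor)
    }
}

#[derive(Deserialize)]
struct Report {
    // The custom Deserialize above kicks in here, so the "results" array is
    // streamed rather than collected.
    results: PassCounts,
}
```

With something like this, `serde_json::from_reader::<_, Report>(reader)` would produce the aggregate directly from the (already decompressed) byte stream; the same visitor approach applies one level down for sub-tests, and matching name bytes directly against known values (rather than building a String per test) would be a compatible refinement for the second bullet.
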
nicoburns added the enhancement and good first issue labels on Jul 4, 2024
@mukilan (Member) commented on Jul 4, 2024

I agree, rewriting in Rust makes sense. Originally we were planning to have the dashboard show the scores for just one year, with a fresh graph for each year like the Interop dashboard, but now that we have a single graph there is a lot of data to process.

Another option is to truncate and archive data beyond a cutoff date (which is what I do when testing changes locally).

@mukilan (Member) commented on Jul 4, 2024

Also, at some point we'll be dropping the data for Layout 2013.

@mrego (Member) commented on Jul 30, 2024

We could also split the chart and show only 2024 (or only the last 12 months) in the main view at wpt.servo.org. We could even keep the old data around for future reference at something like wpt.servo.org/2023, if that helps to alleviate things and isn't too complex to implement.

@nicoburns (Contributor, Author)

I've looked into this a little more, and I don't think RIIR (rewriting it in Rust) will be a huge performance win for computing historical data, although I think it could still be a nice improvement for code quality / maintainability.

The performance bottleneck seems to be xz (de)compression, which uses a C library in Node and is much the same speed in Rust. I'm getting ~1.5s just to decompress and deserialize an xz-compressed "run score" JSON file (as stored in this repo), or ~400ms if operating on an already-decompressed JSON file, and 1.5s × 600+ files is going to be slow no matter what.
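
For reference, a sketch of how that measurement might look in Rust, assuming the xz2 and serde_json crates and using serde_json::Value as a stand-in for the real score-file schema:

```rust
use std::fs::File;
use std::io::BufReader;
use std::time::Instant;
use xz2::read::XzDecoder;

fn load_run_scores(path: &str) -> Result<serde_json::Value, Box<dyn std::error::Error>> {
    let start = Instant::now();
    let file = BufReader::new(File::open(path)?);
    // Bytes are decompressed lazily as serde_json pulls them, so the
    // decompressed JSON is never buffered as one big string before parsing.
    let decoder = XzDecoder::new(file);
    let scores: serde_json::Value = serde_json::from_reader(BufReader::new(decoder))?;
    eprintln!("decoded {path} in {:?}", start.elapsed());
    Ok(scores)
}
```
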

I think there are a couple of things which would make a big difference here:

  1. Compute (and store) upfront scores for:

    • All possible categories (corresponding to subfolder paths) rather than only the defined "focus areas"
    • All known scoring methods (sketched after this list):
      • 1 if all subtests pass, 0 otherwise
      • 1 for each subtest
      • 1/num_subtests

    This would mean that we'd only need to recompute scores if the scoring methodology changes.

  2. Assign each test name an ID and only store IDs in the score files. The bulk of the score files are test names, so if we could store those once in total, rather than once per file, that would save a bunch of space (and presumably parsing time). This is a bit awkward given our flat-file storage model (we would essentially be manually maintaining a relational database), and the fact that new tests are added over time and we'd need to maintain a consistent mapping across scoring runs. (This would work a lot better if we stored scores in a SQL database of some kind.) There's a rough sketch of the idea after this list.

    This one might make sense if it were a cross-project effort at the WPT level?
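
To make the scoring methods in point 1 concrete, here is a minimal sketch; the per-subtest interpretation (counting passing subtests, with each pass contributing 1/num_subtests in the fractional method) is my reading of the list above rather than settled behaviour:

```rust
// Sketch only: assumes each test reduces to a boolean pass/fail per subtest;
// how tests without subtests are handled is left open.
fn score_all_or_nothing(subtest_passes: &[bool]) -> f64 {
    // 1 if all subtests pass, 0 otherwise.
    if subtest_passes.iter().all(|&p| p) { 1.0 } else { 0.0 }
}

fn score_per_subtest(subtest_passes: &[bool]) -> f64 {
    // 1 for each passing subtest.
    subtest_passes.iter().filter(|&&p| p).count() as f64
}

fn score_fractional(subtest_passes: &[bool]) -> f64 {
    // Each passing subtest contributes 1/num_subtests, so a test scores the
    // fraction of its subtests that pass.
    if subtest_passes.is_empty() {
        return 0.0;
    }
    let passed = subtest_passes.iter().filter(|&&p| p).count();
    passed as f64 / subtest_passes.len() as f64
}
```
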
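
And a hypothetical sketch of the ID-mapping idea in point 2: a single shared name table, persisted once and reused across score files, with IDs kept stable across scoring runs by only ever appending. The `TestNameTable` name and shape are illustrative, not an existing design:

```rust
use std::collections::HashMap;

#[derive(Default, serde::Serialize, serde::Deserialize)]
struct TestNameTable {
    // Index in this Vec is the test's ID; stored once globally.
    names: Vec<String>,
    // Not persisted; rebuilt from `names` after loading.
    #[serde(skip)]
    index: HashMap<String, u32>,
}

impl TestNameTable {
    /// Rebuild the lookup index after loading `names` from disk.
    fn rebuild_index(&mut self) {
        self.index = self
            .names
            .iter()
            .enumerate()
            .map(|(i, n)| (n.clone(), i as u32))
            .collect();
    }

    /// Return the existing ID for `name`, or append it and assign the next one.
    /// Only ever appending keeps IDs stable across scoring runs.
    fn intern(&mut self, name: &str) -> u32 {
        if let Some(&id) = self.index.get(name) {
            return id;
        }
        let id = self.names.len() as u32;
        self.names.push(name.to_owned());
        self.index.insert(name.to_owned(), id);
        id
    }
}
```
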

I am tentatively planning to:

  • Proceed with 1, based on a Rust rewrite of the scoring (and other processing) code.
  • Split out a general-purpose "wptreport" crate with basic struct definitions for WPT reports (roughly sketched after this list), which I also intend to use for generating reports in "wptreport" format in Blitz's WPT runner, and which we could use for further Servo tooling in future as required.
  • Add a CLI interface to the scoring code for ad-hoc scoring of local WPT reports
  • Potentially also look into HTML report generation and/or an interactive drill-down viewer similar to the one on wpt.fyi
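
For the "wptreport" crate, the core types might start out something like the following; the field names follow my understanding of the wptreport JSON format (as produced by `wpt run --log-wptreport`) and would need checking against real reports:

```rust
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
pub struct WptReport {
    pub run_info: serde_json::Value,
    pub results: Vec<TestResult>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct TestResult {
    pub test: String,
    pub status: String,
    #[serde(default)]
    pub subtests: Vec<SubtestResult>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct SubtestResult {
    pub name: String,
    pub status: String,
    #[serde(default)]
    pub message: Option<String>,
}
```
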

@mukilan (Member) commented on Dec 17, 2024

  1. Compute (and store) upfront scores for:

The scores for past runs change because we score all past runs against the current day's run. This is similar to what the Interop dashboard does to account for the addition and deletion of tests and subtests. Is the proposal here to have pre-computed scores as an optimization for the case where the tests & subtests in an "area" have not changed between runs, or just to not update the scores of past runs?

the fact that new tests are added over time and we'd need to maintain a consistent mapping across scoring runs.

The MANIFEST.json in the meta folder in the servo repo is a mapping from tests to hashes based on test contents (I think, I might be wrong). Perhaps that can be exploited here? Just an idea.

This one might make sense if it was a cross-project effort at the WPT level?

One thing that I like about the current setup is that it is dead simple to deploy and has no dependencies like a database or server. I'm not against moving to a more complicated setup for Servo if we can gain a lot of new features. But at that point, I'm not sure if the whole thing should be a wpt.fyi 2.0 kind of project.
