Rewrite score calculation in Rust #25

Open
nicoburns opened this issue Jul 4, 2024 · 5 comments
Labels: enhancement (New feature or request), good first issue (Good for newcomers)

@nicoburns (Contributor)

Score recalc is currently pretty slow: slow enough that I left it running, came back to it several times, and it still wasn't done. We should consider rewriting it in Rust for better performance. Such an implementation could:

  • Use streaming parsing for the JSON (and LZMA decompression) via a custom serde Deserialize implementation, avoiding allocating huge arrays of test results. This could be done both for top-level tests and sub-tests (see the sketch after this list).
  • Avoid allocating test name strings entirely by matching directly against the values in the raw file (which will never include escaped characters).
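
A minimal sketch of the streaming idea, assuming the serde (with derive) and serde_json crates and a simplified wptreport-style layout; the field names, status strings, and the aggregate computed here are illustrative rather than the real schema:

```rust
use serde::de::{Deserializer, SeqAccess, Visitor};
use serde::Deserialize;
use std::fmt;

#[derive(Deserialize)]
struct TestResult {
    status: String,
    // Other fields (test name, subtests, ...) are ignored by serde here; in a
    // real implementation they would be folded into the aggregate the same way.
}

/// Aggregate built while streaming the "results" array, instead of
/// materialising a Vec<TestResult> for the whole run.
struct PassCounts {
    passed: u64,
    total: u64,
}

struct PassCountsVisitor;

impl<'de> Visitor<'de> for PassCountsVisitor {
    type Value = PassCounts;

    fn expecting(&self, f: &mut fmt::Formatter) -> fmt::Result {
        f.write_str("an array of test results")
    }

    fn visit_seq<A: SeqAccess<'de>>(self, mut seq: A) -> Result<Self::Value, A::Error> {
        let mut counts = PassCounts { passed: 0, total: 0 };
        // Each element is deserialized, counted, and dropped; the whole array
        // is never held in memory at once.
        while let Some(result) = seq.next_element::<TestResult>()? {
            counts.total += 1;
            // "PASS"/"OK" here are placeholder status values.
            if result.status == "PASS" || result.status == "OK" {
                counts.passed += 1;
            }
        }
        Ok(counts)
    }
}

impl<'de> Deserialize<'de> for PassCounts {
    fn deserialize<D: Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {
        deserializer.deserialize_seq(PassCountsVisitor)
    }
}

#[derive(Deserialize)]
struct Report {
    // The custom Deserialize above kicks in here, so the "results" array is
    // streamed rather than collected.
    results: PassCounts,
}
```

With something like this, `serde_json::from_reader::<_, Report>(reader)` would produce the aggregate directly from the (already decompressed) byte stream; the same visitor approach applies one level down for sub-tests, and matching name bytes directly against known values (rather than building a String per test) would be a compatible refinement for the second bullet.
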
nicoburns added the enhancement and good first issue labels on Jul 4, 2024
@mukilan (Member) commented on Jul 4, 2024

I agree, rewriting in Rust makes sense. Originally we were planning to have the dashboard show the scores for just one year, with a fresh graph for each year like the Interop dashboard, but now that we have a single graph there is a lot of data to process.

Another option is to truncate and archive data beyond a cutoff date (which is what I do when testing changes locally).

@mukilan (Member) commented on Jul 4, 2024

Also, at some point we'll be dropping the data for Layout 2013.

@mrego (Member) commented on Jul 30, 2024

We could also split the chart and show only 2024 (or only the last 12 months) in the main view at wpt.servo.org. We could even keep the old data around for future reference at something like wpt.servo.org/2023, if that helps to alleviate things and isn't too complex to implement.

@nicoburns (Contributor, Author)

I've looked into this a little more, and I don't think RIIR (rewriting it in Rust) will be a huge performance win for computing historical data, although I think it could still be a nice improvement for code quality / maintainability.

The performance bottleneck seems to be xz (de)compression, which uses a C library in Node and is much the same speed in Rust. I'm getting ~1.5s just to decompress and deserialize an xz-compressed "run score" JSON file (as stored in this repo), or ~400ms if operating on an already-decompressed JSON file, and 1.5s × 600+ files is going to be slow no matter what.
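
For reference, a sketch of how that measurement might look in Rust, assuming the xz2 and serde_json crates and using serde_json::Value as a stand-in for the real score-file schema:

```rust
use std::fs::File;
use std::io::BufReader;
use std::time::Instant;
use xz2::read::XzDecoder;

fn load_run_scores(path: &str) -> Result<serde_json::Value, Box<dyn std::error::Error>> {
    let start = Instant::now();
    let file = BufReader::new(File::open(path)?);
    // Bytes are decompressed lazily as serde_json pulls them, so the
    // decompressed JSON is never buffered as one big string before parsing.
    let decoder = XzDecoder::new(file);
    let scores: serde_json::Value = serde_json::from_reader(BufReader::new(decoder))?;
    eprintln!("decoded {path} in {:?}", start.elapsed());
    Ok(scores)
}
```
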

I think there are a couple of things which would make a big difference here:

  1. Compute (and store) upfront scores for:

    • All possible categories (corresponding to subfolder paths) rather than only the defined "focus areas"
    • All known scoring methods (sketched after this list):
      • 1 if all subtests pass, 0 otherwise
      • 1 for each subtest
      • 1/num_subtests

    This would mean that we'd only need to recompute scores if the scoring methodology changes.

  2. Assign each test name an ID and only store IDs in the score files. The bulk of the score files are test names, so if we could store those once in total, rather than once per file, that would save a bunch of space (and presumably parsing time). This is a bit awkward given our flat-file storage model (we would essentially be manually maintaining a relational database), and the fact that new tests are added over time and we'd need to maintain a consistent mapping across scoring runs. (This would work a lot better if we stored scores in a SQL database of some kind.) There's a rough sketch of the idea after this list.

    This one might make sense if it were a cross-project effort at the WPT level?
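
To make the scoring methods in point 1 concrete, here is a minimal sketch; the per-subtest interpretation (counting passing subtests, with each pass contributing 1/num_subtests in the fractional method) is my reading of the list above rather than settled behaviour:

```rust
// Sketch only: assumes each test reduces to a boolean pass/fail per subtest;
// how tests without subtests are handled is left open.
fn score_all_or_nothing(subtest_passes: &[bool]) -> f64 {
    // 1 if all subtests pass, 0 otherwise.
    if subtest_passes.iter().all(|&p| p) { 1.0 } else { 0.0 }
}

fn score_per_subtest(subtest_passes: &[bool]) -> f64 {
    // 1 for each passing subtest.
    subtest_passes.iter().filter(|&&p| p).count() as f64
}

fn score_fractional(subtest_passes: &[bool]) -> f64 {
    // Each passing subtest contributes 1/num_subtests, so a test scores the
    // fraction of its subtests that pass.
    if subtest_passes.is_empty() {
        return 0.0;
    }
    let passed = subtest_passes.iter().filter(|&&p| p).count();
    passed as f64 / subtest_passes.len() as f64
}
```
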
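
And a hypothetical sketch of the ID-mapping idea in point 2: a single shared name table, persisted once and reused across score files, with IDs kept stable across scoring runs by only ever appending. The `TestNameTable` name and shape are illustrative, not an existing design:

```rust
use std::collections::HashMap;

#[derive(Default, serde::Serialize, serde::Deserialize)]
struct TestNameTable {
    // Index in this Vec is the test's ID; stored once globally.
    names: Vec<String>,
    // Not persisted; rebuilt from `names` after loading.
    #[serde(skip)]
    index: HashMap<String, u32>,
}

impl TestNameTable {
    /// Rebuild the lookup index after loading `names` from disk.
    fn rebuild_index(&mut self) {
        self.index = self
            .names
            .iter()
            .enumerate()
            .map(|(i, n)| (n.clone(), i as u32))
            .collect();
    }

    /// Return the existing ID for `name`, or append it and assign the next one.
    /// Only ever appending keeps IDs stable across scoring runs.
    fn intern(&mut self, name: &str) -> u32 {
        if let Some(&id) = self.index.get(name) {
            return id;
        }
        let id = self.names.len() as u32;
        self.names.push(name.to_owned());
        self.index.insert(name.to_owned(), id);
        id
    }
}
```
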

I am tentatively planning to:

  • Proceed with 1, based on a Rust rewrite of the scoring (and other processing) code.
  • Split out a general-purpose "wptreport" crate with basic struct definitions for WPT reports (roughly sketched after this list), which I also intend to use for generating reports in "wptreport" format in Blitz's WPT runner, and which we could use for further Servo tooling in future as required.
  • Add a CLI interface to the scoring code for ad-hoc scoring of local WPT reports
  • Potentially also look into HTML report generation and/or an interactive drill-down viewer similar to the one on wpt.fyi
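
For the "wptreport" crate, the core types might start out something like the following; the field names follow my understanding of the wptreport JSON format (as produced by `wpt run --log-wptreport`) and would need checking against real reports:

```rust
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
pub struct WptReport {
    pub run_info: serde_json::Value,
    pub results: Vec<TestResult>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct TestResult {
    pub test: String,
    pub status: String,
    #[serde(default)]
    pub subtests: Vec<SubtestResult>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct SubtestResult {
    pub name: String,
    pub status: String,
    #[serde(default)]
    pub message: Option<String>,
}
```
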

@mukilan (Member) commented on Dec 17, 2024

  1. Compute (and store) upfront scores for:

The scores for past runs change because we score all past runs against the current day's run. This is similar to what the Interop dashboard does to account for the addition and deletion of tests and subtests. Is the proposal here to have pre-computed scores as an optimization for the case where the tests & subtests in an "area" have not changed between runs, or just to not update the scores of past runs?

the fact that new tests are added over time and we'd need to maintain a consistent mapping across scoring runs.

The MANIFEST.json in the meta folder in the servo repo is a mapping from tests to hashes based on test contents (I think, I might be wrong). Perhaps that can be exploited here? Just an idea.

This one might make sense if it was a cross-project effort at the WPT level?

One thing that I like about the current setup is that it is dead simple to deploy and has no dependencies like a database or server. I'm not against moving to a more complicated setup for Servo if we can gain a lot of new features. But at that point, I'm not sure if the whole thing should be a wpt.fyi 2.0 kind of project.
