Performance regression from 6.0.2 to main #186

Open
kitsuniru opened this issue Jan 1, 2025 · 9 comments

kitsuniru commented Jan 1, 2025

Greetings, I'm experiencing a performance regression in my code after updating from 6.0.2 to the latest main branch.

I measured both versions 20 times each, and the results were consistently worse on main.
There were no changes to my code between the runs; the only change was the petitparser version in the pubspec file (from ^6.0.2 to the git main branch).
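
For completeness, the pubspec change was along these lines (the git URL below is assumed, not copied verbatim from my project):

dependencies:
  petitparser: ^6.0.2

versus

dependencies:
  petitparser:
    git:
      url: https://github.com/petitparser/dart-petitparser.git
      ref: main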

6.0.2:

chunkSize: 21869061
[Isolate 0] Read: Range(0...21869075)
...
[Isolate 15] Read: Range(328036163...349970780)
length: 349904988
8868744
----------------------------

Days              : 0
Hours             : 0
Minutes           : 1
Seconds           : 47     <-----
Milliseconds      : 593
Ticks             : 1075930866
TotalDays         : 0,00124529035416667
TotalHours        : 0,0298869685
TotalMinutes      : 1,79321811
TotalMilliseconds : 107593,0866

main branch (latest commit):

chunkSize: 21869061
[Isolate 0] Read: Range(0...21869075)
...
[Isolate 15] Read: Range(328036163...349970780)
length: 349904988
8868744
----------------------------


Days              : 0
Hours             : 0
Minutes           : 1
Seconds           : 53     <-----
Milliseconds      : 375
Ticks             : 1133753292
TotalDays         : 0,00131221445833333
TotalHours        : 0,031493147
TotalMinutes      : 1,88958882
TotalSeconds      : 113,3753292
TotalMilliseconds : 113375,3292

kitsuniru commented Jan 1, 2025

Measurements were obtained using the following PowerShell script:

Measure-Command { .\parser.exe | Out-Default }

renggli commented Jan 1, 2025

Thank you for highlighting this.

Can you provide the commit hash from the main branch that you used for testing? There were many changes that could affect performance, and I believe some of the more recent ones could improve performance again.

What exactly are you measuring when collecting the benchmark? Could you provide a minimal reproducible example?

kitsuniru commented Jan 1, 2025

For benchmarking, I kept only the file reading and parsing functionality.

The file is split into 16 parts and read in parallel in 16 isolates, with chunks delimited by a separator (a semicolon in our case).

The parser, built from the grammar definition, always returns one object (meaning one object sits in each sub-chunk between semicolons).

There are no delays in reading the file (it is small, around 400 MB); the main time is spent parsing and tokenizing the output.
There is no difference between SSD and HDD; the measurements are the same.

I can give you access to my experimental private repo with the whole code; it is still rough and unstructured for now.
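
Roughly, the setup boils down to this (placeholder grammar and simplified chunking, not the real code):

import 'dart:io';
import 'dart:isolate';
import 'dart:math' as math;

import 'package:petitparser/petitparser.dart';

// Placeholder parser: the real grammar definition is not shown here.
final record = pattern('^;').plus().flatten();

List<String> parseChunk(String chunk) => [
      // One object per semicolon-separated sub-chunk.
      for (final part in chunk.split(';'))
        if (part.isNotEmpty) record.parse(part).value,
    ];

Future<void> main() async {
  const isolateCount = 16;
  final text = await File('data.txt').readAsString();
  final chunkSize = (text.length / isolateCount).ceil();
  final futures = <Future<List<String>>>[];
  for (var i = 0; i < isolateCount; i++) {
    // The real code aligns chunk boundaries to the separator;
    // this sketch just cuts at fixed offsets.
    final start = math.min(i * chunkSize, text.length);
    final end = math.min(start + chunkSize, text.length);
    final chunk = text.substring(start, end);
    // Isolate.run (Dart 2.19+) spawns a short-lived isolate per chunk.
    futures.add(Isolate.run(() => parseChunk(chunk)));
  }
  final results = await Future.wait(futures);
  print(results.fold<int>(0, (sum, list) => sum + list.length));
}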

kitsuniru commented Jan 2, 2025

@renggli

It may sound strange, but it looks like the performance regressed at commit 5e944abe55945cab53b045453e835a610aa8d1d4.

renggli commented Jan 2, 2025

Thank you for the clarification. 5e944ab is an interesting find: in my quick tests I found it to improve speed by around 5% (but I am aware of some older regressions). I will run some of my standard benchmarks now.

renggli commented Jan 2, 2025

I've updated and run my usual benchmarks on the new development branch: https://docs.google.com/spreadsheets/d/15rRvLnmjT3rhGLP1iEYv5QF1JVMZmPksIjmFysDnnz0/edit?usp=sharing

The real problem is that unicode parsing requires duplicating each parser that reads individual characters, and I tried to avoid that in some cases. I am reintroducing the specialised any-parser to see if that at least resolves the regressions in the existing parsers.
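
To illustrate the duplication (plain Dart, not the actual library internals): reading a single UTF-16 code unit is one index lookup, while reading a full Unicode code point first has to check for a surrogate pair, so every character-level parser ends up needing a second, more expensive variant.

// A code-unit reader is a single index lookup.
int readCodeUnit(String input, int position) => input.codeUnitAt(position);

// A unicode-aware reader must detect surrogate pairs and may consume
// two code units to produce one code point.
(int, int) readCodePoint(String input, int position) {
  final first = input.codeUnitAt(position);
  if (first >= 0xD800 && first <= 0xDBFF && position + 1 < input.length) {
    final second = input.codeUnitAt(position + 1);
    if (second >= 0xDC00 && second <= 0xDFFF) {
      final codePoint =
          0x10000 + ((first - 0xD800) << 10) + (second - 0xDC00);
      return (codePoint, position + 2);
    }
  }
  return (first, position + 1);
}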

kitsuniru commented Jan 2, 2025

I tried the latest commit, 517c63b.

517c63b commit:

Seconds           : 38
Milliseconds      : 321
Ticks             : 383218801

6.0.2 version:

Seconds           : 35
Milliseconds      : 399
Ticks             : 353998834

All values are averaged; I ran the measurements several times. Commit 517c63b gave me 37-39 seconds, while 6.0.2 gave 34-36.

renggli commented Jan 3, 2025

The performance regression in parse speed seems to happen with 06cfe90. It is not clear why, because this change only really alters the way the parsers are instantiated. Maybe it prevents some optimizations?

renggli commented Jan 5, 2025

I have spent the last couple of days reverting line by line and still have no clue where the performance regression comes from. For existing parsers that do not use unicode character decoding, the executed code should be exactly the same; only some extra parsers were added. Maybe that prevents some optimisations?

The effect on complicated grammars is small, but clearly measurable:
[Screenshot: benchmark comparison, 2025-01-05]
