Speed up `processReadChunk` by alternating bytes.IndexByte calls #4

JensRantil · 2024-02-25T21:26:29Z

I would not be surprised if you could speed up https://github.com/shraddhaag/1brc/blob/8513d5e70a1bfabbf46ab86a9cb6558bc9805154/main.go#L187C6-L187C22 by alternating calls to `bytes.IndexByte(buf[lastTokenEnd:], ';')` and `bytes.IndexByte(buf[lastTokenEnd:], '\n')`. `bytes.IndexByte` is highly optimized for searching a byte slice, and since this is the inner loop, the change could have a big impact.
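To make the idea concrete, here is a minimal sketch of what I have in mind. It is not the code from the repo: `record` and `splitRecords` are hypothetical names, and it assumes `buf` holds complete `station;temperature` lines separated by `'\n'`.

```go
package main

import (
	"bytes"
	"fmt"
)

// record is a hypothetical holder for one parsed line (not from the repo).
// Both fields are sub-slices of the input buffer, so nothing is copied.
type record struct {
	station []byte
	temp    []byte
}

// splitRecords walks buf by alternating two bytes.IndexByte calls:
// one to find the ';' that ends the station name, and one to find
// the '\n' that ends the temperature.
func splitRecords(buf []byte) []record {
	var out []record
	pos := 0
	for pos < len(buf) {
		semi := bytes.IndexByte(buf[pos:], ';')
		if semi < 0 {
			break // no complete record left
		}
		semi += pos // convert to an absolute index

		end := len(buf) // last line may lack a trailing newline
		if nl := bytes.IndexByte(buf[semi+1:], '\n'); nl >= 0 {
			end = semi + 1 + nl
		}

		out = append(out, record{station: buf[pos:semi], temp: buf[semi+1 : end]})
		pos = end + 1
	}
	return out
}

func main() {
	for _, r := range splitRecords([]byte("Hamburg;12.0\nOslo;-3.4\n")) {
		fmt.Printf("%s -> %s\n", r.station, r.temp)
	}
}
```

Whether this beats the current byte-by-byte loop would need to be benchmarked on the real input.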

You could probably also skip inspecting the first byte of both the station name and the temperature, since each is at least one character long. In my experience, not inspecting bytes at all can be a very efficient way to speed up parsing.

Also, I would convert from `[]byte` to `string` only when necessary, to avoid any kind of Unicode handling.
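One way to keep the conversion to a minimum, sketched with hypothetical names (`stats`, `countStations`): keep station names as `[]byte` while parsing and convert only at the map boundary. As far as I know, the Go compiler avoids allocating for a lookup written as `m[string(b)]`, so a new string is only created the first time a station is inserted.

```go
package main

import "fmt"

// stats is a hypothetical per-station aggregate (not from the repo).
type stats struct {
	count int
}

// countStations tallies how often each station name occurs. Names stay
// []byte until they reach the map; the string conversion in the lookup
// is expected not to allocate, and the insert path runs once per station.
func countStations(names [][]byte) map[string]*stats {
	m := make(map[string]*stats)
	for _, name := range names {
		s, ok := m[string(name)] // lookup only
		if !ok {
			s = &stats{}
			m[string(name)] = s // first sighting: the key is copied here
		}
		s.count++
	}
	return m
}

func main() {
	names := [][]byte{[]byte("Oslo"), []byte("Hamburg"), []byte("Oslo")}
	for name, s := range countStations(names) {
		fmt.Println(name, s.count)
	}
}
```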


JensRantil changed the title ~~Speed up processReadChunk by alternating strings.IndexByte~~ Speed up processReadChunk by alternating bytes.IndexByte calls Feb 25, 2024
