Speed up processReadChunk by alternating bytes.IndexByte calls #4

Open
JensRantil opened this issue Feb 25, 2024 · 0 comments

JensRantil commented Feb 25, 2024

I would not be surprised if you could speed up https://github.com/shraddhaag/1brc/blob/8513d5e70a1bfabbf46ab86a9cb6558bc9805154/main.go#L187C6-L187C22 by alternating between calls to bytes.IndexByte(buf[lastTokenEnd:], ';') and bytes.IndexByte(buf[lastTokenEnd:], '\n'). The bytes.IndexByte function is highly optimized for searching a byte slice, and since this is the inner loop, the change could have a big impact.
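
A minimal sketch of the alternating search, not the repository's actual code. It assumes each record has the form `station;temperature\n`, and the names `record` and `splitRecords` are hypothetical. Fields are kept as sub-slices of the input buffer, so nothing is copied while scanning.

```go
package main

import (
	"bytes"
	"fmt"
)

// record is a hypothetical holder for one parsed line.
// Both fields alias the input buffer; no bytes are copied.
type record struct {
	station []byte
	temp    []byte
}

// splitRecords walks buf by alternating two bytes.IndexByte calls:
// one for the ';' that ends the station name, then one for the '\n'
// that ends the temperature. An incomplete trailing record is ignored.
func splitRecords(buf []byte) []record {
	var out []record
	pos := 0 // plays the role of lastTokenEnd in the issue text
	for pos < len(buf) {
		semi := bytes.IndexByte(buf[pos:], ';')
		if semi < 0 {
			break // no complete record left
		}
		semi += pos // convert to an absolute index

		nl := bytes.IndexByte(buf[semi+1:], '\n')
		if nl < 0 {
			break // temperature is not terminated yet
		}
		nl += semi + 1

		out = append(out, record{station: buf[pos:semi], temp: buf[semi+1 : nl]})
		pos = nl + 1
	}
	return out
}

func main() {
	for _, r := range splitRecords([]byte("Hamburg;12.0\nOslo;-3.4\n")) {
		fmt.Printf("%s -> %s\n", r.station, r.temp)
	}
}
```

Whether this beats a hand-written byte loop depends on the data and the platform, so it is worth benchmarking against the existing implementation.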

You could probably also skip inspecting the first byte of both the station name and the temperature, since each is at least one character long. In my experience, not inspecting bytes at all can be a very effective way to speed up parsing.
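
A sketch of the first-byte skip, assuming well-formed input in which every station name and every temperature is at least one byte long. The names `line` and `splitSkippingFirstByte` are hypothetical. Each search starts one byte later than it otherwise would, because that byte cannot be the delimiter.

```go
package main

import (
	"bytes"
	"fmt"
)

// line is a hypothetical holder for one parsed record.
type line struct {
	station []byte
	temp    []byte
}

// splitSkippingFirstByte assumes non-empty fields, so the first byte of
// each field is never examined when searching for its delimiter.
func splitSkippingFirstByte(buf []byte) []line {
	var out []line
	pos := 0
	for pos < len(buf) {
		// buf[pos] belongs to the station name, so start at pos+1.
		semi := bytes.IndexByte(buf[pos+1:], ';')
		if semi < 0 {
			break
		}
		semi += pos + 1

		// buf[semi+1] belongs to the temperature, so start at semi+2.
		if semi+2 > len(buf) {
			break // nothing after ';', record is incomplete
		}
		nl := bytes.IndexByte(buf[semi+2:], '\n')
		if nl < 0 {
			break
		}
		nl += semi + 2

		out = append(out, line{station: buf[pos:semi], temp: buf[semi+1 : nl]})
		pos = nl + 1
	}
	return out
}

func main() {
	for _, l := range splitSkippingFirstByte([]byte("A;1\nOslo;-3.4\n")) {
		fmt.Printf("%s -> %s\n", l.station, l.temp)
	}
}
```

If the input could contain empty fields, this version would misparse them, so the assumption has to hold for the data set.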

Also, I would convert from []byte to string only when I actually need to, to avoid any kind of Unicode handling.
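
One way to defer the conversion, sketched under assumptions: station names arrive as []byte and are aggregated in a map. The names `stats` and `countStations` are hypothetical. The standard Go compiler can generally perform a map lookup of the form `m[string(b)]` without allocating a string, so a real string is created only when a new station is first inserted.

```go
package main

import "fmt"

// stats is a hypothetical per-station aggregate; only a count is kept here.
type stats struct {
	count int
}

// countStations keeps names as []byte until a new map key is required.
func countStations(names [][]byte) map[string]*stats {
	agg := make(map[string]*stats)
	for _, name := range names {
		// Lookup with an inline conversion: the gc compiler can usually
		// avoid allocating a string for this form.
		s, ok := agg[string(name)]
		if !ok {
			s = &stats{}
			// The conversion copies the bytes only for a new station.
			agg[string(name)] = s
		}
		s.count++
	}
	return agg
}

func main() {
	names := [][]byte{[]byte("Oslo"), []byte("Hamburg"), []byte("Oslo")}
	for name, s := range countStations(names) {
		fmt.Println(name, s.count)
	}
}
```

Storing pointers as map values lets the hot path update an aggregate in place after a single lookup.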

@JensRantil JensRantil changed the title Speed up processReadChunk by alternating strings.IndexByte Speed up processReadChunk by alternating bytes.IndexByte calls Feb 25, 2024