Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of grapheme frequency calculation #133

Merged
merged 10 commits into from
Jun 20, 2024

Conversation

vcfxb
Copy link
Contributor

@vcfxb vcfxb commented Jun 20, 2024

This PR improves the performance of grapheme frequency calculation by replacing several single threaded operations with parallel operations using rayon. Additionally we switch from using unicode_segmentation to finl_unicode to improve performance of grapheme cluster identification and iteration. On my machine, this all results in about a 50% total runtime reduction when checking npm/express.

@vcfxb vcfxb requested review from alilleybrinker and j-lanson June 20, 2024 16:28
@vcfxb vcfxb self-assigned this Jun 20, 2024
@alilleybrinker alilleybrinker merged commit d302a24 into main Jun 20, 2024
9 checks passed
@alilleybrinker
Copy link
Collaborator

Thanks @vcfxb!

continue;
}

// Count the number of graphemes in this patch, add it to the tortal,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*total

let mut total_graphemes: usize = 0;

// Iterate over the file diffs by reference.
for file_diff in &commit_diff.diff.file_diffs {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to ask if it be possible to parallelize over the file iterator as well, but I figured I should figure out the answer myself before suggesting anything. Have a PR up in #134 to demonstrate how this could be done. I don't have/know the benchmarking setup so I haven't tested it locally myself to see what the savings are, if any.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants