feat: use gix for parsing git information in mitre/git plugin #663

Merged on Dec 13, 2024 (1 commit)

@patrickjcasey patrickjcasey commented Nov 26, 2024

This pull request removes the need for a custom nom-based parser in the mitre/git plugin by switching to gix (https://github.com/GitoxideLabs/gitoxide) for parsing git information. As part of this change, I collected data from the following repos: mitre/hipcheck and numpy/numpy.

Testing on https://github.com/torvalds/linux was skipped for time reasons.

I investigated using libgit2 first, but its performance on larger repos was not ideal, and it was difficult to integrate caching into the code. gix (https://github.com/GitoxideLabs/gitoxide) has better support for caching and is written in pure Rust, which is also nice for memory safety reasons.

To measure execution speed, I ran the following command with the release build of hc on my M1 MacBook Pro with 32 GB RAM. The mitre/git plugin was also configured to run its release build:

time ./target/release/hc check --policy config/Hipcheck.kdl <REPO>

For testing main, commit 4f205afd6ba0afd98d0ba91dc742f2038f6ddc89 was used.

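| Branch | Repo           | # of Commits | # of CommitDiffs | # of Contributors | Last Commit          | Execution Speed |
|--------|----------------|--------------|------------------|-------------------|----------------------|-----------------|
| main   | mitre/hipcheck | 529          | 529              | 20                | 2024-12-09T22:58:27Z | 9.19 s          |
| gix    | mitre/hipcheck | 529          | 529              | 20                | 2024-12-09T22:58:27Z | 9.98 s          |
| main   | numpy/numpy    | 28436        | 28436            | 1898              | 2024-12-09T20:49:11Z | 2m 49s          |
| gix    | numpy/numpy    | 37617        | 37617            | 2243              | 2024-12-09T20:49:11Z | 4m 38s          |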

git rev-list --count HEAD results:

  • mitre/hipcheck: 529 is returned, which matches both implementations
  • numpy/numpy: 37617 is returned, which matches the gix implementation

git log --pretty="%an %ae%n%cn %ce" | LC_ALL=C sort -u | wc -l results:

  • mitre/hipcheck: 20 is returned, which matches both implementations
  • numpy/numpy: 2243 is returned, which matches the gix implementation
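
For anyone who wants to reproduce the contributor check without the shell pipeline, here is a minimal Rust sketch of the same deduplication. It assumes the input is the raw text printed by git log --pretty="%an %ae%n%cn %ce" (one author line and one committer line per commit). The function name count_unique_identities is hypothetical and is not part of Hipcheck or gix. Note that it counts distinct "name email" strings, just like sort -u, so the same person with two email addresses is counted twice.

```rust
use std::collections::BTreeSet;

/// Counts distinct lines in the output of
/// `git log --pretty="%an %ae%n%cn %ce"`.
///
/// Each commit contributes two lines (author, then committer). Collecting
/// the lines into an ordered set and taking its size mirrors what
/// `LC_ALL=C sort -u | wc -l` does: `&str` ordering in Rust is bytewise,
/// which matches the C locale.
///
/// Hypothetical helper for illustration; not taken from the plugin.
fn count_unique_identities(log_output: &str) -> usize {
    log_output
        .lines()
        .collect::<BTreeSet<&str>>()
        .len()
}
```

Feeding it the captured output of the git log command for a repository should give the same number as the pipeline above, assuming the output is valid UTF-8.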

The main benefit of this pull request is that git-related information is now parsed by a git library rather than our nom-based parser, which was missing commits and contributors on larger repos. At some point, a pass can be made over the git plugin's performance to find places for optimization.
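
To put a number on "missing commits and contributors", the numpy/numpy rows of the table can be turned into a shortfall fraction. The sketch below is only arithmetic on the figures reported above. The helper name missing_fraction is hypothetical, and it treats the git rev-list and git log counts as the expected values.

```rust
/// Fraction of `expected` items that an implementation failed to report.
///
/// Hypothetical helper for illustration. Returns 0.0 when nothing is
/// expected or when the implementation reported at least as many items
/// as expected.
fn missing_fraction(reported: u64, expected: u64) -> f64 {
    if expected == 0 {
        return 0.0;
    }
    // saturating_sub avoids an underflow panic if reported > expected.
    expected.saturating_sub(reported) as f64 / expected as f64
}
```

With the table's numbers, missing_fraction(28436, 37617) is roughly 0.244 and missing_fraction(1898, 2243) is roughly 0.154, so on numpy/numpy the nom-based parser on main missed about 24% of commits (9181 of 37617) and about 15% of contributors (345 of 2243).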

@patrickjcasey patrickjcasey force-pushed the patrickjcasey/replace-nom-parser-with-libgit2 branch from fa18ac5 to dbb86eb Compare November 26, 2024 19:31
@patrickjcasey patrickjcasey changed the title feat: use libgit2 for parsing git information in mitre/git plugin feat: use gix for parsing git information in mitre/git plugin Dec 10, 2024
@patrickjcasey patrickjcasey force-pushed the patrickjcasey/replace-nom-parser-with-libgit2 branch 4 times, most recently from 3dec9dc to ece0bb6 Compare December 10, 2024 18:48
@patrickjcasey

Closes #575

Review comment on plugins/git/src/git.rs (outdated, resolved)
@patrickjcasey patrickjcasey force-pushed the patrickjcasey/replace-nom-parser-with-libgit2 branch 3 times, most recently from 36f3d7a to 521431e Compare December 13, 2024 20:31
@patrickjcasey patrickjcasey force-pushed the patrickjcasey/replace-nom-parser-with-libgit2 branch from 521431e to c5c049e Compare December 13, 2024 20:33

@j-lanson j-lanson left a comment

Thanks Patrick!

@patrickjcasey patrickjcasey merged commit 99d8278 into main Dec 13, 2024
10 checks passed
@patrickjcasey patrickjcasey deleted the patrickjcasey/replace-nom-parser-with-libgit2 branch December 13, 2024 22:26
Linked issue: explore replacing plugins/git nom parser with libgit2 function calls