-
-
Notifications
You must be signed in to change notification settings - Fork 295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft: explore gix APIs, experiment with gix-blame API #1453
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's great to see this PR and I am looking forward to witnessing it evolve over time :)!
git commit -q --allow-empty -m c5 | ||
git tag at-c5 | ||
git merge branch1 -m m1b1 | ||
echo "line 2" >> file.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd do the same - start extremely simple, maybe do the first rough implementation so it passes this test, and then think of some tougher cases to throw at it, validating that they still come out right.
Maybe it's worth investing into a baseline test which parses the output of git blame
to get the expected output, which then has to be matched by the algorithm. These are typically the second stage as they are more obscure, but make it easier to get correct expectations for a lot of possibly complex inputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the input, sounds like a great plan!
@Byron I think I’ve now gotten to a point where I’ve learned enough about My goal now is to write a test that runs a complete blame for the sample repo that is part of this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, you already dug deep into the most complex APIs gitoxide
has to offer, I am pretty sure it can't get much worse than that. Having it, however, helps to know what data is needed which will help tremendously with defining the API later.
I think continuing with the test to try and complete the algorithm would be good, even though I'd probably start with refactoring the 'setup' portion of the test a little more to be able to focus on the blame algorithm.
But that's details, I think you are on the right track and this PR already does everything that's needed to do a blame.
Once the core algorithm is proven, as most naive and simple version of course, I'd probably create a baseline of some samples with git
itself, parse the result and use it as expectation for the algorithm created here.
I hope that helps.
|
||
use gix_diff::blob::intern::Token; | ||
use gix_hash::ObjectId; | ||
use gix_odb::pack::FindExt; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'd want to use the equally named gix_object::FindExt
, it's easier to use.
Thanks, that’s very helpful already! I think I wanted to mostly make sure I’m still on the right track and not missing anything. I’ll continue working on the test, then, and worry about the public API later. |
Yes, I think it's great to keep up the explorative approach using |
This is not a material change as the loop gets only executed once. It is mostly to keep indentation changes in a separate commit.
@Byron This is a quick status update! The algorithm is now able to generate a blame for the sample repo. I extended the test to compare this blame against |
There’s an exit condition `parent_ids.is_empty()` which is sufficient for the first test case.
Otherwise, diffs with more than one hunk would yield incorrect results.
This required merging the separate setup section into the loop, in order to be able to use `continue` when a diff did not contain changes for the path currently being blamed.
@Byron Sooner than expected, there seems to have been quite a bit of progress! I extended the test repo so that it now also contains commits that don’t touch the file being blamed. I also added a file that contains multiline hunks. There’s now 2 test cases that cover one file each. Also, most of the logic is now inside the |
That's great news!
That's interesting - I just know that Git usually ignores merge-commits, but I don't know if it includes them in blames nonetheless. I never thought about how that would work, and would probably see what Git would do. It's something I do in general, that is, try to understand what Git can do and ask if this would (eventually) be possible with the Besides that, I definitely recommend to keep clippy happy and run it locally at least occasionally - ideally CI can stay green even during early development, and I found it worth it as it typically leads to cleaner code. Often I find myself doing a push&run, just to return one more time to quickly fix CI if it breaks so I can continue 'clean' in the morning :D. |
Add assertions related to offset
The addition might match subsequent unblamed hunks.
The unchanged lines might match subsequent unblamed hunks.
At this point, this is not strictly necessary as the changes to `offset_in_destination` are not read anywhere yet after they are changed in the added branches.
The deletion might match subsequent unblamed hunks.
There might be more changes that are supposed to affect the unblamed hunk. The deletion might, for example, be followed by another deletion or by an addition both of which would have an effect on the offset that needs to be applied to the unblamed hunk.
This is a draft PR that is supposed to help me get familiar with
gix
’s APIs as well as to experiment with possible APIs forgix-blame
. Any feedback that helps me not go down the wrong path, is much appreciated! :-)