-
Notifications
You must be signed in to change notification settings - Fork 772
Help minimize export dumps differences when inspecting --dry-run results #657
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
The change looks reasonable, but the commit message doesn't look right to me.
Yes, we apparently have a signed commit in our history, and rewriting will strip that signature.
git-filter-repo doesn't write tree objects directly[*]; it has git-fast-import do that. git-fast-import states that the order of the filemodify directives given to it does not matter, and in fact I can find in the code where it is doing its own sorting. Additionally, I tried to duplicate what you stated here and have no idea how you generated the above diff. Digging further myself, I can't find any difference. For example: $ mkdir tmp which shows that every single tree in both the original history and the rewritten history match.
Fair enough.
Can you provide more details about the exact commands you are running and the differences you see? I don't see any tree differences.
No it doesn't; the signed commit in the history is stripped so the HEAD commit id and all commits going back to 2019 are different. |
Hi, First, sorry for the confusion, I'm not sure how I got to conclude the repos were identical yesterday, maybe I ended up working on an already rewritten repository at a point, or forgot to drop a Speaking of
So this patch would only help minimize export dumps differences when inspecting --dry-run results. Indeed, the last matching commit is b9c6254 and the next 9cf87ae has a signature, which I had missed. |
Diffing the fast-export dumps with `--dry-run`, it appears git-filter-repo is using Python's default sorting algorithm for Git trees, rather than the Git-specific algorithm, which is described a bit at: https://stackoverflow.com/questions/78981525/how-git-sort-entries-in-trees ``` $ diff -u .git/filter-repo/fast-export.original .git/filter-repo/fast-export.filtered ... @@ -1451,25 +1329,23 @@ D testcases/expected/case1-twenty D testcases/inputs/case1 M 100755 0a13abf13c87a13ae56c1fc664baa8b44bdac1de testcases/t9390-repo-filter.sh +M 100644 de3799fa8cd84a9ac56fad023ddf3da44eb444bb testcases/t9390/case1 M 100644 e0c88454095bd887852791401a3b2336ad47d012 testcases/t9390/case1-filename M 100644 a1aa78ff11f3a40cafdcd9b531e8c1342e8a75e5 testcases/t9390/case1-ten M 100644 488cbd9d101b107809611940d1cd00a8f72e1b17 testcases/t9390/case1-twenty -M 100644 de3799fa8cd84a9ac56fad023ddf3da44eb444bb testcases/t9390/case1 ``` Unlike described above though, it appears the fast-export dumps treats all files as directory when sorting them (`case1` is a file in the example above), so appending "/" to all entries maintains the original tree order. Note: git-filter-repo doesn't write tree objects directly; it has git-fast-import do that. git-fast-import states that the order of the filemodify directives given to it does not matter. So this doesn't change the resulting Git repository.
A no-op test rewrite of the git-filter-repo repo, using
./git-filter-repo --proceed
, unexpectedly results in a different repo.Diffing the fast-export dumps with
--dry-run
, it appears git-filter-repo is using Python's default sorting algorithm for Git trees, rather than the Git-specific algorithm, which is described a bit at: https://stackoverflow.com/questions/78981525/how-git-sort-entries-in-treesUnlike described above though, it appears the fast-export dumps treats all files as directory when sorting them (
case1
is a file in the example above), so appending "/" to all entries maintains the original tree order.This avoids unnecessary tree hash variation and early, cascading commit id changes, which shouldn't happen before a commit is effectively rewritten by git-filter-repo.
With this change,
./git-filter-repo --proceed
is stable and produces an identical repository, with the same HEAD commit id.