-
Notifications
You must be signed in to change notification settings - Fork 558
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
list virtual branches performance #4771
Conversation
@Byron is attempting to deploy a commit to the GitButler Team on Vercel. A member of the Team first needs to authorize it. |
This can be useful for those who seek extra information, and while it's dominating refresh times.
It calls `list_virtual_branches()` which kind of has the effect of a status call.
There we don't necessarily need every last bit of optimization like one would want in release profiles, but build performance is of the essence for developer productivity.
The worktree diff is essentially a 'git status' with a given tree, and that can be expressed with a diff of the workdir, the index and a tree. This comes at the expense of not being able to control which file sizes are diffed.
```` GITBUTLER_PERFORMANCE_LOG=1 LOG_LEVEL=debug pnpm tauri dev ```` Additionally, some noise was removed by turning a warning into a trace.
…nches. That way, the head is static which allows to re-use the worktree status across multiple related calls. Also, add a way to quickly get the `head_commit()` from a `git2::Repository`. Note that this may cause breakage in the app, as now it will see the actual head, not a computed one, but that should also help us to find places that didn't properly update their state.
Also make sure that important `gix` crates are always compiled with optimization for proper performance. This is particularly useful when optimizing performance as not everything has to be done in release mode. However, it also causes CI to be slower as it compiles some dependencies longer. The problem realy is that `tauri dev` doesn't support custom profiles, so there is only `dev` and `release`.
It's used for emission of changed files, but also for listing branch information. Make sure we only run the worktree-status once to save time. This also unifies the acquisition of the `HEAD^{commit}`, which one day will be a natural feature once `gix::Repository` is used.
c44d090
to
c86300b
Compare
This was lost previously when switching it over to a read-only implementation. Implementing it with an ignore list will take time, 400ms in the GitLab repository, but it's not slower than it was before and it is always preferred to not dump objects into the ODB unnecessarily.
That way, the application will not run into unexpected cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exciting about cleaning up that part of the code. lets pay close attention for any issues on the nightly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few more notes on this merged PR, @Caleb-T-Owens and @krlvi, just for your information.
) -> Result<DiffByPathMap> { | ||
let repository = context.repository(); | ||
let head_commit = repository.head_commit()?; | ||
gitbutler_diff::workdir(repository, &head_commit.id()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the only place where we drive gitbutler_diff::workdir()
with the (possibly stale) HEAD
commit, and I do wonder if that's always correct.
However, I also believe that the HEAD
integration commit should not be able to go stale in the first place, and would rather want to see that fixed.
It's something that isn't attempted here just yet though.
/// This should be used to update the `gitbutler/workspace` ref with, which is usually | ||
/// done from [`update_gitbutler_integration()`], after any of its input changes. | ||
/// This is namely the conflicting state, or any head of the virtual branches. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the list of dependencies that I see are used here in the hopes that is raises awareness in functions that alter these inputs.
In theory, these should adjust the integration commit to match, so eventually all read-only code can rely on HEAD
to be valid.
An early PR to capture the research and first steps towards making
list_virtual_branches
faster by introducinggix
.Refactoring might happen here as well.
Tasks
list_virtual_branches
for CLIlist_virtual_branches
by profiling: gitbutler_diff:workdir()diff::workdir()
be implemented without writing objects (but also, without skipping files probably?) ingit2
?diff::workdir()
call when responding to filesystem events (reuse the data instead) - makes it 800ms fasterdiff::workdir()
get_workspace_head()
is required?Notes for the Reviewer
I think I compared debug with release or something. This PR is merely equally fast for responses to filesystem changes, but will write much less to the ODB in the process as it avoidsrecalculate_everything()
is about twice as fast in this PR as it is on Master.index.add_all(*)
.diff::workdir()
is read-only, but a little slower as it now has to find big files to skip first (in order to be mergeable). Overall it's still an improvement, but wantsgix
(in follow-up)Follow-Ups
apply_branch
speedupsgix
for diff tree hunks #5899gix
forget_worktree_head()
- merge-treesmerge-tree
in more places #5416merge_base_octopus()
#5490gix
forcreate_wd_tree
gix-status
forcreate_wd_tree()
#5889Shortcomings of current implementation
index::add(*)
Performance Results
Everything was taken from the GitLab repository, with a
--release
build.workspace big files
Before
731ms with
git2
.After
389ms for traversing 78k files, a 1.88x improvement. It's faster as it won't have to do an actual status, it just finds untracked files. Of course, ideally one day there is no need to exclude untracked large files as untracked files as virtual branches won't 'soak up' the worktree like they do now.
duplicate workdir call
Before
After
Final Result
On
master
, unfortunately, this takes about 1,9s as well, which is largely due toworkdir
now actually being faster there. The overhead of doing theignore_large_files_in_diffs()
call is significant, and it eats up all wins.However, this is the basis
gix
to come in and do a status much faster, which is when everything will be consistently faster once again.It's notable though that technically,
diff::workdir
is now 25% slower at least, and all that so it won't add everything to the worktree.And I am quite aware that there are many locations that call
create_wd_tree()
which does exactly that, but I suppose it has to start somewhere.Performance
list_virual_branches()
on the GitLab repo takes 1.16s, of which 802ms are spent on the workdir diff.worktree::add(*)
is most time-consuming, but it's the dirwalk that really takes time.