Skip to content

Commit

Permalink
Merge pull request #1279: Fix performance regression in LibGit2RepoIn…
Browse files Browse the repository at this point in the history
…voker

Today, we noticed a performance regression from 1.0.19074.2 to 1.0.19130.1 (M150).

In that release, we changed the way we allocate and refresh LibGit2Repo objects. In order to allow packfile deletions, we need to drop the handles that LibGit2 keeps open during the duration of the LibGit2Repo. The change was to share one repo object and to dispose of it after 15 minutes of inactivity. However, that doesn't take into account the startup time required to initialize the data. This can be a lot worse if the user has thousands of pack-files.

To fix this, change the model. With this change, the PackfileMaintenanceStep will request that the LibGit2Repos are closed before performing the expire command, and then will reinitialize the LibGit2 repo right after. This will keep the "warmup" time as part of the initial mount or as the background operation.

If the read-object hook or a file hydration comes in while the expire command is running, a LibGit2 repo will be initialized and it _may_ cause some .idx files to stick around until the next run. This was already the case, but then the warmup would happen during those operations. This will happen at most once a day.

> 1. How did you find this was the issue?

@jrbriggs gets all the credit for realizing that the performance difference between versions only happened in calls where one used the read-object hook. If the read-object hook wasn't called, then there was no difference in the performance numbers. But there was a significant slowdown if the read-object hook was called, and could even be several seconds for a single object.

From that, I started [spelunking the diff](v1.0.19074.1...v1.0.19130.1) and looked specifically at what changed around the logic in `InProcessMount` for downloading an object. There were two candidates:

i. The interaction with the Git credential manager had a lot of changes. I checked with @jamill about his expectations of performance numbers there, even if it is misbehaving. He and @jeschu1 thought there was no reason for more than 100ms of delay in any of that code, and other side-effects would be very visible if it required more server connections. This led me to focus hard on item (ii).

ii. The `LibGit2RepoInvoker` was updated in this release to no longer create a large pool of repos at mount time and only dispose of them at process end. We had shipped this independently of the packfile maintenance for this very reason of being safe, but I had anticipated any problems would be in the multi-threaded access of a single libgit2 repo. The multi-threaded acceess has not been a problem, as far as I can tell.

> 2. How did you test the performance of this change?

To measure this, I made a change like in [this commit](derrickstolee@5bf0acd) to use a `Stopwatch` around the two operations. This would easily point to which of the two options above were likely the cause. After installing that version locally and running `git grep foo` to trigger loose object downloads, I saw the timing for the first read-object hook had *2.5 seconds* in the `TryGetIsBlob()` method, and all subsequent calls were ~2 milliseconds. This was on my local repo where I had inflated the pack dir to ~800 packs. If a user has 7,000+ then libgit2 will take even longer to prep the packfile data structures.

This performance testing also led me to double-check this PR and notice that the first run was still slow. The trick was to add the commit "LibGit2RepoInvoker: Test a real SHA1 when initializing" because using an empty SHA-1 seems to short-cut somewhere in the logic and not actually load the object store data in the LibGit2Repo. With that change, I see all of my `TryGetIsBlob()` calls taking 1-2 milliseconds again, even the first one after a `PackfileMaintenanceStep`.

Resolves #1277
  • Loading branch information
derrickstolee authored Jun 17, 2019
2 parents 3ea4393 + d1054d3 commit 139600c
Show file tree
Hide file tree
Showing 6 changed files with 126 additions and 91 deletions.
12 changes: 10 additions & 2 deletions GVFS/GVFS.Common/Git/GitRepo.cs
Original file line number Diff line number Diff line change
Expand Up @@ -44,14 +44,22 @@ private enum LooseBlobState
Unknown,
}

public bool HasActiveLibGit2Repo => this.libgit2RepoInvoker?.IsActive == true;

public GVFSLock GVFSLock
{
get;
private set;
}

public void CloseActiveRepo()
{
this.libgit2RepoInvoker?.DisposeSharedRepo();
}

public void OpenRepo()
{
this.libgit2RepoInvoker?.InitializeSharedRepo();
}

public bool TryGetIsBlob(string sha, out bool isBlob)
{
return this.libgit2RepoInvoker.TryInvoke(repo => repo.IsBlob(sha), out isBlob);
Expand Down
65 changes: 25 additions & 40 deletions GVFS/GVFS.Common/Git/LibGit2RepoInvoker.cs
Original file line number Diff line number Diff line change
Expand Up @@ -6,49 +6,27 @@ namespace GVFS.Common.Git
{
public class LibGit2RepoInvoker : IDisposable
{
private static readonly TimeSpan DefaultRepositoryDisposalPeriod = TimeSpan.FromMinutes(15);

private readonly Func<LibGit2Repo> createRepo;
private readonly ITracer tracer;
private readonly object sharedRepoLock = new object();
private readonly TimeSpan sharedRepositoryDisposalPeriod;
private volatile bool disposing;
private volatile int activeCallers;
private LibGit2Repo sharedRepo;
private Timer sharedRepoDisposalTimer;

public LibGit2RepoInvoker(ITracer tracer, Func<LibGit2Repo> createRepo, TimeSpan? disposalPeriod = null)
public LibGit2RepoInvoker(ITracer tracer, Func<LibGit2Repo> createRepo)
{
this.tracer = tracer;
this.createRepo = createRepo;

if (!disposalPeriod.HasValue || disposalPeriod.Value <= TimeSpan.Zero)
{
this.sharedRepositoryDisposalPeriod = DefaultRepositoryDisposalPeriod;
}
else
{
this.sharedRepositoryDisposalPeriod = disposalPeriod.Value;
}

this.sharedRepoDisposalTimer = new Timer(
(state) => this.DisposeSharedRepo(),
state: null,
dueTime: this.sharedRepositoryDisposalPeriod,
period: this.sharedRepositoryDisposalPeriod);
this.InitializeSharedRepo();
}

public bool IsActive => this.sharedRepo != null;

public void Dispose()
{
this.disposing = true;

lock (this.sharedRepoLock)
{
this.sharedRepoDisposalTimer?.Dispose();
this.sharedRepoDisposalTimer = null;

this.sharedRepo?.Dispose();
this.sharedRepo = null;
}
Expand Down Expand Up @@ -81,37 +59,44 @@ public bool TryInvoke<TResult>(Func<LibGit2Repo, TResult> function, out TResult
}
}

private LibGit2Repo GetSharedRepo()
public void DisposeSharedRepo()
{
lock (this.sharedRepoLock)
{
if (this.disposing)
{
return null;
}

this.sharedRepoDisposalTimer?.Change(this.sharedRepositoryDisposalPeriod, this.sharedRepositoryDisposalPeriod);

if (this.sharedRepo == null)
if (this.disposing || this.activeCallers > 0)
{
this.sharedRepo = this.createRepo();
return;
}

return this.sharedRepo;
this.sharedRepo?.Dispose();
this.sharedRepo = null;
}
}

private void DisposeSharedRepo()
public void InitializeSharedRepo()
{
// Run a test on the shared repo to ensure the object store
// is loaded, as that is what takes a long time with many packs.
// Using a potentially-real object id is important, as the empty
// SHA will stop early instead of loading the object store.
this.GetSharedRepo()?.ObjectExists("30380be3963a75e4a34e10726795d644659e1129");
}

private LibGit2Repo GetSharedRepo()
{
lock (this.sharedRepoLock)
{
if (this.disposing || this.activeCallers > 0)
if (this.disposing)
{
return;
return null;
}

this.sharedRepo?.Dispose();
this.sharedRepo = null;
if (this.sharedRepo == null)
{
this.sharedRepo = this.createRepo();
}

return this.sharedRepo;
}
}
}
Expand Down
23 changes: 11 additions & 12 deletions GVFS/GVFS.Common/Maintenance/PackfileMaintenanceStep.cs
Original file line number Diff line number Diff line change
Expand Up @@ -128,18 +128,6 @@ protected override void PerformMaintenance()
activity.RelatedWarning($"Skipping {nameof(PackfileMaintenanceStep)} due to git pids {string.Join(",", processIds)}", Keywords.Telemetry);
return;
}

// If a LibGit2Repo is active, then it may hold handles to the .idx and .pack files we want
// to delete during the 'git multi-pack-index expire' step. If one starts during the step,
// then it can still block those deletions, but we will clean them up in the next run. By
// checking HasActiveLibGit2Repo here, we ensure that we do not run twice with the same
// LibGit2Repo active across two calls. A "new" repo should not hold handles to .idx files
// that do not have corresponding .pack files, so we will clean them up in CleanStaleIdxFiles().
if (this.Context.Repository.HasActiveLibGit2Repo)
{
activity.RelatedWarning($"Skipping {nameof(PackfileMaintenanceStep)} due to active libgit2 repo", Keywords.Telemetry);
return;
}
}

this.GetPackFilesInfo(out int beforeCount, out long beforeSize, out bool hasKeep);
Expand All @@ -150,7 +138,18 @@ protected override void PerformMaintenance()
return;
}

// If a LibGit2Repo is active, then it may hold handles to the .idx and .pack files we want
// to delete during the 'git multi-pack-index expire' step. If one starts during the step,
// then it can still block those deletions, but we will clean them up in the next run. By
// running CloseActiveRepos() here, we ensure that we do not run twice with the same
// LibGit2Repo active across two calls. A "new" repo should not hold handles to .idx files
// that do not have corresponding .pack files, so we will clean them up in CleanStaleIdxFiles().
this.Context.Repository.CloseActiveRepo();

GitProcess.Result expireResult = this.RunGitCommand((process) => process.MultiPackIndexExpire(this.Context.Enlistment.GitObjectsRoot), nameof(GitProcess.MultiPackIndexExpire));

this.Context.Repository.OpenRepo();

List<string> staleIdxFiles = this.CleanStaleIdxFiles(out int numDeletionBlocked);
this.GetPackFilesInfo(out int expireCount, out long expireSize, out hasKeep);

Expand Down
7 changes: 7 additions & 0 deletions GVFS/GVFS.Common/Maintenance/PrefetchStep.cs
Original file line number Diff line number Diff line change
Expand Up @@ -229,6 +229,11 @@ private bool TryGetMaxGoodPrefetchTimestamp(out long maxGoodTimestamp, out strin
metadata.Add(TracingConstants.MessageKey.InfoMessage, $"{nameof(this.TryGetMaxGoodPrefetchTimestamp)} deleting bad idx file");
this.Context.Tracer.RelatedEvent(EventLevel.Informational, $"{nameof(this.TryGetMaxGoodPrefetchTimestamp)}_DeleteBadIdx", metadata);

// We need to close the LibGit2 repo data in order to delete .idx files.
// Close inside the loop to only close if necessary, reopen outside the loop
// to minimize initializations.
this.Context.Repository.CloseActiveRepo();

if (!this.Context.FileSystem.TryWaitForDelete(this.Context.Tracer, idxPath, IoFailureRetryDelayMS, MaxDeleteRetries, RetryLoggingThreshold))
{
error = $"Unable to delete {idxPath}";
Expand All @@ -246,6 +251,8 @@ private bool TryGetMaxGoodPrefetchTimestamp(out long maxGoodTimestamp, out strin
return false;
}
}

this.Context.Repository.OpenRepo();
}

error = null;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -118,6 +118,8 @@ public void PrefetchCleansUpBadPrefetchPack()
[TestCase, Order(4)]
public void PrefetchCleansUpOldPrefetchPack()
{
this.Enlistment.UnmountGVFS();

string[] prefetchPacks = this.ReadPrefetchPackFileNames();
long oldestPackTimestamp = this.GetOldestPackTimestamp(prefetchPacks);

Expand Down Expand Up @@ -147,6 +149,8 @@ public void PrefetchCleansUpOldPrefetchPack()
[TestCase, Order(5)]
public void PrefetchFailsWhenItCannotRemoveABadPrefetchPack()
{
this.Enlistment.UnmountGVFS();

string[] prefetchPacks = this.ReadPrefetchPackFileNames();
long mostRecentPackTimestamp = this.GetMostRecentPackTimestamp(prefetchPacks);

Expand Down Expand Up @@ -178,6 +182,8 @@ public void PrefetchFailsWhenItCannotRemoveABadPrefetchPack()
[TestCase, Order(6)]
public void PrefetchFailsWhenItCannotRemoveAPrefetchPackNewerThanBadPrefetchPack()
{
this.Enlistment.UnmountGVFS();

string[] prefetchPacks = this.ReadPrefetchPackFileNames();
long oldestPackTimestamp = this.GetOldestPackTimestamp(prefetchPacks);

Expand Down Expand Up @@ -210,6 +216,8 @@ public void PrefetchFailsWhenItCannotRemoveAPrefetchPackNewerThanBadPrefetchPack
[TestCase, Order(7)]
public void PrefetchFailsWhenItCannotRemoveAPrefetchIdxNewerThanBadPrefetchPack()
{
this.Enlistment.UnmountGVFS();

string[] prefetchPacks = this.ReadPrefetchPackFileNames();
long oldestPackTimestamp = this.GetOldestPackTimestamp(prefetchPacks);

Expand Down Expand Up @@ -246,6 +254,8 @@ public void PrefetchFailsWhenItCannotRemoveAPrefetchIdxNewerThanBadPrefetchPack(
[TestCase, Order(8)]
public void PrefetchCleansUpStaleTempPrefetchPacks()
{
this.Enlistment.UnmountGVFS();

// Create stale packs and idxs in the temp folder
string stalePackContents = "StalePack";
string stalePackPath = Path.Combine(this.TempPackRoot, $"{PrefetchPackPrefix}-123456-{Guid.NewGuid().ToString("N")}.pack");
Expand Down Expand Up @@ -296,18 +306,24 @@ public void PrefetchCleansUpOphanedLockFiles()

this.fileSystem.CreateEmptyFile(graphLockPath);

// Unmount so we can delete the files.
this.Enlistment.UnmountGVFS();

// Force deleting the prefetch packs to make the prefetch non-trivial.
this.fileSystem.DeleteDirectory(this.PackRoot);
this.fileSystem.CreateDirectory(this.PackRoot);
this.fileSystem.CreateEmptyFile(midxLockPath);

// Re-mount so the post-fetch job runs
this.Enlistment.MountGVFS();

this.Enlistment.Prefetch("--commits");
this.PostFetchJobShouldComplete();

this.fileSystem.FileExists(graphLockPath).ShouldBeFalse();
this.fileSystem.FileExists(midxLockPath).ShouldBeFalse();
this.fileSystem.FileExists(graphPath).ShouldBeTrue();
this.fileSystem.FileExists(midxPath).ShouldBeTrue();
this.fileSystem.FileExists(graphLockPath).ShouldBeFalse(nameof(graphLockPath));
this.fileSystem.FileExists(midxLockPath).ShouldBeFalse(nameof(midxLockPath));
this.fileSystem.FileExists(graphPath).ShouldBeTrue(nameof(graphPath));
this.fileSystem.FileExists(midxPath).ShouldBeTrue(nameof(midxPath));
}

private void PackShouldHaveIdxFile(string pathPath)
Expand Down
Loading

0 comments on commit 139600c

Please sign in to comment.