VAULT-31907: Entity loading speedup #29326

Open
wants to merge 4 commits into
base: main
from

Conversation

miagilepner
Contributor

Description

This PR makes three performance improvements to the loadEntities function:

  1. Batch memDB operations into groups of 1024. Previously, we created a new transaction per entity.
  2. Cache the bucket reads from the local alias storage packer and reuse them when loading entities.
  3. Deserialize the entities concurrently.
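
To illustrate the idea behind optimization 2, here is a minimal sketch of memoizing bucket reads so that each bucket is fetched from storage at most once. It is not the code in this PR: `bucketCache`, `newBucketCache`, and the `read` callback are hypothetical names, and the sketch assumes a single goroutine does the loading.

```go
package main

import "fmt"

// bucketCache memoizes reads from a slow bucket store.
// All names here are hypothetical; this is not Vault's storagepacker API.
type bucketCache struct {
	read    func(key string) ([]string, error) // underlying storage read
	buckets map[string][]string                // bucket key -> cached contents
	reads   int                                // number of real storage reads
}

func newBucketCache(read func(string) ([]string, error)) *bucketCache {
	return &bucketCache{read: read, buckets: make(map[string][]string)}
}

// get returns the bucket contents, hitting storage only on the
// first successful request for a given key.
func (c *bucketCache) get(key string) ([]string, error) {
	if items, ok := c.buckets[key]; ok {
		return items, nil
	}
	items, err := c.read(key)
	if err != nil {
		return nil, err
	}
	c.reads++
	c.buckets[key] = items
	return items, nil
}

func main() {
	cache := newBucketCache(func(key string) ([]string, error) {
		return []string{key + "/alias-1"}, nil
	})
	// Several entities can map to the same bucket; only new keys cause a read.
	for _, key := range []string{"b0", "b1", "b0", "b0"} {
		if _, err := cache.get(key); err != nil {
			panic(err)
		}
	}
	fmt.Println("storage reads:", cache.reads)
}
```

The map is not safe for concurrent use; a real loader that shares the cache across goroutines would need a mutex or would have to populate it up front.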

I ran benchmarks using:

  • 200k total entities
  • Each entity had 1 local alias and 1 non-local alias

I arrived at 1024 by batching entities together across buckets (this branch does not implement that, but I used it for benchmarking so that I wouldn't need to create millions of entities). The size of a batch is the number of entities plus the number of aliases on those entities. (Note: this differs from what I posted on Slack earlier today, where I was incorrectly counting only the number of entities per batch.)
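
The batch counting described above could be sketched roughly as follows. This is an illustration only, not the benchmark code: the `entity` type, `weight`, and `batchEntities` are hypothetical, and the sketch assumes an entity that is larger than the limit on its own simply gets a batch to itself.

```go
package main

import "fmt"

// entity is a hypothetical stand-in for an identity entity.
type entity struct {
	ID      string
	Aliases []string
}

// weight counts the entity itself plus each of its aliases.
func weight(e entity) int { return 1 + len(e.Aliases) }

// batchEntities groups entities in order so that the summed weight of each
// batch stays at or below maxSize. An entity heavier than maxSize is placed
// in a batch of its own.
func batchEntities(entities []entity, maxSize int) [][]entity {
	var batches [][]entity
	var current []entity
	size := 0
	for _, e := range entities {
		w := weight(e)
		if len(current) > 0 && size+w > maxSize {
			batches = append(batches, current)
			current = nil
			size = 0
		}
		current = append(current, e)
		size += w
	}
	if len(current) > 0 {
		batches = append(batches, current)
	}
	return batches
}

func main() {
	var entities []entity
	for i := 0; i < 10; i++ {
		entities = append(entities, entity{
			ID:      fmt.Sprintf("entity-%d", i),
			Aliases: []string{"local", "non-local"},
		})
	}
	// Each entity weighs 3, so a limit of 6 holds two entities per batch.
	fmt.Println("batches:", len(batchEntities(entities, 6)))
}
```

With one local and one non-local alias per entity, as in the benchmark setup, each entity contributes 3 to the batch size, so counting only entities would understate the real transaction size by a factor of three.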

Benchmark_loadEntities/tx_size_2048-10         	       5	22934412175 ns/op
Benchmark_loadEntities/tx_size_1024-10         	       5	6821009883 ns/op
Benchmark_loadEntities/tx_size_512-10          	       5	6938667308 ns/op
Benchmark_loadEntities/tx_size_256-10          	       5	6962929416 ns/op
Benchmark_loadEntities/tx_size_1-10            	       5	10874873183 ns/op

I also benchmarked the other two improvements: cache_aliases_{true|false} toggles optimization 2 (caching the bucket reads), and unmarshal_parallel_{true|false} toggles optimization 3 (concurrent deserialization).
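
For context on what the unmarshal_parallel toggle measures, here is a minimal sketch of decoding a slice of serialized items with a fixed pool of workers. It is not the PR's implementation: `unmarshalParallel` and the `entity` type are hypothetical, and `encoding/json` stands in for whatever serialization format the real entities use.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// entity is a hypothetical stand-in for a stored identity entity.
type entity struct {
	ID string `json:"id"`
}

// unmarshalParallel decodes every item in raw using the given number of
// workers. Results keep the input order because each worker writes only to
// the slot matching the index it received.
func unmarshalParallel(raw [][]byte, workers int) ([]entity, error) {
	if workers < 1 {
		workers = 1
	}
	out := make([]entity, len(raw))
	errs := make([]error, len(raw))
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				errs[i] = json.Unmarshal(raw[i], &out[i])
			}
		}()
	}
	for i := range raw {
		jobs <- i
	}
	close(jobs)
	wg.Wait()

	// Report the first decoding error, if any.
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return out, nil
}

func main() {
	raw := [][]byte{[]byte(`{"id":"a"}`), []byte(`{"id":"b"}`), []byte(`{"id":"c"}`)}
	entities, err := unmarshalParallel(raw, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(entities), entities[0].ID, entities[2].ID)
}
```

Only the decoding is parallel in this sketch; the decoded values are still handed back in order, so the subsequent database writes can stay sequential.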

The results of the benchmarks are:

Benchmark_loadEntities/tx_size_1024_cache_aliases_true_unmarshal_parallel_true_-10         	       3	5570892945 ns/op
Benchmark_loadEntities/tx_size_1024_cache_aliases_true_unmarshal_parallel_false_-10        	       3	6047840347 ns/op
Benchmark_loadEntities/tx_size_1024_cache_aliases_false_unmarshal_parallel_true_-10        	       3	5858072153 ns/op
Benchmark_loadEntities/tx_size_1024_cache_aliases_false_unmarshal_parallel_false_-10       	       3	6072485403 ns/op
Benchmark_loadEntities/tx_size_1_cache_aliases_true_unmarshal_parallel_true_-10            	       3	9174082680 ns/op
Benchmark_loadEntities/tx_size_1_cache_aliases_true_unmarshal_parallel_false_-10           	       3	9550873292 ns/op
Benchmark_loadEntities/tx_size_1_cache_aliases_false_unmarshal_parallel_true_-10           	       3	9403697055 ns/op
Benchmark_loadEntities/tx_size_1_cache_aliases_false_unmarshal_parallel_false_-10          	       3	9821321805 ns/op

TODO only if you're a HashiCorp employee

  • Backport Labels: If this fix needs to be backported, use the appropriate backport/ label that matches the desired release branch. Note that in the CE repo, the latest release branch will look like backport/x.x.x, but older release branches will be backport/ent/x.x.x+ent.
    • LTS: If this fixes a critical security vulnerability or severity 1 bug, it will also need to be backported to the current LTS versions of Vault. To ensure this, use all available enterprise labels.
  • ENT Breakage: If this PR either 1) removes a public function OR 2) changes the signature
    of a public function, even if that change is in a CE file, double check that
    applying the patch for this PR to the ENT repo and running tests doesn't
    break any tests. Sometimes ENT only tests rely on public functions in CE
    files.
  • Jira: If this change has an associated Jira, it's referenced either
    in the PR description, commit message, or branch name.
  • RFC: If this change has an associated RFC, please link it in the description.
  • ENT PR: If this change has an associated ENT PR, please link it in the
    description. Also, make sure the changelog is in this PR, not in your ENT PR.

@miagilepner miagilepner added this to the 1.19.0-rc milestone Jan 9, 2025
@github-actions github-actions bot added the hashicorp-contributed-pr label (If the PR is HashiCorp (i.e. not-community) contributed) Jan 9, 2025

github-actions bot commented Jan 9, 2025

CI Results:
All Go tests succeeded! ✅

@miagilepner miagilepner force-pushed the miagilepner/VAULT-31907-identity-loading-speedup branch from b45d134 to 83f2147 Compare January 9, 2025 16:43
@miagilepner miagilepner marked this pull request as ready for review January 9, 2025 16:43
@miagilepner miagilepner requested a review from a team as a code owner January 9, 2025 16:43

github-actions bot commented Jan 9, 2025

Build Results:
All builds succeeded! ✅

@@ -0,0 +1,3 @@
```release-note:improvement
core/identity: Improve performance of loading entities when unsealing by batching updates, caching local alias storage reads, and doing more work in parallel.
```
Collaborator

Great clarity in the changelog entry

Contributor

Agreed!

kubawi
kubawi previously approved these changes Jan 10, 2025
Contributor

@kubawi kubawi left a comment

LGTM!

@@ -497,22 +509,28 @@ LOOP:
 	}
 }

-localAliases, err := i.parseLocalAliases(entity.ID)
+err = i.loadLocalAliasesForEntity(ctx, entity, localAliasBuckets)
 if err != nil {
Contributor

@mpalmi mpalmi Jan 14, 2025

🤔 do we need to abort the transaction if we encounter an error? (here and elsewhere)
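
One common way to handle this concern, sketched here with a hypothetical in-memory `txn` type rather than the real memDB transaction, is to defer the abort right after the transaction starts and make the abort a no-op once the commit has happened. Every early return on error then rolls back the batch without needing an explicit abort at each call site. `loadBatch`, `Insert`, and the field names are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// txn is a hypothetical write transaction used only for this sketch.
type txn struct {
	writes    []string
	committed bool
	aborted   bool
}

// Insert stages a write and rejects empty IDs.
func (t *txn) Insert(id string) error {
	if id == "" {
		return errors.New("empty id")
	}
	t.writes = append(t.writes, id)
	return nil
}

// Commit finalizes the staged writes unless the transaction was aborted.
func (t *txn) Commit() {
	if !t.aborted {
		t.committed = true
	}
}

// Abort discards staged writes. It does nothing after Commit or a prior Abort.
func (t *txn) Abort() {
	if t.committed || t.aborted {
		return
	}
	t.aborted = true
	t.writes = nil
}

// loadBatch inserts all ids in one transaction. The deferred Abort rolls the
// batch back on any early return and is harmless after a successful Commit.
func loadBatch(t *txn, ids []string) error {
	defer t.Abort()
	for _, id := range ids {
		if err := t.Insert(id); err != nil {
			return fmt.Errorf("insert %q: %w", id, err)
		}
	}
	t.Commit()
	return nil
}

func main() {
	good := &txn{}
	fmt.Println(loadBatch(good, []string{"a", "b"}), good.committed)

	bad := &txn{}
	fmt.Println(loadBatch(bad, []string{"a", ""}), bad.aborted)
}
```

With batched transactions, a single bad entity aborts the whole batch in this pattern, so whether to abort or to skip the entity and continue is a design decision in its own right.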

@miagilepner miagilepner force-pushed the miagilepner/VAULT-31907-identity-loading-speedup branch from 83f2147 to 73e900e Compare January 15, 2025 16:09
Labels
hashicorp-contributed-pr If the PR is HashiCorp (i.e. not-community) contributed

4 participants