Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: store legacy claims in cache #57

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

volmedo
Copy link
Member

@volmedo volmedo commented Dec 12, 2024

resolves #55

@volmedo volmedo requested a review from alanshaw December 12, 2024 15:55
@volmedo volmedo self-assigned this Dec 12, 2024
Copy link

codecov bot commented Dec 12, 2024

legacyClaims := cfg.legacyClaims
if legacyClaims == nil {
var legacyClaims providerindex.LegacyClaimsFinder
if cfg.legacyClaimsMapper != nil && cfg.legacyClaimsBucket != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also check the URL is non-nil?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the URL is less critical as the service can be built without it ("" might even be a valid value?) so I didn't bother to add it to the check. I can add it if you think it's important though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it needs a URL with "{claim}" in it otherwise the query processor will error...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an explicit check to fail fast if the URL doesn't contain the placeholder. I felt that silently returning a NotFoundLegacyClaimsFinder if cfg.legacyClaimsUrl == "" could lead to errors difficult to find.

@@ -89,6 +98,10 @@ func (ls LegacyClaimsStore) Find(ctx context.Context, contentHash multihash.Mult
}

results = append(results, pr)

if err := ls.claimsCache.Set(ctx, claimCid, claim, true); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't it stay in the cache forever if we always store after retrieve?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since the claims service will later fetch from the cache, I thought refreshing its expiration would be a good idea. IIUC the entry will be there only as long as it is being hit, but will be removed if nobody asks for it in the expiration time. Does that sound correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds reasonable...but the point of caching it is to ensure it's super fast on subsequent reads. Here we'd be enforcing a write on read for every request, at some number of requests it's going to be slower on average than letting it fall out of the cache and revalidate. Also if it is removed from the underlying store, and keeps getting refreshed then it will never expire.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! That reasoning makes total sense. I updated the code so that claims are not written to the cache if they were fetched from there.

@hannahhoward
Copy link
Member

@alanshaw @volmedo while caching from the lookup is a good optimization, I do not believe it's a full strategy.

You've got a potential fail path here if:

  1. User looks up legacy content claim first time, record cached in provider index cache + content claims cache
  2. More requests happen, expiring claim from content claims cache but not provider index cache.
  3. User looks up legacy content claim again, pulls provider results from cache without caching the content claim, content claim cannot be looked up cause content claims cache is empty.

Generally, I think the better approach here is to generate wrappable interfaces. I will work on a refactor suggestion.

@hannahhoward
Copy link
Member

suggested refactor #80

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

store legacy claims in the claims cache
3 participants