Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Fetcher client function more flexible #278

Open
cmurphy opened this issue Oct 17, 2024 · 2 comments
Open

Make Fetcher client function more flexible #278

cmurphy opened this issue Oct 17, 2024 · 2 comments
Assignees

Comments

@cmurphy
Copy link
Contributor

cmurphy commented Oct 17, 2024

In integrating Tessera into the Rekor log personality, we would like to have the Rekor server be able to return an inclusion proof to a client. In Tessera's model, this effectively makes Rekor its own client, since it needs to query its internal storage to fetch the tiles needed for the proof.

Tessera's client.NewProofBuilder constructor requires a function of type Fetcher as input, which takes a path string as a parameter and returns a tile. This makes sense when the tile is stored in a posix-like file storage organized by directory paths. It makes less sense when using MySQL as the storage backend, and is awkward to use when the server has direct access to the MySQL storage provider instance.

Consider the tile fetcher constructor:

return func(ctx context.Context, level, index uint64) (*api.HashTile, error) {
p := layout.TilePath(level, index, logSize)
t, err := f(ctx, p)
if err != nil {
if !errors.Is(err, os.ErrNotExist) {
return nil, fmt.Errorf("failed to read tile at %q: %w", p, err)
}
return nil, err
}
var tile api.HashTile
if err := tile.UnmarshalText(t); err != nil {
return nil, fmt.Errorf("failed to parse tile: %w", err)
}
return &tile, nil

It takes the log size, tile level, and entry index to construct the tile path and then calls the fetcher function on that path to retrieve the tile from storage.

When using the MySQL storage instance directly, the path has no meaning. So to construct a proof builder, we have to deconstruct the path, reversing the work of the newTileFetcher:

tileOnlyFetcher := func(ctx context.Context, path string) ([]byte, error) {
        pathParts := strings.SplitN(path, "/", 3)
        level, index, width, err := layout.ParseTileLevelIndexWidth(pathParts[1], pathParts[2])
        if err != nil {
                return nil, err
        }
        return tesseraStorage.ReadTile(ctx, level, index, width)
}
proofBuilder, err := client.NewProofBuilder(ctx, checkpoint, tileOnlyFetcher)

This works, it's just a bit strange and roundabout. It would be nice to be able to construct a proof builder that can work natively with any of the storage providers, or at least doesn't assume that the tile is indexed by its path.

@AlCutter
Copy link
Collaborator

Hi @cmurphy,

Thanks for taking the time to write this up!

I think what we'll do near term is have each storage impl also provide an implementation of client.Fetcher for use-cases like this, where the application layer needs to read log data directly. It can do that in whatever is the most idiomatic fashion for the infra.

I understand your point about using the string containing the resource path being used here, and another way to think about it is that this string is essentially the Tessera public read API. The MySQL storage is unusual/unique in that it is the one storage infra which doesn't directly expose a filesystem-like view of that API based on resource path.

What you're suggesting is adding a second semi-public/private read API for application developers because one of the storage types is a bit unusual - all others absolutely do use that resource path as their primary key equivalent. In that framing, it does feel reasonable to me that the burden of complying with that API should fall on the MySQL storage impl.

Of course, we could think about having the client package define more tightly scoped funcs which need to be passed in to the various methods/c'tors there (e.g. going from just client.Fetcher to having client.CheckpointFetcherFn, client.TileFetcherFn, client.EntryBundleFetcherFn, etc.) - that would enable those individual storage fetcher implementations to be defined in terms of a different read API to the public one, but I wonder whether that is worth an increase in "apparent complexity" for client package users in general (I don't really know at the moment).

I did hack up a quick experiment of what such a change might look like here: https://github.com/transparency-dev/trillian-tessera/compare/main...AlCutter:trillian-tessera:client_more_specific_fetch_interfaces_playground?expand=1

Separately, though, I think there would be value in taking a step back to think about what an "ideal" client package API might look like, both from Tessera log application developers and developers of clients of those log applications.

@AlCutter
Copy link
Collaborator

AlCutter commented Nov 7, 2024

Hi again,

Thanks for your patience, I'm back after dealing with summit stuff and taking some vacation.

I've been playing around with this quite a bit recently, trying a few things out to see how they feel - I'm warming up to your idea and I've merged #281, and that plus #292 adds add a more fine-grained set of read primitives to the storage contract and updates the client code so that it can directly use them.

How do those changes look from your PoV?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants