Make `Fetcher` client function more flexible #278

cmurphy · 2024-10-17T20:51:23Z

In integrating Tessera into the Rekor log personality, we would like to have the Rekor server be able to return an inclusion proof to a client. In Tessera's model, this effectively makes Rekor its own client, since it needs to query its internal storage to fetch the tiles needed for the proof.

Tessera's client.NewProofBuilder constructor requires a function of type Fetcher as input, which takes a path string as a parameter and returns a tile. This makes sense when the tile is stored in a posix-like file storage organized by directory paths. It makes less sense when using MySQL as the storage backend, and is awkward to use when the server has direct access to the MySQL storage provider instance.

Consider the tile fetcher constructor:

trillian-tessera/client/client.go

Lines 272 to 286 in 0c3daf4

    
           return func(ctx context.Context, level, index uint64) (*api.HashTile, error) { 
        
           	p := layout.TilePath(level, index, logSize) 
        
           	t, err := f(ctx, p) 
        
           	if err != nil { 
        
           		if !errors.Is(err, os.ErrNotExist) { 
        
           			return nil, fmt.Errorf("failed to read tile at %q: %w", p, err) 
        
           		} 
        
           		return nil, err 
        
           	} 
        
           	var tile api.HashTile 
        
           	if err := tile.UnmarshalText(t); err != nil { 
        
           		return nil, fmt.Errorf("failed to parse tile: %w", err) 
        
           	} 
        
           	return &tile, nil

It takes the log size, tile level, and entry index to construct the tile path and then calls the fetcher function on that path to retrieve the tile from storage.

When using the MySQL storage instance directly, the path has no meaning. So to construct a proof builder, we have to deconstruct the path, reversing the work of the newTileFetcher:

tileOnlyFetcher := func(ctx context.Context, path string) ([]byte, error) {
        pathParts := strings.SplitN(path, "/", 3)
        level, index, width, err := layout.ParseTileLevelIndexWidth(pathParts[1], pathParts[2])
        if err != nil {
                return nil, err
        }
        return tesseraStorage.ReadTile(ctx, level, index, width)
}
proofBuilder, err := client.NewProofBuilder(ctx, checkpoint, tileOnlyFetcher)

This works, it's just a bit strange and roundabout. It would be nice to be able to construct a proof builder that can work natively with any of the storage providers, or at least doesn't assume that the tile is indexed by its path.

The text was updated successfully, but these errors were encountered:

AlCutter · 2024-10-25T12:19:32Z

Hi @cmurphy,

Thanks for taking the time to write this up!

I think what we'll do near term is have each storage impl also provide an implementation of client.Fetcher for use-cases like this, where the application layer needs to read log data directly. It can do that in whatever is the most idiomatic fashion for the infra.

I understand your point about using the string containing the resource path being used here, and another way to think about it is that this string is essentially the Tessera public read API. The MySQL storage is unusual/unique in that it is the one storage infra which doesn't directly expose a filesystem-like view of that API based on resource path.

What you're suggesting is adding a second semi-public/private read API for application developers because one of the storage types is a bit unusual - all others absolutely do use that resource path as their primary key equivalent. In that framing, it does feel reasonable to me that the burden of complying with that API should fall on the MySQL storage impl.

Of course, we could think about having the client package define more tightly scoped funcs which need to be passed in to the various methods/c'tors there (e.g. going from just client.Fetcher to having client.CheckpointFetcherFn, client.TileFetcherFn, client.EntryBundleFetcherFn, etc.) - that would enable those individual storage fetcher implementations to be defined in terms of a different read API to the public one, but I wonder whether that is worth an increase in "apparent complexity" for client package users in general (I don't really know at the moment).

I did hack up a quick experiment of what such a change might look like here: https://github.com/transparency-dev/trillian-tessera/compare/main...AlCutter:trillian-tessera:client_more_specific_fetch_interfaces_playground?expand=1

Separately, though, I think there would be value in taking a step back to think about what an "ideal" client package API might look like, both from Tessera log application developers and developers of clients of those log applications.

AlCutter · 2024-11-07T11:20:26Z

Hi again,

Thanks for your patience, I'm back after dealing with summit stuff and taking some vacation.

I've been playing around with this quite a bit recently, trying a few things out to see how they feel - I'm warming up to your idea and I've merged #281, and that plus #292 adds add a more fine-grained set of read primitives to the storage contract and updates the client code so that it can directly use them.

How do those changes look from your PoV?

AlCutter mentioned this issue Oct 25, 2024

Storage impls have a common read API #281

Merged

AlCutter self-assigned this Nov 7, 2024

AlCutter mentioned this issue Nov 8, 2024

Add tighter-scoped function types in /client #292

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make `Fetcher` client function more flexible #278

Make `Fetcher` client function more flexible #278

cmurphy commented Oct 17, 2024 •

edited

Loading

AlCutter commented Oct 25, 2024

AlCutter commented Nov 7, 2024

Make Fetcher client function more flexible #278

Make Fetcher client function more flexible #278

Comments

cmurphy commented Oct 17, 2024 • edited Loading

AlCutter commented Oct 25, 2024

AlCutter commented Nov 7, 2024

Make `Fetcher` client function more flexible #278

Make `Fetcher` client function more flexible #278

cmurphy commented Oct 17, 2024 •

edited

Loading