-
Notifications
You must be signed in to change notification settings - Fork 484
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
catalogd metas
https endpoint proposal
#1749
base: master
Are you sure you want to change the base?
Conversation
Skipping CI for Draft Pull Request. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Signed-off-by: Jordan Keister <[email protected]>
c263327
to
27cfe0c
Compare
redhat-marketplace-index: 9MB | ||
--> | ||
|
||
Requiring clients to retrieve the full catalog can also result in a 4-10 second delay _per catalog_ even under optimal network conditions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the surface, 4-10s
seems like a suspiciously long time to process 21MB of data. Are we talking about in-cluster clients here or clients outside the cluster that may have slower connections?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, does this include client processing of the data, or simply clients downloading raw bytes as quickly as possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that timeframe the POC performs:
- download of the full FBC
- decomposition of JSONL objects to JSON objects (which I've been told is a pain)
- loading essential objects to the content delivery framework to fulfill catalog package listing
- rendering the catalog listing in the web UI
I'd have to get breakdown details from @TheRealJon to be more specific. For now, I've backed off the language so it doesn't read as specific condemnation of download speeds.
|
||
## Proposal | ||
|
||
This proposal introduces an additional HTTPS endpoint to an existing catalogd API. The existing HTTPS "all" endpoint will remain as a default option; the user will be able to enable this new capability via a feature gate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we talk about how we plan to deprecate "all" once the new endpoint is GA?
} | ||
``` | ||
Query parameters will be logically ANDed and used to restrict response scope. | ||
This API will be conditionally enabled by an upstream `APIV1QueryEndpoint` feature gate as part of a downstream `NewOLM{suffix}` style OCP TP feature gate, and will be disabled by default. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This API will be conditionally enabled by an upstream `APIV1QueryEndpoint` feature gate as part of a downstream `NewOLM{suffix}` style OCP TP feature gate, and will be disabled by default. | |
This API will be conditionally enabled by an upstream `APIV1QueryEndpoint` feature gate as part of a downstream `NewOLM{suffix}` style OCP feature gate. |
We can talk more about TP vs GA in the graduation criteria section.
|
||
This option would require clients to query the entirety of the data (~21 MB for operatorhubio catalog) and parse the response to retrieve relevant information every time the client needs the data. Even if clients’ implement some form of caching, the first query the client does to catalogd server is still the dealbreaker. In a highly resource constrained environment (e.g. clusters in Edge devices), this basically translates to a chokepoint for the clients to get started. | ||
|
||
- A “path hierarchy” based construction of API endpoints to expose filtered FBC metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are worried about the fact that query endpoint responses will almost always be incomplete, a middle ground might be an endpoint that returns all of the FBC metadata for a specific package, but I'm not sure that endpoint would provide the necessary latency requirements we're shooting for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comes down to whether we require the new endpoint to always provide valid FBC.
When we start revising FBC schemas I think we're going to have to juggle this.
For example, if we revise olm.package.v2
which uses its package
field self-referentially, then we also get package-scoped valid FBC without a change to this endpoint.
But how does a client request this? Does it have to request the v2 schema specifically?
b478376
to
f1d014c
Compare
metas
https endpoint proposal
1e97427
to
41f501b
Compare
Signed-off-by: Jordan Keister <[email protected]>
41f501b
to
49b6735
Compare
@grokspawn: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
||
### Non-Goals | ||
|
||
* Redesigning FBC schema to facilitate additional efficiencies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am very curious what @spadgett and @TheRealJon think about whether this enhancement, on its own (i.e. without further FBC schema evolution), will be a significant enough improvement for Console's use cases, that it is reasonable to keep schema evolution out-of-scope.
I ask because we may need to include the schema evoluation in scope in order to reference it in a graduation criteria for taking the combined OLM and Console changes for OLMv1 support to GA.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A fair take. All of this originated from our feature RFC, and it was concerned solely with what our team was going to implement in this iteration.
This doesn't seem aligned with this process' scope maybe, and we can adjust as makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's hard to know for sure whether this will make significant improvements to the network latency issues. We still require almost all of the FBC data in order to populate the catalog view as designed. We hope that making more strategic asynchronous requests will help. The data model issues still remain.
For example, response for `metas?schema=olm.package&name=foo` and `metas?schema=olm.package&name=bar` | ||
represent disjoint sets, and the server would have to either ignore `If-Modified-Since` directives | ||
in client requests or rely on their veracity uncritically. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought I commented this somewhere, so if this is a repeat comment, I apologize:
RFC 7234, Section 2 states that the cache key is the http method and the full URI, which would include the URL parameters used in the request.
If clients request metas?schema=olm.package&name=foo
and get back a LastModified
response header, they are not supposed to use that value in a subsequent If-Modified-Since
request header when requesting metas?schema=olm.package&name=bar
, because that is a different URI.
redhat-marketplace-index: 9MB | ||
--> | ||
|
||
Requiring OCP console to make the full catalog available to users incurs a 4-10 second delay _per catalog_ even |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, it's more like a 4-10 second delay total, given the size of the 4 default Red Hat catalogs
Requiring OCP console to make the full catalog available to users incurs a 4-10 second delay _per catalog_ even | |
Requiring OCP console to make the full catalog available to users incurs a 4-10 second delay even |
|
||
### Non-Goals | ||
|
||
* Redesigning FBC schema to facilitate additional efficiencies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's hard to know for sure whether this will make significant improvements to the network latency issues. We still require almost all of the FBC data in order to populate the catalog view as designed. We hope that making more strategic asynchronous requests will help. The data model issues still remain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @grokspawn. Really appreciate all that you guys are doing to help the console team.
Name string | ||
} | ||
``` | ||
Query parameters will be logically ANDed and used to restrict response scope. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An example of the API usage would help here.
Checking my understanding: This API will allow clients to limit the number of items returned, but there's no way to limit what properties are returned for each item, correct? You get the full item representation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@spadgett that is correct. You essentially get the list of fbc blobs that match the query criteria, in their entirety.
For example, response for `metas?schema=olm.package&name=foo` and `metas?schema=olm.package&name=bar` | ||
represent disjoint sets, and the server would have to either ignore `If-Modified-Since` directives | ||
in client requests or rely on their veracity uncritically. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, @grokspawn. I'm a little unclear what this is saying since earlier the enhancement says cache control headers are supported. Last-Modified
and If-Modified-Since
should work, correct? Is this about keeping a cache on the server for performance?
As @joelanford points out, the browser cache keys on the full URL, including query parameters. Each response for different query parameters is considered different.
The existing `all` endpoint also incentivizes clients to conserve resources via local cache to avoid making | ||
many (potentially duplicate) requests. However, the OCP console proof of concept | ||
required what was deemed an unsupportable amount of code and complexity to cache, decompose, and render the | ||
complete FBC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On duplicate requests: If using cache control headers, this is presumably a quick check if the catalog has been modified and a 304 response. If the catalog has changed, we would want to refresh the cache anyway, so the extra request is actually desirable?
In practice, I think console will either be fetching the entire catalog or just one item. Generally, users work with extensions they've already installed much more often than they'd install a new extension, so we'll usually be getting one item. It's a steep cost to download everything for a single item, even if we only do it once per session. And depending how often the catalog updates, it could be many times per session if we refresh the cache.
I think this is more about performance than code complexity (although complexity is a consideration as well).
Name string | ||
} | ||
``` | ||
Query parameters will be logically ANDed and used to restrict response scope. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@spadgett that is correct. You essentially get the list of fbc blobs that match the query criteria, in their entirety.
|
||
### Caching Considerations | ||
|
||
This proposal indends to provide RFC7234 caching compliance through support for `Last-Modified` and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Link to RFC7234 here would be helpful.
|
||
This proposal indends to provide RFC7234 caching compliance through support for `Last-Modified` and | ||
`If-Modified-Since` directives. Clients can use these headers to avoid re-downloading unchanged data. | ||
The server will respond with 304 Not Changed if the catalog metadata is unchanged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: It's 304 statusNotModified
--> | ||
> 1. If a query comes in with `/api/v1/metas?package=foo`, should we include the blob with schema: `olm.package` and name: `foo`? | ||
|
||
We feel that it is incorrect for the metas service endpoint to mutate the data model (specifically, to create a synthetic package attribute for the `olm.package` schema). To access all the data modeled for an installable package, separate queries need to be made for the package-level metadata (`schema=olm.package&name=foo`) versus the channel/bundle-level metadata (`package=foo`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section could probably use a little more explanation as to why this is even a question, ie include the context that olm.package
objects do not contain packageName
|
||
#### Completeness | ||
The previous `all` endpoint always returns valid FBC. The new service cannot make that promise, | ||
so clients could make incorrect assumptions about the suitability of results. See Open Questions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does a collection of valid FBC blobs not constitute valid FBC
?
No description provided.