REST: Add Support for Custom Operations Builders in RESTCatalog #14465
base: main
| * @param endpoints the set of supported REST endpoints | ||
| * @return a new RESTViewOperations instance | ||
| */ | ||
| default RESTViewOperations createViewOperations( |
How is FileIO handled for view operations?
IIUC, FileIO is not required for view operations, because Iceberg views are logical objects that contain only metadata (SQL definitions, schemas, and versions) and do not read or write any physical files.
When a user runs a query against a view, the query engine expands the view's SQL definition, compiles it into a query plan, and resolves the underlying tables. At that point, the engine loads the actual table objects (which include TableOperations and FileIO) to read the physical data files.
cc: @flyrain @stevenzwu @huaxingao Could you pls take a look when you get a chance? Thanks! 🙏
flyrain
left a comment
Thanks @XJDKC for the change. Left some comments.
| public RESTCatalog(Function<Map<String, String>, RESTClient> clientBuilder) { | ||
| this(SessionCatalog.SessionContext.createEmpty(), clientBuilder); | ||
| this(SessionCatalog.SessionContext.createEmpty(), clientBuilder, null, null); |
| this(SessionCatalog.SessionContext.createEmpty(), clientBuilder, null, null); | |
| this(clientBuilder, null, null); |
or
| this(SessionCatalog.SessionContext.createEmpty(), clientBuilder, null, null); | |
| this(SessionCatalog.SessionContext.createEmpty(), clientBuilder); |
We might go with the second one so that no change is needed here.
Makes sense, let me revise it.
| public RESTCatalog( | ||
| Function<Map<String, String>, RESTClient> clientBuilder, | ||
| BiFunction<SessionCatalog.SessionContext, Map<String, String>, FileIO> ioBuilder, | ||
| RESTOperationsBuilder operationsBuilder) { | ||
| this(SessionCatalog.SessionContext.createEmpty(), clientBuilder, ioBuilder, operationsBuilder); | ||
| } |
Is this method necessary if we go with this(SessionCatalog.SessionContext.createEmpty(), clientBuilder); in line 68?
RESTSessionCatalog allows us to pass in the ioBuilder, but RESTCatalog doesn't, so I added it to the RESTCatalog constructor as well.
| * RESTSessionCatalog catalog = new RESTSessionCatalog(clientBuilder, ioBuilder, customBuilder); | ||
| * </pre> | ||
| */ | ||
| public interface RESTOperationsBuilder { |
It’s more of a factory than a builder. The interface doesn’t progressively build or configure objects. It just creates them directly. The intent and usage align more closely with a Factory or Provider pattern. Should we rename it to xxxFactory?
Makes sense, let me revise it!
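To make the naming point above concrete, here is a minimal sketch contrasting the two patterns. Every type in it (DemoOperations, DemoOperationsFactory, DemoOperationsBuilder) is a hypothetical stand-in invented for illustration, not a class from this PR or from Iceberg.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an operations object; not an Iceberg class.
final class DemoOperations {
  final String path;
  final Map<String, String> headers;

  DemoOperations(String path, Map<String, String> headers) {
    this.path = path;
    this.headers = Map.copyOf(headers);
  }
}

// Factory style: a single call takes every input and returns a finished object.
interface DemoOperationsFactory {
  DemoOperations create(String path, Map<String, String> headers);
}

// Builder style: inputs accumulate across calls, then build() assembles the object.
final class DemoOperationsBuilder {
  private String path;
  private final Map<String, String> headers = new HashMap<>();

  DemoOperationsBuilder path(String value) {
    this.path = value;
    return this;
  }

  DemoOperationsBuilder header(String name, String value) {
    headers.put(name, value);
    return this;
  }

  DemoOperations build() {
    return new DemoOperations(path, headers);
  }
}

final class FactoryVsBuilderDemo {
  public static void main(String[] args) {
    // The constructor reference satisfies the single-method factory interface.
    DemoOperationsFactory factory = DemoOperations::new;
    DemoOperations fromFactory = factory.create("v1/tables/t", Map.of("X-Demo", "1"));

    DemoOperations fromBuilder =
        new DemoOperationsBuilder().path("v1/tables/t").header("X-Demo", "1").build();

    // Both routes produce equivalent objects; only the calling style differs.
    System.out.println(fromFactory.headers.equals(fromBuilder.headers));
  }
}
```

The interface under review has the first shape (one call in, one object out), which is why a Factory suffix describes it more accurately than Builder.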
| import org.apache.iceberg.util.LocationUtil; | ||
|
|
||
| class RESTTableOperations implements TableOperations { | ||
| public class RESTTableOperations implements TableOperations { |
Do we need this scope change?
If users only want to make small adjustments to RESTTableOperations (for example, injecting a custom header), they can simply provide a custom implementation that extends RESTTableOperations, without having to copy the entire class.
This makes it much easier for them to upgrade to newer Iceberg SDK versions without dealing with merge conflicts or duplicated code.
I'm okay with either approach here, don't have a strong preference. WDYT?
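For illustration, the kind of small adjustment described above could look like the sketch below. It assumes the base class exposes an overridable hook for request headers. BaseTableOps, its headers() method, and TracingTableOps are hypothetical stand-ins and do not reflect the actual members of RESTTableOperations.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a public, non-final operations class.
// This is NOT Iceberg's RESTTableOperations; the headers() hook is an assumption.
class BaseTableOps {
  private final Map<String, String> baseHeaders;

  BaseTableOps(Map<String, String> baseHeaders) {
    this.baseHeaders = Map.copyOf(baseHeaders);
  }

  // Extension point: subclasses may override to adjust the headers sent with a request.
  protected Map<String, String> headers() {
    return baseHeaders;
  }

  // Stands in for a REST call; returns the headers that would be sent.
  Map<String, String> refresh() {
    return headers();
  }
}

// A small customization: add one header and inherit everything else unchanged.
class TracingTableOps extends BaseTableOps {
  TracingTableOps(Map<String, String> baseHeaders) {
    super(baseHeaders);
  }

  @Override
  protected Map<String, String> headers() {
    Map<String, String> merged = new HashMap<>(super.headers());
    merged.put("X-Trace-Id", "demo-123");
    return merged;
  }
}
```

With a package-private base class, the same customization would require copying the whole class into user code, which is the upgrade burden mentioned above.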
| import org.apache.iceberg.view.ViewOperations; | ||
|
|
||
| class RESTViewOperations implements ViewOperations { | ||
| public class RESTViewOperations implements ViewOperations { |
Do we need this scope change?
Same as above
Currently, RESTCatalog allows users to replace components such as RESTClient, FileIO, AuthManager, and MetricsReporter. However, one dependent component that remains non-injectable is RESTTableOperations.
This PR adds support for injecting custom implementations of table and view operations in RESTCatalog, enabling users to extend and customize REST catalog behavior more easily. It doesn't change any existing functionality.
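As a rough sketch of the injection pattern described above, the example below uses an interface whose default method returns the stock implementation, and a catalog that falls back to that default when no custom factory is supplied. All names here (SketchTableOps, SketchOperationsFactory, AuditedFactory, SketchCatalog) are hypothetical and simplified; the signatures in the PR take different arguments.

```java
import java.util.Set;

// Hypothetical stand-in for a table operations class; not an Iceberg type.
class SketchTableOps {
  protected final String path;
  protected final Set<String> endpoints;

  SketchTableOps(String path, Set<String> endpoints) {
    this.path = path;
    this.endpoints = endpoints;
  }

  String describe() {
    return "default:" + path;
  }
}

// Hypothetical factory: the default method supplies the stock implementation,
// so implementers override only what they want to change.
interface SketchOperationsFactory {
  SketchOperationsFactory DEFAULT = new SketchOperationsFactory() {};

  default SketchTableOps createTableOperations(String tablePath, Set<String> endpoints) {
    return new SketchTableOps(tablePath, endpoints);
  }
}

// A custom factory that returns a specialized operations object.
class AuditedFactory implements SketchOperationsFactory {
  @Override
  public SketchTableOps createTableOperations(String tablePath, Set<String> endpoints) {
    return new SketchTableOps(tablePath, endpoints) {
      @Override
      String describe() {
        return "audited:" + this.path;
      }
    };
  }
}

// Hypothetical catalog: a null factory means "use the default behavior".
final class SketchCatalog {
  private final SketchOperationsFactory factory;

  SketchCatalog() {
    this(null);
  }

  SketchCatalog(SketchOperationsFactory factory) {
    this.factory = factory != null ? factory : SketchOperationsFactory.DEFAULT;
  }

  SketchTableOps newTableOps(String tablePath) {
    return factory.createTableOperations(tablePath, Set.of("GET /v1/tables"));
  }
}
```

In this sketch the no-argument constructor keeps the existing behavior, and callers opt in to custom operations only by passing a factory explicitly.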