Adjust the "table_exists" behavior in the REST Catalog #1018
Comments
Thanks for looking into this. The behavior you described above seems like the correct one to me.
I don't follow this since the expected value should be Reference |
@ndrluis @kevinjqliu
Does it make sense for a method suffixed with "exists" to never return false? As I mentioned, and as in the Java implementation, a 404 status should indicate false. However, as @kevinjqliu pointed out, we might also receive a 404 when the HEAD method is not implemented. I believe we should expect the specification to be implemented as defined; if it is not, that is a bug on the catalog side and not on ours. Therefore, I believe we can follow the Java implementation in returning false when a 404 status is encountered, but we should keep using the HEAD call because the Catalog spec expects it to exist. @kevinjqliu @TiansuYu @Fokko @sungwy @HonahX What are your thoughts?
Hi @TiansuYu and @ndrluis - thank you for bringing up this point, and sorry for not getting around to looking at this earlier. Similar to what @TiansuYu suggested, I'm of the opinion that we should do the following:
The reason is that numerous factors can cause a REST endpoint to return a response other than 204 or 404, and it would be erroneous for us to return false on any other status code. If users rely on this endpoint to say affirmatively whether the table exists, and the REST catalog returns a 5xx error for an unknown reason, interpreting that as False is error-prone.
It looks like @ndrluis raised this issue and has assigned it to himself. Maybe we could help review his code :)
Thanks everyone for bringing this up. Long term, I think we should leverage the endpoint discovery to see if the server provides the capability of table-exists: https://lists.apache.org/thread/8h86382omdx9cmvc15m2bf361p5rz4rk Since the table-exists is in there from the beginning, I think we should follow the spec, and assume that:
The HEAD operation does not require the REST Catalog to fetch the metadata, which consumes fewer resources on the server and probably also returns much faster for the client. I think we need to update the Java implementation to use the HEAD request; this will also uncover the issue on the Java side. For validation, hopefully we can check this properly using apache/iceberg#10908 once that is in.
I checked the Java side, and I could not find the HEAD request. To raise visibility, I created an issue there: apache/iceberg#10993
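To illustrate the HEAD-based check, here is a minimal sketch that only builds the request and does not send it. The helper name `build_table_exists_request` and the example base URL are hypothetical, and the path layout assumes the REST spec's `/v1/namespaces/{namespace}/tables/{table}` route with no catalog prefix and a single-level namespace.

```python
from urllib.parse import quote
from urllib.request import Request


def build_table_exists_request(base_url: str, namespace: str, table: str) -> Request:
    """Build (but do not send) a HEAD request for a table-exists check.

    Hypothetical helper: the path layout is an assumption based on the
    REST Catalog spec, without a prefix or multi-level namespace.
    """
    # Percent-encode each path segment so special characters stay inside it.
    path = f"/v1/namespaces/{quote(namespace, safe='')}/tables/{quote(table, safe='')}"
    # HEAD asks only for status and headers, so no table metadata is returned.
    return Request(base_url.rstrip("/") + path, method="HEAD")


request = build_table_exists_request("https://catalog.example.com", "db", "events")
print(request.get_method(), request.full_url)
```

Because the response to a HEAD request carries no body, the client decides purely on the status code.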
Apache Iceberg version
0.7.0 (latest release)
Please describe the bug 🐞
Currently, we return True when the status code is 200 or 204, and False for all other status codes. According to the REST specification, we should return False when the catalog returns a 404 status code and raise an error for other status codes.
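A minimal sketch of the proposed mapping, not the actual PyIceberg code: the function name `table_exists_from_status` and the exception `UnexpectedStatusError` are hypothetical, and the status code is passed in directly rather than read from a real HEAD response.

```python
class UnexpectedStatusError(Exception):
    """Hypothetical error for status codes that say nothing about existence."""


def table_exists_from_status(status_code: int) -> bool:
    """Map a table-exists response status to a boolean, or raise."""
    if status_code in (200, 204):
        # The catalog confirmed that the table exists.
        return True
    if status_code == 404:
        # The catalog confirmed that the table does not exist.
        return False
    # Anything else (401, 403, 5xx, ...) is not an answer about existence,
    # so surface it instead of silently reporting False.
    raise UnexpectedStatusError(f"Unexpected status code: {status_code}")
```

With this mapping, a 5xx response from the catalog surfaces as an exception instead of being reported as a missing table.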
In the Java implementation, a try/catch block is used with load_table, and when it catches a NoSuchTableError, it currently returns False.
I believe that this behavior suppresses errors and looks like a bug, as reported in #1006.
@kevinjqliu I noticed that you are looking into the issue on the Polaris side, and I believe that we need to adjust this on our side.