@nastra nastra commented Oct 30, 2025

The REST spec currently uses %1F as the UTF-8 encoded namespace separator for multi-part namespaces.
This causes issues, since it's a control character and the Servlet spec can reject such characters.

This PR makes the hard-coded namespace separator configurable by letting servers optionally advertise a namespace separator to use instead of %1F. The configuration is entirely optional for REST server implementers, and there is no behavioral change for existing installations.

The actual implementation for this can be seen at #10877

  For backward compatibility, empty string is treated as absent for now.
- If parent is a multipart namespace, the parts must be separated by the unit separator (`0x1F`) byte.
+ If parent is a multipart namespace, the parts must be separated by the namespace separator as
+ indicated via the /config override `namespace-separator`, which defaults to the unit separator (`0x1F`) byte.
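
For illustration, here is a minimal Python sketch of how a client might build the `parent` value from a multipart namespace, once with the default unit separator and once with a separator advertised by the server. The function name `encode_namespace`, the sample namespace `accounting` / `tax`, and the advertised separator `.` are hypothetical and not taken from the PR; the exact wire format of the `namespace-separator` override is also an assumption here.

```python
from urllib.parse import quote

# Default separator named in the spec text: the unit separator byte (0x1F).
UNIT_SEPARATOR = "\x1f"


def encode_namespace(levels, separator=UNIT_SEPARATOR):
    """Join namespace levels with the separator and percent-encode the result.

    `separator` stands in for the value a server could advertise through the
    `namespace-separator` override (hypothetical handling, for illustration).
    """
    return quote(separator.join(levels), safe="")


# Legacy behavior: the control character shows up as %1F in the URL.
legacy = encode_namespace(["accounting", "tax"])

# Hypothetical server-advertised separator ".".
configured = encode_namespace(["accounting", "tax"], separator=".")
```

With the default separator the value is `accounting%1Ftax`, which is the form that strict servlet containers can reject; with the hypothetical `.` separator it is `accounting.tax`.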
Contributor

How about we name this config `nested-namespace-separator`?

Contributor Author

@nastra nastra Oct 30, 2025

I don't feel strongly about the naming and either is fine, so let's see what others think it should be named.

Could you provide an example inside the comment?

- If parent is a multipart namespace, the parts must be separated by the unit separator (`0x1F`) byte.
+ If parent is a multipart namespace, the parts must be separated by the namespace separator as
+ indicated via the /config override `namespace-separator`, which defaults to the unit separator (`0x1F`) byte.
+ To be compatible with older clients, servers must use both the advertised separator and `0x1F` as valid separators when decoding namespaces.
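
The decoding rule above could look roughly like the following Python sketch, in which the server splits on both the advertised separator and the legacy `0x1F`. The function name `decode_namespace` and the advertised separator `.` are hypothetical. The sketch also simplifies by percent-decoding before splitting, so a level name that itself contains either separator character would be split as well.

```python
import re
from urllib.parse import unquote

# Legacy separator that older clients always send (0x1F).
LEGACY_SEPARATOR = "\x1f"


def decode_namespace(encoded, advertised_separator=LEGACY_SEPARATOR):
    """Split an encoded namespace on the advertised OR the legacy separator."""
    decoded = unquote(encoded)
    # Accept both separators so that old and new clients keep working.
    separators = sorted({advertised_separator, LEGACY_SEPARATOR})
    pattern = "|".join(re.escape(sep) for sep in separators)
    return re.split(pattern, decoded)


# A newer client uses the (hypothetical) advertised separator ".".
from_new_client = decode_namespace("accounting.tax", advertised_separator=".")

# An older client still sends %1F, and the same server must accept it.
from_old_client = decode_namespace("accounting%1Ftax", advertised_separator=".")
```

Both calls yield `["accounting", "tax"]`, which is the compatibility property the added sentence asks servers to guarantee.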
Contributor

[doubt] I see. Is there then no way that 0x1F can reliably be treated as a non-separator, even when the client supports the advertised separator and is actually using it?
Or maybe, if servers want to support that, they could read the X-Iceberg-Version header and check whether the client supports this?

Contributor Author

Sorry, I don't think I'm following. Could you rephrase your question, please?
The server must accept both separators because an older client will always use 0x1F, while a newer client will use the advertised separator.

Contributor

@singhpk234 singhpk234 Oct 31, 2025

Agreed. My question was: if the server knows the client is new, i.e. that it respects the configurable separator (for example, because the client sends the Iceberg SDK version to the server in a header), can the server choose not to treat 0x1F as a separator?

Contributor Author

While this would technically be possible with the Java client, it would be more challenging with other client implementations, as you'd need to track, for each client implementation, which versions are new enough to support this. Hence I think we should always treat 0x1F as the legacy separator.
