
[FEATURE REQUEST] Support for creating nested namespaces recursively #543

Open
Karthilearns opened this issue Dec 12, 2024 · 7 comments
Labels
enhancement (New feature or request)

Comments

@Karthilearns

Is your feature request related to a problem? Please describe.

Currently, Polaris doesn't support recursively creating namespaces that do not exist. For spark.sql('create namespace n1.n2.n3') to work, n1 and n2 must already be in place, or the SQL fails.

For compatibility with other catalogs' SQL, would it make sense to support creating nested namespaces recursively? Or is there a reason for not doing this?
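
For example (Java, assuming a SparkSession named session, as in the reproduction later in this thread):

// Fails with NoSuchNamespaceException when n1 and n2 do not already exist:
session.sql("create namespace n1.n2.n3");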

Describe the solution you'd like

Allow users to create nested namespaces with a single SQL command.

Describe alternatives you've considered

The only alternative is to issue multiple SQL statements (one per namespace level, as sketched below), which is cumbersome compared to what other catalogs offer.
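
For illustration, the workaround looks roughly like this (a sketch assuming a SparkSession named session, as in the reproduction later in this thread):

// Workaround today: create each namespace level explicitly, outermost first.
session.sql("CREATE NAMESPACE IF NOT EXISTS n1");
session.sql("CREATE NAMESPACE IF NOT EXISTS n1.n2");
session.sql("CREATE NAMESPACE IF NOT EXISTS n1.n2.n3");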

Additional context

No response

@Karthilearns Karthilearns added the enhancement (New feature or request) label Dec 12, 2024
@Karthilearns Karthilearns changed the title from "[FEATURE REQUEST] Support for creating namespaces recursively" to "[FEATURE REQUEST] Support for creating nested namespaces recursively" Dec 12, 2024
@jbonofre
Member

You are using Spark with the Iceberg REST client, I guess, right?

So, I guess you are referring to using a namespace name like foo/bar/john, and we should create the namespaces foo, bar, john recursively (in one call), right?

@Karthilearns
Author

@jbonofre yes, I'm using Spark with the Polaris REST catalog.

Yes, this should happen in one call.

@Karthilearns
Author

Karthilearns commented Dec 12, 2024

If this cannot be done, or is out of scope for Polaris's handling of nested namespaces, I would like to know the reasoning. One concern I have is the following:

Say I have an application writing Iceberg tables to a REST-based catalog. With Polaris's way of handling nested namespaces, I cannot migrate my application code to another REST catalog implementation. The migration use case for Iceberg catalogs actually fails here.

@Karthilearns
Author

@jbonofre - I'm happy to contribute a PR if this feature is approved.

@flyrain
Contributor

flyrain commented Dec 13, 2024

Hi @Karthilearns, would you mind sharing the error message?

@Karthilearns
Author

Karthilearns commented Dec 13, 2024

Hi @flyrain,

session.sql("use quickstart_catalog"); session.sql("show namespaces").show(); session.sql("create namespace toplevel.second");

Error:

Exception in thread "main" org.apache.iceberg.exceptions.NoSuchNamespaceException: Namespace does not exist: toplevel
    at org.apache.iceberg.rest.ErrorHandlers$NamespaceErrorHandler.accept(ErrorHandlers.java:173)
    at org.apache.iceberg.rest.ErrorHandlers$NamespaceErrorHandler.accept(ErrorHandlers.java:166)
    at org.apache.iceberg.rest.HTTPClient.throwFailure(HTTPClient.java:211)
    at org.apache.iceberg.rest.HTTPClient.execute(HTTPClient.java:323)
    at org.apache.iceberg.rest.HTTPClient.execute(HTTPClient.java:262)
    at org.apache.iceberg.rest.HTTPClient.post(HTTPClient.java:368)
    at org.apache.iceberg.rest.RESTClient.post(RESTClient.java:112)
    at org.apache.iceberg.rest.RESTSessionCatalog.createNamespace(RESTSessionCatalog.java:538)
    at org.apache.iceberg.catalog.BaseSessionCatalog$AsCatalog.createNamespace(BaseSessionCatalog.java:128)
    at org.apache.iceberg.rest.RESTCatalog.createNamespace(RESTCatalog.java:223)
    at org.apache.iceberg.spark.SparkCatalog.createNamespace(SparkCatalog.java:482)
    at org.apache.spark.sql.execution.datasources.v2.CreateNamespaceExec.run(CreateNamespaceExec.scala:47)
    at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result$lzycompute(V2CommandExec.scala:43)
    at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result(V2CommandExec.scala:43)
    at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.executeCollect(V2CommandExec.scala:49)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:107)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:125)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:201)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:108)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:66)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:107)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:461)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:76)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:461)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:32)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:437)
    at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:98)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:85)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:83)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:220)
    at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
    at org.apache.spark.sql.SparkSession.$anonfun$sql$4(SparkSession.scala:691)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:682)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:713)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:744)
    at com.striim.PolarisCatalog.main(PolarisCatalog.java:17)

@flyrain
Contributor

flyrain commented Dec 13, 2024

It failed when Polaris tried to check the parent namespace's privileges: https://github.com/polaris-catalog/polaris/blob/main/service/common/src/main/java/org/apache/polaris/service/catalog/PolarisCatalogHandlerWrapper.java#L248-L248.

Basically, the behavior you want here is to create the parent namespace if it doesn't exist, then create the sub-namespace. If Polaris allowed this behavior, it would break the assumption that one REST API call creates exactly one namespace; see this spec for details: https://github.com/polaris-catalog/polaris/blob/main/spec/rest-catalog-open-api.yaml#L4080-L4080. I think it's more suitable as a client-side change, as this pseudocode shows:

# create namespace n1.n2
if (n1 does not exist) {
   create n1
}
create n2
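
For what it's worth, a client-side version of that idea could look roughly like the following Java sketch, using Iceberg's SupportsNamespaces interface (the helper name createNamespaceRecursively is hypothetical, not an existing API):

import java.util.Arrays;
import org.apache.iceberg.catalog.Namespace;
import org.apache.iceberg.catalog.SupportsNamespaces;

// Hypothetical client-side helper: create every missing level of a nested
// namespace, outermost first, so n1.n2.n3 works even when n1 and n2 are absent.
static void createNamespaceRecursively(SupportsNamespaces catalog, Namespace namespace) {
    String[] levels = namespace.levels();
    for (int i = 1; i <= levels.length; i++) {
        Namespace prefix = Namespace.of(Arrays.copyOfRange(levels, 0, i));
        if (!catalog.namespaceExists(prefix)) {
            catalog.createNamespace(prefix); // one REST call per missing level
        }
    }
}

// Usage: createNamespaceRecursively(restCatalog, Namespace.of("n1", "n2", "n3"));

This still issues one createNamespace REST call per missing level, so the server-side one-call-one-namespace assumption stays intact; a real implementation would probably also catch AlreadyExistsException to tolerate concurrent creators.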
