Hive Metastore unable to create Hive lock after upgrading from Hive Metastore 3.1.3 to 4.0.0 during Iceberg operations #11784

Open
mAlf1999 opened this issue Dec 13, 2024 · 0 comments
Labels: bug (Something isn't working)

Comments


mAlf1999 commented Dec 13, 2024

Apache Iceberg version

1.6.0

Query engine

Spark

Please describe the bug 🐞

We are currently using Iceberg 1.6.0 and have been running it successfully against Hive Metastore 3.1.3, with Spark as the query engine. We recently upgraded the Hive Metastore to 4.0.0, and since then the metastore has been unable to create the Hive lock that Iceberg requests during table commits.
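
For context, the table is an Iceberg table registered in a Hive-backed Spark catalog. Below is a minimal sketch of the kind of catalog configuration in use, assuming the Spark session catalog is backed by the Hive Metastore; the application name and thrift URI are placeholders, not our actual values.

import org.apache.spark.sql.SparkSession

// Hypothetical Spark session wiring an Iceberg catalog to the Hive Metastore.
// The application name and metastore URI below are placeholders.
val spark = SparkSession.builder()
  .appName("iceberg-hms-lock-issue")
  .config("spark.sql.extensions", "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
  .config("spark.sql.catalog.spark_catalog", "org.apache.iceberg.spark.SparkSessionCatalog")
  .config("spark.sql.catalog.spark_catalog.type", "hive")                        // Hive Metastore-backed catalog
  .config("spark.sql.catalog.spark_catalog.uri", "thrift://hive-metastore:9083") // placeholder thrift URI
  .getOrCreate()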

The logs from the Spark job are as follows:
WARN MetastoreLock: Failed to create lock LockRequest(component:[LockComponent(type:EXCLUSIVE, level:TABLE, dbname:test_hive, tablename:my_table, operationType:UNSET)], user:spark, hostname:my-driver, agentInfo:Iceberg-8de7e5ba-3be2-47e3-ae28-8d22fhghbfa6)
org.apache.thrift.TApplicationException: Internal error processing lock
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_lock(ThriftHiveMetastore.java:4678)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.lock(ThriftHiveMetastore.java:4665)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.lock(HiveMetaStoreClient.java:2153)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:569)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:169)
at jdk.proxy2/jdk.proxy2.$Proxy54.lock(Unknown Source)
at org.apache.iceberg.hive.MetastoreLock.lambda$createLock$3(MetastoreLock.java:305)
at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:72)
at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:65)
at org.apache.iceberg.hive.CachedClientPool.run(CachedClientPool.java:122)
at org.apache.iceberg.hive.MetastoreLock.lambda$createLock$4(MetastoreLock.java:305)
at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
at org.apache.iceberg.hive.MetastoreLock.createLock(MetastoreLock.java:302)
at org.apache.iceberg.hive.MetastoreLock.acquireLock(MetastoreLock.java:185)
at org.apache.iceberg.hive.MetastoreLock.lock(MetastoreLock.java:146)
at org.apache.iceberg.hive.HiveTableOperations.doCommit(HiveTableOperations.java:184)
at org.apache.iceberg.BaseMetastoreTableOperations.commit(BaseMetastoreTableOperations.java:128)
at org.apache.iceberg.SnapshotProducer.lambda$commit$2(SnapshotProducer.java:412)
at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
at org.apache.iceberg.SnapshotProducer.commit(SnapshotProducer.java:384)
at org.apache.iceberg.BaseReplacePartitions.commit(BaseReplacePartitions.java:26)
at org.apache.iceberg.spark.source.SparkWrite.commitOperation(SparkWrite.java:233)
at org.apache.iceberg.spark.source.SparkWrite.access$1300(SparkWrite.java:84)
at org.apache.iceberg.spark.source.SparkWrite$DynamicOverwrite.commit(SparkWrite.java:337)
at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.writeWithV2(WriteToDataSourceV2Exec.scala:399)
at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.writeWithV2$(WriteToDataSourceV2Exec.scala:359)
at org.apache.spark.sql.execution.datasources.v2.OverwritePartitionsDynamicExec.writeWithV2(WriteToDataSourceV2Exec.scala:260)
at org.apache.spark.sql.execution.datasources.v2.V2ExistingTableWriteExec.run(WriteToDataSourceV2Exec.scala:337)
at org.apache.spark.sql.execution.datasources.v2.V2ExistingTableWriteExec.run$(WriteToDataSourceV2Exec.scala:336)
at org.apache.spark.sql.execution.datasources.v2.OverwritePartitionsDynamicExec.run(WriteToDataSourceV2Exec.scala:260)
at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result$lzycompute(V2CommandExec.scala:43)
at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result(V2CommandExec.scala:43)
at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.executeCollect(V2CommandExec.scala:49)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:107)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:125)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:201)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:108)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:66)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:107)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:461)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:76)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:461)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:32)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:437)
at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:98)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:85)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:83)
at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:142)
at org.apache.spark.sql.DataFrameWriterV2.runCommand(DataFrameWriterV2.scala:196)
at org.apache.spark.sql.DataFrameWriterV2.overwritePartitions(DataFrameWriterV2.scala:187)
at ai.myproj.core.sparkbase.table.iceberg.testIcebergWriter.executeWrite(testIcebergWriter.scala:40)
at ai.myproj.core.sparkbase.table.iceberg.testIcebergWriter.overwritePartition(testIcebergWriter.scala:62)
at ai.myproj.testautoparser.autoparserbase.AutoParserBase.writeModel(AutoParserBase.scala:107)
at ai.myproj.testautoparser.autoparserbase.AutoParserBase.writeModel$(AutoParserBase.scala:89)
at ai.myproj.testautoparser.parserimpl.json.JSONParser$.writeModel(JSONParser.scala:13)
at ai.myproj.testautoparser.autoparserbase.AutoParserBase.execute(AutoParserBase.scala:77)
at ai.myproj.testautoparser.autoparserbase.AutoParserBase.execute$(AutoParserBase.scala:47)
at ai.myproj.testautoparser.parserimpl.json.JSONParser$.execute(JSONParser.scala:13)
at ai.myproj.testautoparser.parserimpl.json.JSONParser$.execute(JSONParser.scala:13)
at ai.myproj.core.sparkbase.testSparkBaseConfigurable.execute(testSparkBaseConfigurable.scala:26)
at ai.myproj.core.sparkbase.testSparkBaseConfigurable.execute$(testSparkBaseConfigurable.scala:24)
at ai.myproj.testautoparser.parserimpl.json.JSONParser$.execute(JSONParser.scala:13)
at ai.myproj.testautoparser.parserimpl.json.JSONParser$.execute(JSONParser.scala:13)
at ai.myproj.core.sparkbase.testSparkBase.perform(testSparkBase.scala:18)
at ai.myproj.core.sparkbase.testSparkBase.perform$(testSparkBase.scala:15)
at ai.myproj.testautoparser.parserimpl.json.JSONParser$.perform(JSONParser.scala:13)
at ai.myproj.testautoparser.parserimpl.json.JSONParser$.perform(JSONParser.scala:13)
at ai.myproj.core.jobbase.testBase.main(testBase.scala:29)
at ai.myproj.core.jobbase.testBase.main$(testBase.scala:27)
at ai.myproj.testautoparser.parserimpl.json.JSONParser$.main(JSONParser.scala:13)
at ai.myproj.testautoparser.parserimpl.json.JSONParser.main(JSONParser.scala)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:569)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1029)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:194)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:217)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1120)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1129)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
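
For reference, the trace above corresponds to a dynamic partition overwrite issued through DataFrameWriterV2 (SparkWrite$DynamicOverwrite / OverwritePartitionsDynamicExec). A minimal sketch of that call path, using the database and table names from the lock request and a placeholder DataFrame (our actual writer code differs):

import org.apache.spark.sql.DataFrame

// Sketch of the write that triggers the MetastoreLock acquisition shown in the trace above.
// df is any DataFrame matching the table schema; the table identifier is a placeholder.
def overwritePartition(df: DataFrame): Unit = {
  df.writeTo("test_hive.my_table")   // Iceberg table in the Hive Metastore catalog
    .overwritePartitions()           // DataFrameWriterV2.overwritePartitions, as in the stack trace
}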

The corresponding error from the Hive Metastore is as follows:

ERROR [Metastore-Handler-Pool: Thread-22] metastore.RetryingHMSHandler: MetaException(message:Failed to execute function: enqueueLock(LockRequest(component:[LockComponent(type:EXCLUSIVE, level:TABLE, dbname:my_hive, tablename:my_table, operationType:UNSET)], user:spark, hostname:my-driver, agentInfo::Iceberg-8de7e5ba-3be2-47e3-ae28-8d22fhghbfa6)), details:Incorrect result size: expected 1, actual 16)
at org.apache.hadoop.hive.metastore.txn.retry.SqlRetryHandler.executeWithRetryInternal(SqlRetryHandler.java:155)
at org.apache.hadoop.hive.metastore.txn.retry.SqlRetryHandler.executeWithRetry(SqlRetryHandler.java:122)
at org.apache.hadoop.hive.metastore.txn.TransactionalRetryProxy.invoke(TransactionalRetryProxy.java:186)
at com.sun.proxy.$Proxy29.enqueueLock(Unknown Source)
at org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:740)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.txn.TransactionalRetryProxy.lambda$invoke$0(TransactionalRetryProxy.java:100)
at org.apache.hadoop.hive.metastore.txn.TransactionalRetryProxy.invoke(TransactionalRetryProxy.java:189)
at com.sun.proxy.$Proxy30.lock(Unknown Source)
at org.apache.hadoop.hive.metastore.HMSHandler.lock(HMSHandler.java:9040)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:91)
at org.apache.hadoop.hive.metastore.AbstractHMSHandlerProxy.invoke(AbstractHMSHandlerProxy.java:82)
at com.sun.proxy.$Proxy28.lock(Unknown Source)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:24425)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:24404)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)

Thanks in advance to everyone in the community for taking a look.

Willingness to contribute

  • I can contribute a fix for this bug independently
  • I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • I cannot contribute a fix for this bug at this time
mAlf1999 added the bug (Something isn't working) label on Dec 13, 2024