Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dbWriteTable() and dbAppendTable() produce parsing errors in Spark SQL #544

Closed
ahernnelson opened this issue Feb 15, 2023 · 4 comments
Closed
Labels
bug an unexpected problem or unintended behavior hive

Comments

@ahernnelson
Copy link

ahernnelson commented Feb 15, 2023

Issue Description and Expected Result

dbWriteTable() creates a table but fails to populate. Subsequent calls to dbAppendTable() produce the same error. The same error is in #422.

dbWriteTable(conn, 'an_test', data.frame(x = 1, y = 'a'), overwrite=FALSE, row.names = NULL)
#> Error: nanodbc/nanodbc.cpp:1655: 00000: [Simba][Hardy] (80) Syntax or semantic analysis error thrown in server while executing query. Error message from server: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.catalyst.parser.ParseException: 
#> no viable alternative at input '(?'(line 2, pos 8)
#> 
#> == SQL ==
#> INSERT INTO `an_test` (`x`, `y`)
#> VALUES (?, ?)
#> --------^^^
#> 
#>  at org.apac 
#> <SQL> 'INSERT INTO `an_test` (`x`, `y`)
#> VALUES (?, ?)'

Database

Simba Spark 64-bit, Spark SQL Version: 3.1.2

Reproducible Example

library(odbc)
library(DBI)

conn <- dbConnect(
  odbc::odbc(),  
  Driver   = "Simba Spark 64-bit",
  ...,
  ThriftTransport=2,
  UseNativeQuery=1,
  SSL=1)

dbRemoveTable(conn, 'an_test') #case senstistive
dbWriteTable(conn, 'an_test', data.frame(x = 1, y = 'a'), overwrite=FALSE, row.names = NULL)
#> Error: nanodbc/nanodbc.cpp:1655: 00000: [Simba][Hardy] (80) Syntax or semantic analysis error thrown in server while executing query. Error message from server: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.catalyst.parser.ParseException: 
#> no viable alternative at input '(?'(line 2, pos 8)
#> 
#> == SQL ==
#> INSERT INTO `an_test` (`x`, `y`)
#> VALUES (?, ?)
#> --------^^^
#> 
#>  at org.apac 
#> <SQL> 'INSERT INTO `an_test` (`x`, `y`)
#> VALUES (?, ?)'

The table exists, but with no rows

dbReadTable(conn, 'an_test')
#> [1] x y
#> <0 rows> (or 0-length row.names)

dbAppendTable() produces the same error as above. Manually inserting with dbExecute() works.

dbExecute(conn, "INSERT INTO `an_test` (`x`, `y`)
VALUES (1, 'a')")
#> [1] 0
dbReadTable(conn, 'an_test')
#>   x y
#> 1 1 a

Attempting to use dbExecute() with sqlAppendTable() produces a potentially related error.

sql_append <- sqlAppendTable(conn, 'an_test', value = data.frame(x=2, y='b'))
dbExecute(conn, sql_append)
#> Error: nanodbc/nanodbc.cpp:1655: 00000: [Simba][Hardy] (80) Syntax or semantic analysis error thrown in server while executing query. Error message from server: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: cannot resolve '`b`' given input columns: []; line 4 pos 6;
#> 'InsertIntoStatement 'UnresolvedRelation [an_test], [], false, [x, y], false, false
#> +- ' 
#> <SQL> 'INSERT INTO `an_test`
#>   (`x`, `y`)
#> VALUES
#>   (2, b)'

double quoting the character column seems to work

sql_append <- sqlAppendTable(conn, 'an_test', value = data.frame(x=2, y="'b'"))
dbExecute(conn, sql_append)
#> [1] 0
dbReadTable(conn, 'an_test')
#>   x y
#> 1 2 b
#> 2 1 a
@ahernnelson ahernnelson changed the title dbWriteTable() and dbAppendTable() produce parsing errors in Hive dbWriteTable() and dbAppendTable() produce parsing errors in Spark SQL Feb 15, 2023
@ahernnelson
Copy link
Author

Further digging has led me to believe that the issue is with the prepared statement on line 88 of Table.R which is being executed verbatim and the result is never assigned.

I can't find any support for prepared statements in Spark-odbc, so I'm not sure there is much to do outside of writing my own template or attempting to write to S3, but I am hoping this is not the case.

This seems to be the case in both #500 (Snowflake) #276 (Impala).

@detule
Copy link
Collaborator

detule commented Apr 4, 2023

Hi - i am not a Databricks user but thought I would ask: I am sure you have already tried this, but what happens when connecting without the UseNativeQuery option?

@hadley hadley added hive bug an unexpected problem or unintended behavior labels Apr 24, 2023
@simonpcouch
Copy link
Collaborator

As we haven't heard back from the issue author and related issues have been closed as completed, I'm going to go ahead and close.

@hadley
Copy link
Member

hadley commented Jan 17, 2024

Actually I think the problem is the use of UseNativeQuery=1 — this suppresses the driver's default SQL translation which unfortunately breaks prepared query support. But I don't think there's anything we can do about this unfortunately.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug an unexpected problem or unintended behavior hive
Projects
None yet
Development

No branches or pull requests

4 participants