Hive table is case-insensitive, but its location holds Parquet files whose schema has mixed-case field names (e.g. componentId, userName); after enabling Blaze, Spark SQL queries that filter on such columns return no data. #670
Labels
bug
Describe the bug
The Hive table is case-insensitive, and the table location contains only Parquet files whose schema has mixed-case field names (e.g. componentId, userName). After enabling Blaze, a Spark SQL query with a filter condition on one of these mixed-case columns returns no data.
To Reproduce
Steps to reproduce the behavior:
Package the Scala application as a jar.
spark-submit \
  --class com.***.myapp.Test \
  --master yarn \
  --conf spark.sql.hive.convertMetastoreParquet=true \
  --conf spark.blaze.enable=true \
  --conf spark.sql.extensions=org.apache.spark.sql.blaze.BlazeSparkSessionExtension \
  --conf spark.shuffle.manager=org.apache.spark.sql.execution.blaze.shuffle.BlazeShuffleManager \
  --conf spark.sql.caseSensitive=false \
  cosn://dc-sh-prod-03-1323003688/tasklibs/spark3.2.2_myapp.jar
Executor logs:
Test SQL:
userGroupInfo.getUserField : dnum
StructType(StructField(dnum,StringType,true), StructField(moneys,IntegerType,false))
+----+------+
|dnum|moneys|
+----+------+
+----+------+
Obviously, the query returns no data. The cause is the filter condition on componentId.
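The empty result would be consistent with a column lookup that compares names case-sensitively somewhere on the native scan path, although this issue does not confirm where the mismatch occurs. The Python sketch below only illustrates that suspected mechanism; it is not Blaze or Spark code. `resolve_column` and the sample field list are hypothetical, with `componentId` and `userName` taken from the Parquet schema described in this issue.

```python
from typing import Optional, Sequence


def resolve_column(requested: str, physical: Sequence[str],
                   case_sensitive: bool) -> Optional[str]:
    """Return the physical Parquet field matching `requested`, or None."""
    if case_sensitive:
        return requested if requested in physical else None
    matches = [name for name in physical if name.lower() == requested.lower()]
    if len(matches) > 1:
        # Two physical fields differ only by case: refuse to guess.
        raise ValueError(f"ambiguous column {requested!r}: {matches}")
    return matches[0] if matches else None


# Hypothetical physical field names stored in the Parquet files.
parquet_fields = ["dnum", "componentId", "userName"]

# The Hive metastore stores the column name in lower case.
strict = resolve_column("componentid", parquet_fields, case_sensitive=True)
relaxed = resolve_column("componentid", parquet_fields, case_sensitive=False)
print(strict)   # None
print(relaxed)  # componentId
```

If the filter column is resolved strictly, it would be treated as missing and no row could satisfy the predicate, which matches the empty table above. With spark.sql.caseSensitive=false, the relaxed lookup is the behavior one would expect.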
SQL imported by the automated analysis task:
DataFrame schema:
StructType(StructField(dnum,StringType,true), StructField(moneys,IntegerType,false))
+---------+------+
|     dnum|moneys|
+---------+------+
|649409512|  3680|
|666687060|  3680|
|667198577|  3680|
|672462560|  3680|
|668511291|  3680|
|661643626|  3680|
|669103964|  3680|
|660927197|  3680|
|671793888|  3680|
|637719401|  3680|
+---------+------+
only showing top 10 rows
Additional information:
A: Hive table creation script:
CREATE EXTERNAL TABLE report.tb_39e85e2e76e444e195c6db2df728751e_34b7dfe549 (
  android_id string,
  systempid string,
  appnm string,
  appversion string,
  appversioncode string,
  biversion string,
  cardstyleid string,
  city string,
  clientdatetime string,
  componentcontentid string,
  componentid string,
  componentname string,
  componentposition string,
  componenttypeid string,
  componentversion string,
  datasource string,
  dateofweek string,
  datetime string,
  dayofquarter string,
  dayofyear string,
  deviceid string,
  devicetype string,
  dnum string,
  hour string,
  id string,
  imei string,
  ip string,
  launcherversionname string,
  launcherdnum string,
  launchervercode string,
  mac string,
  minute string,
  nation string,
  networktype string,
  packagenm string,
  phonetype string,
  postconfigversion string,
  projectid string,
  province string,
  region string,
  remote_addr string,
  scenetemplateid string,
  scenetemplatename string,
  second string,
  sendtime string,
  signature string,
  systype string,
  sysversion string,
  systemvercode string,
  tabposition string,
  tclosversion string,
  type string,
  userid string,
  weekofyear string,
  wlanmac string,
  xforwarded string,
  packagename string,
  componentstatus string,
  musicstatus string,
  componenttitle string,
  vid string,
  receipttime string
)
PARTITIONED BY (
  year bigint,
  month bigint,
  day bigint,
  cleanhour bigint
)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
  'hdfs://xxxxxx/data/report/584f9c5bab31fb1d59e138e1/39e85e2e76e444e195c6db2df728751e/34B7DFE549'
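To find which columns are affected, the lower-case names in the DDL can be compared with the field names stored in the Parquet files. This is a minimal sketch, assuming the Parquet field names have already been extracted by some other means. `case_mismatches` is a hypothetical helper, and the sample lists are illustrative, using only names that appear in this issue.

```python
from typing import Dict, Iterable


def case_mismatches(hive_columns: Iterable[str],
                    parquet_fields: Iterable[str]) -> Dict[str, str]:
    """Map each Hive column to the Parquet field that differs from it only by case."""
    by_lower = {name.lower(): name for name in parquet_fields}
    return {
        col: by_lower[col.lower()]
        for col in hive_columns
        if col.lower() in by_lower and by_lower[col.lower()] != col
    }


# A few columns from the DDL above (lower case, as stored by the metastore).
hive_columns = ["dnum", "componentid", "componentname"]
# Assumed physical field names in the Parquet files.
parquet_fields = ["dnum", "componentId", "userName"]

mismatches = case_mismatches(hive_columns, parquet_fields)
print(mismatches)  # {'componentid': 'componentId'}
```

Any column reported here is a candidate for the empty-result behavior when it appears in a filter condition.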
B: Parquet schema of the files at the table location: