feat: support read and write from hive datasource #100
Conversation
val autoCreateTable: Boolean = getOrElse(config, "hive.write.autoCreateTable", true)
// Hive metastore URIs
val writeMetaStoreUris: String = getOrElse(config, "hive.write.metaStoreUris", "")
// Mapping between result columns and table columns, e.g. map _id in the algorithm result to user_id
please update the comment to English~
data.repartition(partitionNum)
}

data.show(3)
no need to show.
}

println(s"Save to hive:${config.dbTableName}, saveMode:${saveMode}")
_data.show(3)
ditto
done
LGTM
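For context, the write path discussed in the hunks above might look roughly like the following after the review feedback (the debug `show(3)` calls removed). This is a sketch, not the PR's actual code: the method name `writeToHive` and its parameters are assumptions based on the visible snippets.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

// Hypothetical sketch of the Hive write path after review feedback.
// Assumes the SparkSession was created with enableHiveSupport().
def writeToHive(data: DataFrame,
                dbTableName: String,
                saveMode: SaveMode,
                partitionNum: Int): Unit = {
  println(s"Save to hive:$dbTableName, saveMode:$saveMode")
  data
    .repartition(partitionNum)   // as in the diff above
    .write
    .mode(saveMode)
    .saveAsTable(dbTableName)
}
```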
What type of PR is this?
What problem(s) does this PR solve?
Issue(s) number:
Description:
Add a Hive datasource for reading and writing.
How do you solve it?
hive: {
  # the algorithm's data source from Hive
  read: {
    # [Optional] required when Spark and Hive are deployed on different clusters
    metaStoreUris: "thrift://hive-metastore-server-01:9083"
    # Spark SQL
    sql: "select column_1,column_2,column_3 from database_01.table_01"
    # [Optional] graph source vid mapped from a column of the sql result
    srcId: "column_1"
    # [Optional] graph dest vid mapped from a column of the sql result
    dstId: "column_2"
    # [Optional] graph weight mapped from a column of the sql result
    weight: "column_3"
  }
}
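The description only shows the read side. Based on the keys visible in the diff above (`hive.write.autoCreateTable`, `hive.write.metaStoreUris`), a corresponding write block might look like the following sketch; the `dbTableName` key is an assumption for illustration, not confirmed by the PR.

```
hive: {
  write: {
    # [Optional] Hive metastore URIs, matching hive.write.metaStoreUris in the diff
    metaStoreUris: "thrift://hive-metastore-server-01:9083"
    # create the target table automatically, matching hive.write.autoCreateTable
    autoCreateTable: true
    # [Assumed key] target table for the algorithm result
    dbTableName: "database_01.table_02"
  }
}
```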
Special notes for your reviewer, ex. impact of this fix, design document, etc:
The case where Spark and Hive run on different clusters has not been validated, as no such environment was available. All other cases have been verified.