Commit

fix
kecookier committed Mar 25, 2024
1 parent 5f145ac commit e7978cc
Showing 2 changed files with 16 additions and 6 deletions.
@@ -47,6 +47,9 @@ abstract class VeloxUdfSuite extends GlutenQueryTest with SQLHelper {
       "/path/to/gluten/cpp/build/velox/udf/examples/libmyudf.so")
   }

+  protected lazy val udfLibRelativePath: String =
+    udfLibPath.split(",").map(p => Paths.get(p).getFileName.toString).mkString(",")
+
   override protected def beforeAll(): Unit = {
     super.beforeAll()
     if (_spark == null) {
@@ -83,7 +86,7 @@ class VeloxUdfSuiteLocal extends VeloxUdfSuite {
   override protected def sparkConf: SparkConf = {
     super.sparkConf
       .set("spark.files", udfLibPath)
-      .set("spark.gluten.sql.columnar.backend.velox.udfLibraryPaths", "libmyudf.so")
+      .set("spark.gluten.sql.columnar.backend.velox.udfLibraryPaths", udfLibRelativePath)
   }
 }
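
The change above derives the relative library names from `udfLibPath` instead of hard-coding `libmyudf.so`. A minimal standalone sketch of that derivation follows. The object name `UdfPathSketch`, the helper `relativeNames`, and the sample paths are illustrative assumptions and are not part of the commit.

```scala
import java.nio.file.Paths

object UdfPathSketch {
  // Hypothetical helper mirroring the expression used for udfLibRelativePath:
  // split a comma-separated list of paths and keep only each file name.
  def relativeNames(paths: String): String =
    paths.split(",").map(p => Paths.get(p).getFileName.toString).mkString(",")

  def main(args: Array[String]): Unit = {
    // Illustrative paths only; any comma-separated list of library paths works.
    val libs = "/opt/udf/libmyudf.so,/opt/udf/libother.so"
    println(relativeNames(libs)) // prints libmyudf.so,libother.so
  }
}
```

With this approach the value passed to `udfLibraryPaths` stays in sync with whatever is listed in `spark.files`, including multiple libraries. One caveat of the sketch: `getFileName` returns `null` for a root path such as `/`, so it assumes every entry names an actual file.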

17 changes: 12 additions & 5 deletions docs/get-started/Velox.md
@@ -434,23 +434,23 @@ Gluten loads the UDF libraries at runtime. You can upload UDF libraries via `--f

 Note that in Yarn client mode, the uploaded files are not reachable on the driver side. Users should copy those files to a location reachable by the driver and set `spark.gluten.sql.columnar.backend.velox.driver.udfLibraryPaths`. This configuration is also useful when `udfLibraryPaths` differs between the driver side and the executor side.

-- Use `--files`
+- Use the `--files` option to upload a library and configure its relative path
 ```shell
 --files /path/to/gluten/cpp/build/velox/udf/examples/libmyudf.so
 --conf spark.gluten.sql.columnar.backend.velox.udfLibraryPaths=libmyudf.so
 # Needed for Yarn client mode
---conf spark.gluten.sql.columnar.backend.velox.driver.udfLibraryPaths=file:///path/to/libmyudf.so
+--conf spark.gluten.sql.columnar.backend.velox.driver.udfLibraryPaths=file:///path/to/gluten/cpp/build/velox/udf/examples/libmyudf.so
 ```

-- Use `--archives`
+- Use the `--archives` option to upload an archive and configure its relative path
 ```shell
 --archives /path/to/udf_archives.zip#udf_archives
 --conf spark.gluten.sql.columnar.backend.velox.udfLibraryPaths=udf_archives
 # Needed for Yarn client mode
 --conf spark.gluten.sql.columnar.backend.velox.driver.udfLibraryPaths=file:///path/to/udf_archives.zip
 ```

-- Specify URI
+- Only configure URI

 You can also specify local or HDFS URIs for the UDF libraries or archives. Local URIs must exist on the driver and on every worker node.
 ```shell
@@ -462,10 +462,17 @@ You can also specify the local or HDFS URIs to the UDF libraries or archives. Lo
 We provide a Velox UDF example file [MyUDF.cpp](../../cpp/velox/udf/examples/MyUDF.cpp). After building gluten cpp, you can find the example library at /path/to/gluten/cpp/build/velox/udf/examples/libmyudf.so

 Start spark-shell or spark-sql with the configuration below
-```
+```shell
+# Use the `--files` option to upload a library and configure its relative path
 --files /path/to/gluten/cpp/build/velox/udf/examples/libmyudf.so
 --conf spark.gluten.sql.columnar.backend.velox.udfLibraryPaths=libmyudf.so
 ```
+or
+```shell
+# Only configure URI
+--conf spark.gluten.sql.columnar.backend.velox.udfLibraryPaths=file:///path/to/gluten/cpp/build/velox/udf/examples/libmyudf.so
+```
+
 Run the query. The functions `myudf1` and `myudf2` increment the input value by a constant of 5
 ```
 select myudf1(1), myudf2(100L)