Showing 8 changed files with 253 additions and 3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,4 @@ | ||
TODO | ||
It will be ready as soon as possible after Release 0.1.0. Until then, refer to the source file [sparglim/config/configer.py](sparglim/config/configer.py). | ||
|
||
|
||
TODO: Generate the available environment variables for configuring the Spark session |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
This is development verification for [examples/jupyter-sparglim-on-k8s](../../examples/jupyter-sparglim-on-k8s) | ||
|
||
Use [docker/Dockerfile.jupyterlab-sparglim](../docker/Dockerfile.jupyterlab-sparglim) to build a dev version `jupyterlab-sparglim`. | ||
|
||
``` | ||
# In project root dir | ||
docker build -t wh1isper/jupyterlab-sparglim:dev -f dev/docker/Dockerfile.jupyterlab-sparglim . | ||
# reload by deleting deployment pod | ||
./dev/scripts/reload.sh | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
This is development verification for [examples/jupyter-sparglim-sc](../../examples/jupyter-sparglim-sc) | ||
|
||
Use [docker/Dockerfile.jupyterlab-sparglim](../docker/Dockerfile.jupyterlab-sparglim) and [docker/Dockerfile.sparglim-server](../docker/Dockerfile.sparglim-server) to build dev versions of `jupyterlab-sparglim` and `sparglim-server`. | ||
|
||
``` | ||
# In project root dir | ||
docker build -t wh1isper/jupyterlab-sparglim:dev -f dev/docker/Dockerfile.jupyterlab-sparglim . | ||
docker build -t wh1isper/sparglim-server:dev -f dev/docker/Dockerfile.sparglim-server . | ||
# reload by deleting deployment pod | ||
./dev/scripts/reload.sh | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
This is development verification for [examples/sparglim-server](../../examples/sparglim-server) | ||
|
||
Use [docker/Dockerfile.sparglim-server](../docker/Dockerfile.sparglim-server) to build a dev version `sparglim-server` | ||
|
||
``` | ||
# In project root dir | ||
docker build -t wh1isper/sparglim-server:dev -f dev/docker/Dockerfile.sparglim-server . | ||
# reload by deleting deployment pod | ||
./dev/scripts/reload.sh | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
In [Quick Start](../../README.md#quick-start), we start a `local[*]` PySpark session for data exploration in JupyterLab. This example uses `spark on k8s` for the same purpose. | ||
|
||
# Prepare | ||
|
||
## Namespace: `sparglim` | ||
|
||
``` | ||
kubectl create ns sparglim | ||
``` | ||
|
||
## Grant authorization | ||
|
||
You need to authorize the pod so that it can create executor pods. | ||
|
||
For a simple test, you can grant administrator privileges to all service accounts (and therefore all pods) with the following command (**DO NOT do this in a production environment**): | ||
|
||
``` | ||
kubectl create clusterrolebinding serviceaccounts-cluster-admin \ | ||
  --clusterrole=cluster-admin \ | ||
  --group=system:serviceaccounts | ||
``` | ||
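For anything beyond a throwaway cluster, a namespaced Role is a narrower alternative to `cluster-admin`. The sketch below is an assumption and not part of this commit: it builds a Role and a RoleBinding for the `default` service account in the `sparglim` namespace and prints them as JSON, which `kubectl apply -f -` accepts. The name `spark-executor-manager`, the choice of service account, and the resource and verb lists are hypothetical and may need adjusting for your cluster and Spark version.

```python
import json

NAMESPACE = "sparglim"                # namespace created in the Prepare step
SERVICE_ACCOUNT = "default"           # assumption: the JupyterLab pod runs as "default"
ROLE_NAME = "spark-executor-manager"  # hypothetical name


def build_rbac(namespace: str, service_account: str, role_name: str) -> dict:
    """Return a v1 List holding a Role and a RoleBinding for one service account."""
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": namespace},
        "rules": [
            {
                "apiGroups": [""],  # core API group
                # Assumption: resources a Spark driver typically manages.
                "resources": ["pods", "services", "configmaps", "persistentvolumeclaims"],
                "verbs": ["create", "get", "list", "watch", "delete", "deletecollection"],
            }
        ],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": role_name, "namespace": namespace},
        "subjects": [
            {"kind": "ServiceAccount", "name": service_account, "namespace": namespace}
        ],
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": role_name,
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [role, binding]}


if __name__ == "__main__":
    print(json.dumps(build_rbac(NAMESPACE, SERVICE_ACCOUNT, ROLE_NAME), indent=2))
```

Saved under a file name of your choice (for example `rbac.py`, hypothetical), the output can be piped straight into `kubectl apply -f -`.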
|
||
# Apply and access | ||
|
||
``` | ||
# In project root | ||
kubectl apply -f examples/jupyter-sparglim-on-k8s/k8s | ||
``` | ||
|
||
Check that the pod is running: | ||
|
||
``` | ||
$ kubectl get pod -n sparglim | ||
NAME READY STATUS RESTARTS AGE | ||
sparglim-app-5499f54f6b-gk4xv 1/1 Running 0 33m | ||
``` | ||
|
||
Access JupyterLab and try it out: | ||
|
||
`http://<master-ip>:30888` | ||
|
||
# Usage | ||
|
||
## Code | ||
|
||
Initialize `spark on k8s` in code: | ||
|
||
```python | ||
from sparglim.config.builder import ConfigBuilder | ||
spark = ConfigBuilder().config_k8s().get_or_create() | ||
``` | ||
|
||
Once the SparkSession is created, check that the executors are up with `kubectl get pod -n sparglim`: | ||
|
||
``` | ||
NAME READY STATUS RESTARTS AGE | ||
sparglim-825bf989955f3593-exec-1 1/1 Running 0 53m | ||
sparglim-825bf989955f3593-exec-2 1/1 Running 0 53m | ||
sparglim-825bf989955f3593-exec-3 1/1 Running 0 53m | ||
sparglim-app-8495f7b796-2h7sc 1/1 Running 0 53m | ||
``` | ||
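The check above can also be scripted. The following is a minimal sketch, assuming the default `kubectl get pod` table layout and the `-exec-` naming seen in the output above; the helper name `running_executors` is hypothetical.

```python
from typing import List


def running_executors(kubectl_output: str) -> List[str]:
    """Return the names of executor pods whose STATUS column is Running."""
    names = []
    for line in kubectl_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore malformed or empty rows
        name, _ready, status = fields[:3]
        if "-exec-" in name and status == "Running":
            names.append(name)
    return names


# Sample copied from the output shown above.
sample = """\
NAME                               READY   STATUS    RESTARTS   AGE
sparglim-825bf989955f3593-exec-1   1/1     Running   0          53m
sparglim-825bf989955f3593-exec-2   1/1     Running   0          53m
sparglim-825bf989955f3593-exec-3   1/1     Running   0          53m
sparglim-app-8495f7b796-2h7sc      1/1     Running   0          53m
"""

print(running_executors(sample))
```

In practice the input could come from `subprocess.run(["kubectl", "get", "pod", "-n", "sparglim"], capture_output=True, text=True).stdout`.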
|
||
## SQL | ||
|
||
Loading the extension automatically configures the SparkSession in `k8s` mode, based on the `SPARGLIM_SQL_MODE` environment variable: | ||
|
||
```python | ||
%load_ext sparglim.sql | ||
from sparglim.config.builder import ConfigBuilder | ||
spark = ConfigBuilder().get_or_create() # No need to config_k8s(), ConfigBuilder is a Singleton | ||
``` | ||
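The comment above relies on `ConfigBuilder` being a singleton: every call to the constructor returns the same object, so configuration applied earlier is still present when user code asks for the builder again. The following is a minimal sketch of that pattern, not sparglim's actual implementation; `DemoBuilder`, its `mode` attribute, and the body of `config_k8s` are hypothetical stand-ins.

```python
class DemoBuilder:
    """Hypothetical stand-in that illustrates the singleton pattern."""

    _instance = None

    def __new__(cls):
        # Create the object once, then hand back the same instance every time.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.mode = "local"  # initial state, set only on first creation
        return cls._instance

    def config_k8s(self) -> "DemoBuilder":
        self.mode = "k8s"
        return self  # returning self allows call chaining


first = DemoBuilder().config_k8s()  # e.g. done once by an extension at load time
second = DemoBuilder()              # a later call from user code

print(second is first, second.mode)
```

Because both names refer to the same object, the later call sees the mode that was set earlier without configuring anything again.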
|
||
Test it: | ||
|
||
```python | ||
%sql SHOW TABLES; | ||
``` | ||
|
||
|
||
Once the SparkSession is created, check that the executors are up with `kubectl get pod -n sparglim`: | ||
|
||
``` | ||
NAME READY STATUS RESTARTS AGE | ||
sparglim-825bf989955f3593-exec-1 1/1 Running 0 53m | ||
sparglim-825bf989955f3593-exec-2 1/1 Running 0 53m | ||
sparglim-825bf989955f3593-exec-3 1/1 Running 0 53m | ||
sparglim-app-8495f7b796-2h7sc 1/1 Running 0 53m | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
In [Quick Start](../../README.md#quick-start), we start a `local[*]` PySpark session for data exploration in JupyterLab, and a `local[*]` Spark Connect Server. This example combines both on k8s: a PySpark Connect client in JupyterLab on k8s connects to a Spark Connect Server on k8s. | ||
|
||
# Prepare | ||
|
||
## Namespace: `sparglim` | ||
|
||
``` | ||
kubectl create ns sparglim | ||
``` | ||
|
||
## Grant authorization | ||
|
||
You need to authorize the pod so that it can create executor pods. | ||
|
||
For a simple test, you can grant administrator privileges to all service accounts (and therefore all pods) with the following command (**DO NOT do this in a production environment**): | ||
|
||
``` | ||
kubectl create clusterrolebinding serviceaccounts-cluster-admin \ | ||
  --clusterrole=cluster-admin \ | ||
  --group=system:serviceaccounts | ||
``` | ||
|
||
# Apply and access | ||
|
||
``` | ||
# In project root | ||
kubectl apply -f examples/jupyter-sparglim-sc/k8s/jupyter-sparglim/ | ||
kubectl apply -f examples/jupyter-sparglim-sc/k8s/sparglim-server/ | ||
``` | ||
|
||
Check that the pod is running: | ||
|
||
``` | ||
$ kubectl get pod -n sparglim | ||
NAME READY STATUS RESTARTS AGE | ||
sparglim-app-5499f54f6b-gk4xv 1/1 Running 0 33m | ||
``` | ||
|
||
Access JupyterLab and try it out: | ||
|
||
`http://<master-ip>:30888` | ||
|
||
Access SparkUI: | ||
`http://<master-ip>:30040` | ||
|
||
# Usage | ||
|
||
## Code | ||
|
||
Initialize the Spark Connect client in code: | ||
|
||
```python | ||
from sparglim.config.builder import ConfigBuilder | ||
spark = ConfigBuilder().config_connect_client().get_or_create() | ||
``` | ||
|
||
## SQL | ||
|
||
Loading the extension automatically configures the SparkSession in `connect_client` mode, based on the `SPARGLIM_SQL_MODE` environment variable: | ||
|
||
```python | ||
%load_ext sparglim.sql | ||
from sparglim.config.builder import ConfigBuilder | ||
spark = ConfigBuilder().get_or_create() # No need to config_connect_client(), ConfigBuilder is a Singleton | ||
``` | ||
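Selecting the mode from an environment variable can be pictured as a small lookup. The sketch below is illustrative only: the accepted values, the `local` default, and the function `resolve_mode` are assumptions, not sparglim's documented behavior. Only the variable name `SPARGLIM_SQL_MODE` and the mode names `k8s` and `connect_client` come from this document.

```python
import os

# Assumption: these names mirror the config_* methods; "local" is a guess.
KNOWN_MODES = ("local", "k8s", "connect_client")


def resolve_mode(environ=None, default="local"):
    """Pick the session mode from SPARGLIM_SQL_MODE, falling back to a default."""
    environ = os.environ if environ is None else environ
    mode = environ.get("SPARGLIM_SQL_MODE", default).strip().lower()
    if mode not in KNOWN_MODES:
        raise ValueError(f"unsupported SPARGLIM_SQL_MODE: {mode!r}")
    return mode


# A plain dict stands in for the process environment here.
print(resolve_mode({"SPARGLIM_SQL_MODE": "connect_client"}))
```

Passing the environment as a parameter keeps the lookup easy to exercise without changing the real process environment.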
|
||
Test it: | ||
|
||
```python | ||
%sql SHOW TABLES | ||
``` | ||
|
||
# TIPS | ||
|
||
Client-side configuration, such as `spark.sql.repl.eagerEval.enabled=true`, has no effect here, so `%sql` (`%%sql`) cannot display the DataFrame. Use `df.show()` instead. |
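As a minimal sketch of the workaround, the helper below runs a query through `spark.sql` and prints the result with `DataFrame.show()` instead of relying on eager evaluation. The helper name `show_query` and its defaults are hypothetical; `spark` is assumed to be the session created in the Usage section.

```python
def show_query(spark, query: str, n: int = 20, truncate: bool = True) -> None:
    """Run a SQL query and explicitly print up to n rows of the result."""
    df = spark.sql(query)  # returns a DataFrame
    df.show(n, truncate)   # prints the rows; does not depend on eagerEval
```

Call it as `show_query(spark, "SHOW TABLES")` in a notebook cell.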