A big data service for ThaiRoad, built on Apache Spark running in Bitnami Docker containers.
PowerShell is recommended for running the following commands.
Start the service (in detached mode)
docker-compose up -d --scale spark-worker=$number_of_worker_node

| Parameter | Type | Description |
|---|---|---|
| number_of_worker_node | int | Required. Number of worker nodes in the cluster. |

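Because `--scale` expects a whole number, it can help to check the value before starting the cluster. The following is a minimal sketch, assuming a POSIX shell (for example Git Bash or WSL) rather than PowerShell; `build_up_command` is a hypothetical helper that is not part of this repository, and it only prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): validate the worker
# count and print the docker-compose command instead of running it.
build_up_command() {
  workers="$1"
  case "$workers" in
    ''|*[!0-9]*)
      # Reject empty input and anything containing a non-digit character.
      echo "number_of_worker_node must be a non-negative integer" >&2
      return 1
      ;;
  esac
  echo "docker-compose up -d --scale spark-worker=$workers"
}

build_up_command 3   # prints: docker-compose up -d --scale spark-worker=3
```

To start the cluster, run the printed command from the directory that contains the compose file.
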
Spark Admin
http://localhost:8080/
Access the Spark shell from a local Spark installation
cd spark-3.2.1-bin-hadoop3.2/bin
./spark-shell --master $spark_master_url

| Parameter | Type | Description |
|---|---|---|
| spark_master_url | string | Required. The Spark master URL, as shown in Spark Admin. Example: spark://5bbbc7e372fa:7077 |

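The example URL above follows the pattern `spark://<host>:<port>`, with 7077 as the master port listed for this service. As an illustration only, the sketch below assembles such a URL from a host name; `master_url` is a hypothetical helper, and a POSIX shell is assumed.

```shell
#!/bin/sh
# Hypothetical helper: print a Spark master URL for the given host.
# The port defaults to 7077, the master port listed for this service.
master_url() {
  host="$1"
  port="${2:-7077}"
  if [ -z "$host" ]; then
    echo "usage: master_url HOST [PORT]" >&2
    return 1
  fi
  printf 'spark://%s:%s\n' "$host" "$port"
}

master_url 5bbbc7e372fa   # prints: spark://5bbbc7e372fa:7077
```

The result can be passed to the shell's `--master` option, for example `./spark-shell --master "$(master_url 5bbbc7e372fa)"`.
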
Access the Spark shell from inside the container
docker exec -it $spark_container_id_or_name bash

| Parameter | Type | Description |
|---|---|---|
| spark_container_id_or_name | string | Required. The Spark master container ID or container name. Example: 5bbbc7e372fa or spark-master. |

Submit an application to the cluster for processing (must be run inside the Spark master container)
spark-submit $path_to_file

| Parameter | Type | Description |
|---|---|---|
| path_to_file | string | Required. Absolute path to the file you want to submit. |

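Since `path_to_file` must be absolute, a small guard can catch relative paths before the job is submitted. This is a sketch under stated assumptions: `build_submit_command` is a hypothetical helper, `/opt/app/example.jar` is a made-up path, and the helper prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: check that the application path is absolute, then
# print the spark-submit command. Paths containing spaces are not quoted
# in the printed command.
build_submit_command() {
  app_path="$1"
  case "$app_path" in
    /*) ;;  # starts with "/": treated as an absolute path
    *)
      echo "path_to_file must be an absolute path" >&2
      return 1
      ;;
  esac
  echo "spark-submit $app_path"
}

# /opt/app/example.jar is a made-up path used only for illustration.
build_submit_command /opt/app/example.jar   # prints: spark-submit /opt/app/example.jar
```
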

| Container Name | Port | Description |
|---|---|---|
| spark-master | 8080, 7077 | Required. Spark master container with Spark Admin. |
| spark-worker | - | Required. Spark worker. |

Specification

| Parameter | Value |
|---|---|
| SPARK_WORKER_MEMORY | 1G |
| SPARK_WORKER_CORES | 1 |
