CEPP-KMITL/ThaiRoad-Spark

ThaiRoad-Spark

A big data service for ThaiRoad, built on Apache Spark running in Bitnami Docker containers.

Run Locally

PowerShell is recommended for the following commands.

Start the service (in detached mode)

  docker-compose up -d --scale spark-worker=$number_of_worker_node

| Parameter | Type | Description |
| --- | --- | --- |
| number_of_worker_node | int | Required. Number of worker nodes in the cluster. |
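
As a minimal sketch of the start command above, the helper below checks the worker count before scaling the cluster. The function name `start_cluster` and the `DRY_RUN` switch are hypothetical and not part of this repository; the sketch assumes a POSIX shell (for example Git Bash or WSL), so the variable syntax would need adapting for PowerShell.

```shell
# Hypothetical helper (not part of the repository): validate the worker count,
# then start the cluster in detached mode with that many spark-worker replicas.
start_cluster() {
  workers="$1"

  # Accept positive integers only (rejects empty input, non-digits, leading zero).
  case "$workers" in
    ''|*[!0-9]*|0*)
      echo "worker count must be a positive integer" >&2
      return 1
      ;;
  esac

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it, so it can be inspected first.
    echo "docker-compose up -d --scale spark-worker=$workers"
  else
    docker-compose up -d --scale "spark-worker=$workers"
  fi
}
```

Calling `DRY_RUN=1 start_cluster 3` prints the command that would run; `start_cluster 3` runs it.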

Spark Admin

  http://localhost:8080/

Access Spark's shell using a local Spark installation

  cd spark-3.2.1-bin-hadoop3.2/bin
  ./spark-shell --master $spark_master_url

| Parameter | Type | Description |
| --- | --- | --- |
| spark_master_url | string | Required. Spark master URL, as shown in Spark Admin. Example: spark://5bbbc7e372fa:7077 |
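
The master URL has the form `spark://HOST:PORT`, and the container table in this README lists 7077 as the master port. A small sketch of assembling it, assuming a POSIX shell; `master_url` is a hypothetical helper name, and the host in the usage note is the example container ID from the table above.

```shell
# Hypothetical helper: build a Spark master URL from a host and an optional port.
# The port defaults to 7077, the master port listed in this README.
master_url() {
  host="$1"
  port="${2:-7077}"
  echo "spark://$host:$port"
}
```

It could then be passed to the shell as `./spark-shell --master "$(master_url 5bbbc7e372fa)"`.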

Access Spark's shell from inside the container

  docker exec -it $spark_container_id_or_name bash

| Parameter | Type | Description |
| --- | --- | --- |
| spark_container_id_or_name | string | Required. Spark master container ID or container name. Example: 5bbbc7e372fa or spark-master. |

Submit an application to the cluster for processing (must be run inside the Spark master container)

  spark-submit $path_to_file

| Parameter | Type | Description |
| --- | --- | --- |
| path_to_file | string | Required. Absolute path to the application file to submit. |
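
Because the file must exist inside the master container, one possible workflow is to copy it in from the host and then submit it. The sketch below is an assumption-laden illustration, not the repository's own tooling: `submit_app`, the `DRY_RUN` switch, and the `/tmp` target directory are hypothetical, and the container name defaults to `spark-master` as in the example above. It passes `--master` explicitly on the assumption that the image may not preconfigure the cluster URL.

```shell
# Hypothetical helper: copy a local application file into the master container
# and submit it to the cluster.
#   $1 = local file, $2 = Spark master URL, $3 = container name (default: spark-master)
submit_app() {
  if [ "$#" -lt 2 ]; then
    echo "usage: submit_app <local_file> <spark_master_url> [container]" >&2
    return 1
  fi
  local_file="$1"
  master="$2"
  container="${3:-spark-master}"
  # /tmp is an assumed writable location inside the container.
  remote_path="/tmp/$(basename "$local_file")"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the commands instead of running them.
    echo "docker cp $local_file $container:$remote_path"
    echo "docker exec $container spark-submit --master $master $remote_path"
  else
    docker cp "$local_file" "$container:$remote_path" &&
      docker exec "$container" spark-submit --master "$master" "$remote_path"
  fi
}
```

For example, `DRY_RUN=1 submit_app ./jobs/app.py spark://5bbbc7e372fa:7077` prints the two commands without touching Docker; the file path here is only a placeholder.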

Docker Compose Reference

Container Specification

| Container Name | Port | Description |
| --- | --- | --- |
| spark-master | 8080, 7077 | Required. Spark Master Container with Spark Admin. |
| spark-worker | - | Required. Spark Worker. |

Worker Specification

| Specification Parameter | Value |
| --- | --- |
| SPARK_WORKER_MEMORY | 1G |
| SPARK_WORKER_CORES | 1 |
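
Since every worker is given the same limits, the resources the cluster offers grow linearly with the number of workers passed to `--scale`. A rough sketch of that arithmetic, assuming a POSIX shell; `cluster_capacity` is a hypothetical helper, and the two constants mirror the values in the table above.

```shell
# Per-worker limits, taken from the Worker Specification table.
WORKER_MEMORY_GB=1
WORKER_CORES=1

# Hypothetical helper: total cores and memory offered by the given number of workers.
cluster_capacity() {
  workers="$1"
  echo "cores=$((workers * WORKER_CORES)) memory=$((workers * WORKER_MEMORY_GB))G"
}
```

With the values above, `cluster_capacity 3` reports 3 cores and 3G of worker memory in total.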

Acknowledgements
