developer-advocacy-dremio/dremio-demo-env-092024


Docker Compose Setup for Data Engineering with Nessie, MinIO, Spark, and Dremio

Instructions

How to Spin Up the Services

  1. Ensure Docker and Docker Compose are installed on your system.
  2. Navigate to the directory containing the docker-compose.yml file.
  3. Place Seed Data:
    • For MinIO, place the files to be seeded into the bucket in ./minio-data.
    • For Spark, place any notebooks or datasets in ./notebook-seed.
  4. Run the following command to start all the services:
   docker-compose up -d
  5. Once all the services are up, initialize Superset inside its container:
   docker exec -it superset superset init
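
The start-up steps above can be sketched as a small script. This is only an illustration, not part of the repository: the function names `preflight` and `start_stack` are hypothetical, and the sketch assumes the Superset container is named `superset`, as in the command above.

```shell
#!/bin/sh
# Hypothetical helpers that wrap the start-up steps described above.

# Return 0 only when docker-compose is on PATH and
# docker-compose.yml is present in the current directory.
preflight() {
  if ! command -v docker-compose >/dev/null 2>&1; then
    echo "docker-compose not found on PATH" >&2
    return 1
  fi
  if [ ! -f docker-compose.yml ]; then
    echo "docker-compose.yml not found in $(pwd)" >&2
    return 1
  fi
}

# Start the services in the background, list them, then initialize Superset.
start_stack() {
  preflight || return 1
  docker-compose up -d || return 1
  docker-compose ps
  # -it is omitted here so the command also works without a terminal attached.
  # Assumption: the container is named "superset", as in the README command.
  docker exec superset superset init
}
```

Calling `start_stack` from the directory that contains docker-compose.yml runs the steps in order. If Superset is still starting when the last command runs, the init step may need to be repeated once the container is ready.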

Blog on How to Connect Superset to Dremio

Everything should be up and running. You can access the services using the following URLs:

How to Spin Down the Services

To stop and remove the running containers, use the following command:

docker-compose down

This stops all the services and removes their containers. Data in the host directories mounted into the containers (./nessie-data, ./minio-data, ./notebook-seed) persists.

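To also remove volumes, use the following command:

docker-compose down -v

The -v flag removes named volumes declared in docker-compose.yml and anonymous volumes attached to the containers. Host directories that are bind-mounted, such as the ./ folders listed above, are not deleted by this command; delete them manually to clear that data.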

Seed Data Locations

  • MinIO: Files placed in the ./minio-data folder on your host will be copied into the datalake bucket inside MinIO during startup.
  • Spark: The ./notebook-seed folder on your host is mounted to /workspace/seed-data inside the Spark container. You can place Jupyter notebooks or datasets in this folder to be available in the Spark environment.
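
As a rough sketch of staging seed files before startup, the helper below copies a file into the matching host folder. The function name `stage_seed_file` and its `minio`/`spark` arguments are hypothetical conveniences; only the two folder names come from this README.

```shell
#!/bin/sh
# Hypothetical helper: copy a seed file into the host folder for a service.
# Usage: stage_seed_file <file> <minio|spark>
stage_seed_file() {
  seed_src=$1
  seed_target=$2
  case "$seed_target" in
    minio) seed_dest=./minio-data ;;     # seeded into the datalake bucket at startup
    spark) seed_dest=./notebook-seed ;;  # mounted at /workspace/seed-data in Spark
    *)
      echo "unknown target: $seed_target (expected minio or spark)" >&2
      return 2
      ;;
  esac
  if [ ! -f "$seed_src" ]; then
    echo "seed file not found: $seed_src" >&2
    return 1
  fi
  # Create the folder if it does not exist yet, then copy the file into it.
  mkdir -p "$seed_dest" && cp "$seed_src" "$seed_dest/"
}
```

For example, `stage_seed_file sales.csv minio` would place a (hypothetical) sales.csv in ./minio-data. Because the MinIO copy happens during startup, files should be staged before running docker-compose up -d.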

Accessing the Services

Notes

Ensure that the appropriate ports (listed above) are open and not blocked by firewalls. The services will run in a shared Docker network called intro-network, allowing them to communicate with each other.

For persistent data storage, ensure the mounted directories (./nessie-data, ./minio-data, ./notebook-seed) exist on your local machine.
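
One way to make sure these directories exist is to create them before the first start. This is a minimal sketch, run from the directory containing docker-compose.yml; `mkdir -p` leaves existing directories and their contents untouched.

```shell
#!/bin/sh
# Create the host directories named in this README if they are missing.
for dir in ./nessie-data ./minio-data ./notebook-seed; do
  mkdir -p "$dir"
done
```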
