A minimal-as-possible Docker container running Apache Hive on Hadoop. Intended for non-production use cases like testing out Hive code or running integration tests.
- Install Docker.
- Make sure that you have at least a few GB of memory allocated to Docker.
- Clone this repository.
- From the repository root, build the Docker image and run a container:
```
docker build -t weehive .
docker run --rm -it \
  -v weehive_hadoop:/usr/local/hadoop/warehouse \
  -v weehive_meta:/usr/local/hadoop/metastore_db \
  weehive
```
You will be dropped into the Beeline shell. The `weehive_hadoop` and `weehive_meta` volume names can be changed to project-specific names if you want.
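Once in the Beeline shell, you can run a quick smoke test to confirm the warehouse is writable. The `smoke_test` table name below is just an illustrative choice:

```sql
-- create a throwaway table, write a row, read it back, clean up
CREATE TABLE smoke_test (id INT, name STRING);
INSERT INTO smoke_test VALUES (1, 'hello');
SELECT * FROM smoke_test;
DROP TABLE smoke_test;
```

Because the warehouse and metastore are on named volumes, tables created this way survive container restarts.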
- Run the server:

```
docker run --rm -it -p 10000:10000 \
  -v weehive_hadoop:/usr/local/hadoop/warehouse \
  -v weehive_meta:/usr/local/hadoop/metastore_db \
  weehive hiveserver2
```
- Wait ~90 seconds for Hive to fully start.
- Connect using the JDBC URL `jdbc:hive2://localhost:10000`. Example from an external `beeline`:

```
beeline -u jdbc:hive2://localhost:10000
```
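With the server running, Beeline's `-e` flag runs a statement non-interactively, which is handy for scripts or integration tests; this sketch assumes a `beeline` binary is available on the host:

```
# run a single statement against the containerized server and exit
beeline -u jdbc:hive2://localhost:10000 -e "SHOW DATABASES;"
```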
- Mount the data as a volume by adding `-v <sourcedir>:/usr/local/hadoop/data` to one of the `docker run` commands above.
- Follow the instructions to load the data.
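As a sketch of the loading step: if the mounted directory contained a CSV file (the `example.csv` file name and `people` table below are hypothetical), it could be loaded from Beeline like this:

```sql
-- column layout must match the CSV; adjust names and types as needed
CREATE TABLE people (id INT, name STRING)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
-- LOCAL here means the server's (container's) filesystem, where the volume is mounted
LOAD DATA LOCAL INPATH '/usr/local/hadoop/data/example.csv' INTO TABLE people;
SELECT COUNT(*) FROM people;
```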
```
docker build -t weehive:local .
docker run --rm -it weehive:local
```