Hadoop Docker

Build the image

To build the image from the Dockerfile in this repository, use one of the commands below.

Using the cache:

docker build -t cnp2 .

Without the cache: to refresh the config files in your image, build with the command below. It invalidates the cache for every line after ARG RECONFIG=1 in the Dockerfile, so each of those lines is executed again. This is useful when you have changed the config files.

docker build -t cnp2 --build-arg RECONFIG=$(date +%s) .
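The two build modes can be wrapped in a small helper. This is only a sketch and not part of the repository: the function name build_cmd and the IMAGE_TAG variable are hypothetical. It prints the command instead of running it, so the result can be checked first.

```shell
#!/bin/sh
# Hypothetical helper (not in the repository): print the docker build
# command for either mode. IMAGE_TAG is an assumed variable that
# defaults to cnp2, the tag used in this README.
build_cmd() {
    mode="$1"                      # "cached" or "fresh"
    tag="${IMAGE_TAG:-cnp2}"
    case "$mode" in
        cached)
            echo "docker build -t $tag ."
            ;;
        fresh)
            # A new timestamp changes the RECONFIG build argument, so the
            # Dockerfile lines after ARG RECONFIG=1 are rebuilt.
            echo "docker build -t $tag --build-arg RECONFIG=$(date +%s) ."
            ;;
        *)
            echo "usage: build_cmd cached|fresh" >&2
            return 1
            ;;
    esac
}
```

To run the printed command, pipe it to a shell, for example: build_cmd fresh | sh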

Run the image

Hadoop can run in two modes: single-node and multi-node. The image defaults to single-node; you can switch it to multi-node by setting environment variables.

Single-node

Just run this:

docker run -it cnp2

Multi-node

First, create the network:

docker network create --subnet=172.20.0.0/16 hadoop-cluster

Then start a container attached to that network:

docker run --net hadoop-cluster --ip 172.20.0.22 -it ubuntu bash

Finally, start the slave and master nodes:

docker run --net hadoop-cluster --ip 172.20.0.11 -it -e HADOOP_HOSTS="172.20.0.10 master,172.20.0.11 slave1" cnp2
docker run --net hadoop-cluster --ip 172.20.0.10 -it -e HADOOP_HOSTS="172.20.0.10 master,172.20.0.11 slave1" -e MY_ROLE="master" cnp2
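Both commands pass the same HADOOP_HOSTS value, a comma-separated list of "IP name" pairs. The sketch below builds that string for one master and a given number of slaves. It rests on assumptions: the function name hadoop_hosts is hypothetical, slaves are named slave1, slave2, ... with consecutive addresses after 172.20.0.10, and this README only shows the image running with a single slave.

```shell
#!/bin/sh
# Hypothetical helper: build a HADOOP_HOSTS string in the
# "IP name,IP name" format used above.
# Assumes slaves get consecutive addresses 172.20.0.11, 172.20.0.12, ...
hadoop_hosts() {
    n="$1"                          # number of slave nodes
    hosts="172.20.0.10 master"
    i=1
    while [ "$i" -le "$n" ]; do
        hosts="$hosts,172.20.0.$((10 + i)) slave$i"
        i=$((i + 1))
    done
    echo "$hosts"
}
```

For example, hadoop_hosts 1 reproduces the value used in the two commands above, so it could be passed as -e HADOOP_HOSTS="$(hadoop_hosts 1)".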

Run Map-Reduce

Compile

hadoop com.sun.tools.javac.Main WordCount.java

Create Jar file

jar cf wc.jar WordCount*.class

Run

hadoop jar wc.jar WordCount /user/sina/data /user/sina/output
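The three steps above can be chained so that a failed step stops the sequence. This is a minimal sketch: the names run, wordcount_job and DRY_RUN are hypothetical and not part of the repository. Note that Hadoop refuses to start a job whose output directory already exists, so pick a new output path or remove the old one first.

```shell
#!/bin/sh
# Hypothetical wrapper around the compile, jar, and run steps.
# With DRY_RUN=1 each command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

wordcount_job() {
    input="$1"                      # HDFS input path
    output="$2"                     # HDFS output path; must not exist yet
    run hadoop com.sun.tools.javac.Main WordCount.java &&
    run jar cf wc.jar WordCount*.class &&
    run hadoop jar wc.jar WordCount "$input" "$output"
}
```

Inside the container this would be called as wordcount_job /user/sina/data /user/sina/output.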

You can monitor progress in the MapReduce job monitoring UI (port 8088) and the HDFS monitoring UI (port 50070). The DataNode UI (port 50075) and the MapReduce JobHistory Server (port 19888) are also available.
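As a quick reference, the sketch below prints one URL per monitoring port listed above. It assumes the container's address is reachable from the machine where you open the browser; on other setups you may need to publish the ports with docker run's -p option. The function name hadoop_ui_urls and its default host (the master address from the multi-node example) are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the web UI addresses for a given host.
# The default host is the master IP from the multi-node example.
hadoop_ui_urls() {
    host="${1:-172.20.0.10}"
    echo "MapReduce Job Monitoring:      http://$host:8088"
    echo "HDFS Monitoring:               http://$host:50070"
    echo "Datanode:                      http://$host:50075"
    echo "MapReduce JobHistory Server:   http://$host:19888"
}
```

Call it with another host name or address, for example hadoop_ui_urls localhost, if the ports are published on the local machine.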

See Output

hdfs dfs -ls /user/sina/output
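Listing the directory only shows the result files. To look at the counts themselves, the files can be streamed with hdfs dfs -cat and sorted. The sketch below assumes the job writes one "word, tab, count" pair per line, which is the usual text output of a word count job but is not confirmed here for this WordCount.java. The function name top_words is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "word<TAB>count" lines on stdin and print
# the n lines with the highest counts (default 10).
top_words() {
    n="${1:-10}"
    # Sort numerically, descending, on the second tab-separated field.
    sort -t "$(printf '\t')" -k2,2nr | head -n "$n"
}

# Example use inside the container (the part-* file naming is an assumption):
#   hdfs dfs -cat /user/sina/output/part-* | top_words 5
```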

For a list of available File System Shell commands, see the Hadoop File System Shell documentation.
