This repository has been archived by the owner on May 9, 2022. It is now read-only.

Cherry pick from amirhhz's branch. Included other fixes and CDH 4.6 changes.
analytically committed Mar 4, 2014
1 parent 441659c commit a01accb
Showing 23 changed files with 88 additions and 41 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -1,7 +1,7 @@
 Hadoop Ansible Playbook [![Build Status](https://travis-ci.org/analytically/hadoop-ansible.png)](https://travis-ci.org/analytically/hadoop-ansible) [![Bitdeli Badge](https://d2weczhvl823v0.cloudfront.net/analytically/hadoop-ansible/trend.png)](https://bitdeli.com/free "Bitdeli Badge")
 =======================

-[Ansible](http://www.ansibleworks.com/) playbook that installs a CDH 4.5 [Hadoop](http://hadoop.apache.org/)
+[Ansible](http://www.ansibleworks.com/) playbook that installs a CDH 4.6.0 [Hadoop](http://hadoop.apache.org/)
 cluster (running on Java 7, supported from [CDH 4.4](http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Release-Notes/Whats_New_in_4-4.html)),
 with [HBase](http://hbase.apache.org/), Hive, [Presto](http://prestodb.io/) for analytics, and [Ganglia](http://ganglia.sourceforge.net/),
 [Smokeping](http://oss.oetiker.ch/smokeping/), [Fluentd](http://fluentd.org/), [Elasticsearch](http://www.elasticsearch.org/)
@@ -11,7 +11,7 @@ Hire/Follow [@analytically](http://twitter.com/analytically). Browse the CI [bui

 ### Requirements

-- [Ansible](http://www.ansibleworks.com/) 1.4 or later (`pip install ansible`)
+- [Ansible](http://www.ansibleworks.com/) 1.5 or later (`pip install ansible`)
 - 6 + 1 Ubuntu 12.04 LTS, 13.04 or 13.10 hosts - see [ubuntu-netboot-tftp](https://github.com/analytically/ubuntu-netboot-tftp) if you need automated server installation
 - [Mandrill](http://mandrill.com/) username and API key for sending email notifications
 - `ansibler` user in sudo group without sudo password prompt (see Bootstrapping section below)
@@ -138,7 +138,7 @@ Instructions on how to test the performance of your CDH4 cluster.

 ##### DFSIO

-- `hadoop jar hadoop-mapreduce-client-jobclient-2.0.0-cdh4.5.0-tests.jar TestDFSIO -write`
+- `hadoop jar hadoop-mapreduce-client-jobclient-2.0.0-cdh4.6.0-tests.jar TestDFSIO -write`

 ### Bootstrapping

5 changes: 4 additions & 1 deletion ansible.cfg
@@ -1,2 +1,5 @@
 [defaults]
-timeout = 20
+timeout = 20
+
+[ssh_connection]
+pipelining=True
2 changes: 1 addition & 1 deletion bootstrap/bootstrap.yml
@@ -19,7 +19,7 @@
 hostname: name={{ inventory_hostname }}

 - name: create user 'ansibler'
-user: name=ansibler groups=sudo generate_ssh_key=yes shell=/bin/bash
+user: name=ansibler groups=sudo shell=/bin/bash

 - name: add 'ansibler' RSA SSH key
 authorized_key: user=ansibler key="{{ authorized_rsa_key }}"
2 changes: 1 addition & 1 deletion group_vars/all
@@ -11,4 +11,4 @@ mtu: 9216
 # mandrill_api_key: your_api_key

 # Choose postgresql or mysql as hive metastore
-metastore: postgresql
+hive_metastore: postgresql
2 changes: 1 addition & 1 deletion hosts
@@ -25,7 +25,7 @@ hslave[01:04]
 datanodes

 [historyserver]
-hslave01
+hmaster01

 # HBase Nodes
 # ===========
12 changes: 12 additions & 0 deletions roles/cdh_hadoop_config/handlers/main.yml
@@ -12,3 +12,15 @@
 - name: restart hadoop-hdfs-journalnode
 service: name=hadoop-hdfs-journalnode state=restarted
 ignore_errors: yes
+
+- name: restart hadoop-mapreduce-historyserver
+service: name=hadoop-mapreduce-historyserver state=restarted
+ignore_errors: yes
+
+- name: restart hadoop-yarn-nodemanager
+service: name=hadoop-yarn-nodemanager state=restarted
+ignore_errors: yes
+
+- name: restart hadoop-yarn-resourcemanager
+service: name=hadoop-yarn-resourcemanager state=restarted
+ignore_errors: yes
3 changes: 3 additions & 0 deletions roles/cdh_hadoop_config/tasks/main.yml
@@ -23,6 +23,9 @@
 - restart hadoop-hdfs-namenode
 - restart hadoop-hdfs-journalnode
 - restart hadoop-hdfs-datanode
+- restart hadoop-mapreduce-historyserver
+- restart hadoop-yarn-nodemanager
+- restart hadoop-yarn-resourcemanager
 tags:
 - hadoop
 - configuration
11 changes: 11 additions & 0 deletions roles/cdh_hadoop_config/templates/mapred-site.xml
@@ -9,6 +9,11 @@
 <value>yarn</value>
 </property>

+<property>
+<name>yarn.app.mapreduce.am.staging-dir</name>
+<value>/user</value>
+</property>
+
 <property>
 <name>mapreduce.jobhistory.address</name>
 <value>{{ hostvars[groups['historyserver'][0]]['ansible_fqdn'] }}:10020</value>
@@ -73,4 +78,10 @@
 <name>mapreduce.output.fileoutputformat.compress.type</name>
 <value>BLOCK</value>
 </property>
+
+<property>
+<name>mapreduce.map.java.opts</name>
+<value>-Xmx1024m</value>
+<description>Higher Java heap for mapper to work.</description>
+</property>
 </configuration>
20 changes: 16 additions & 4 deletions roles/cdh_hadoop_config/templates/yarn-site.xml
@@ -4,6 +4,22 @@
 <!-- {{ ansible_managed }} -->

 <configuration>
+<!-- CPU Cores -->
+<property>
+<name>yarn.nodemanager.resource.cpu-vcores</name>
+<value>{{ ansible_processor_count * ansible_processor_cores * ansible_processor_threads_per_core }}</value>
+</property>
+
+<!-- Memory limits -->
+<property>
+<name>yarn.scheduler.maximum-allocation-mb</name>
+<value>{{ ansible_memtotal_mb - 1024 }}</value>
+</property>
+<property>
+<name>yarn.nodemanager.resource.memory-mb</name>
+<value>{{ ansible_memtotal_mb - 1024 }}</value>
+</property>
+
 <property>
 <name>yarn.resourcemanager.resource-tracker.address</name>
 <value>{{ hostvars[groups['resourcemanager'][0]]['ansible_fqdn'] }}:8031</value>
@@ -56,10 +72,6 @@
 <name>yarn.nodemanager.remote-app-log-dir</name>
 <value>/var/log/hadoop-yarn/apps</value>
 </property>
-<property>
-<name>yarn.app.mapreduce.am.staging-dir</name>
-<value>/user</value>
-</property>

 <!-- Fair scheduling is a method of assigning resources to jobs such that all jobs get, on average, an equal
 share of resources over time. When there is a single job running, that job uses the entire cluster. -->
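
Note on the yarn-site.xml change: the new properties are computed from per-host Ansible facts rather than hard-coded. The arithmetic can be sketched in Python as below. This is only an illustration: the function name `yarn_resources` and the sample fact values are hypothetical, and the reason for the 1024 MB subtraction (presumably headroom for the OS and other daemons) is an assumption, since the commit does not state it.

```python
# Sketch of the arithmetic performed by the yarn-site.xml template.
# The dictionary keys mirror the Ansible fact names used in the template;
# the function name and the sample values are hypothetical.

RESERVED_MB = 1024  # subtracted by the template; purpose assumed, not stated


def yarn_resources(facts):
    """Return the three values the template renders for one host."""
    vcores = (
        facts["ansible_processor_count"]
        * facts["ansible_processor_cores"]
        * facts["ansible_processor_threads_per_core"]
    )
    memory_mb = facts["ansible_memtotal_mb"] - RESERVED_MB
    return {
        "yarn.nodemanager.resource.cpu-vcores": vcores,
        "yarn.scheduler.maximum-allocation-mb": memory_mb,
        "yarn.nodemanager.resource.memory-mb": memory_mb,
    }


# Hypothetical host: 2 sockets, 4 cores per socket, 2 threads per core, 16000 MB RAM.
example_facts = {
    "ansible_processor_count": 2,
    "ansible_processor_cores": 4,
    "ansible_processor_threads_per_core": 2,
    "ansible_memtotal_mb": 16000,
}

for name, value in yarn_resources(example_facts).items():
    print(name, "=", value)
```

One consequence worth checking before applying this to small test machines: a host reporting 1024 MB or less would render a zero or negative memory value.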
6 changes: 3 additions & 3 deletions roles/cdh_hive_config/tasks/main.yml
@@ -25,11 +25,11 @@

 - name: soft link postgresql-jdbc4.jar
 shell: creates=/usr/lib/hive/lib/postgresql-jdbc4.jar ln -s /usr/share/java/postgresql-jdbc4.jar /usr/lib/hive/lib/postgresql-jdbc4.jar
-when: metastore == "postgresql"
+when: hive_metastore == "postgresql"
 tags: hive
-
+
 - name: symbolically link libmysql-java.jar
 shell: creates=/usr/lib/hive/lib/libmysql-java.jar
 ln -s /usr/share/java/mysql-connector-java.jar /usr/lib/hive/lib/libmysql-java.jar
-when: metastore == "mysql"
+when: hive_metastore == "mysql"
 tags: hive
9 changes: 5 additions & 4 deletions roles/cdh_hive_config/templates/hive-site.xml
@@ -15,25 +15,26 @@
 See the License for the specific language governing permissions and
 limitations under the License.
 -->
+<!-- {{ ansible_managed }} -->
 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 <configuration>

 <property>
 <name>javax.jdo.option.ConnectionURL</name>
 {% if hostvars[groups["hive_metastore"][0]]["ansible_fqdn"] == ansible_fqdn %}
-<value>jdbc:{{ metastore }}://localhost/metastore</value>
+<value>jdbc:{{ hive_metastore }}://localhost/metastore</value>
 {% else %}
-<value>jdbc:{{ metastore }}://{{ hostvars[groups["hive_metastore"][0]]["ansible_fqdn"] }}/metastore</value>
+<value>jdbc:{{ hive_metastore }}://{{ hostvars[groups["hive_metastore"][0]]["ansible_fqdn"] }}/metastore</value>
 {% endif %}
 </property>

 <property>
 <name>javax.jdo.option.ConnectionDriverName</name>
-{% if metastore == postgresql %}
+{% if hive_metastore == "postgresql" %}
 <value>org.postgresql.Driver</value>
 {% endif %}
-{% if metastore == mysql %}
+{% if hive_metastore == "mysql" %}
 <value>com.mysql.jdbc.Driver</value>
 {% endif %}
 </property>
2 changes: 1 addition & 1 deletion roles/cdh_zookeeper_server/tasks/main.yml
@@ -4,7 +4,7 @@
 - name: install zookeeper-server via apt
 apt: name={{ item }} force=yes update-cache=yes
 with_items:
-- zookeeper=3.4.5+24-1.cdh4.5.0.p0.23~precise-cdh4.5.0
+- zookeeper=3.4.5+25-1.cdh4.6.0.p0.12~precise-cdh4.6.0
 - zookeeper-server
 tags: zookeeper

1 change: 1 addition & 0 deletions roles/common/tasks/main.yml
@@ -49,6 +49,7 @@

 - name: create the hosts file for all machines
 template: backup=yes src=hosts dest=/etc/hosts
+tags: configuration

 - name: configure landscape-sysinfo to hide link
 template: src=client.conf dest=/etc/landscape/client.conf owner=root group=root mode=0644
11 changes: 3 additions & 8 deletions roles/mysql_server/tasks/main.yml
@@ -2,14 +2,10 @@
 # file: roles/mysql_server/tasks/main.yml

 - name: predefine mysql root passwd
-shell: executable=/bin/bash
-create=/root/.mysql-root-passwd-configured
-debconf-set-selections <<< "mysql-server-5.5 mysql-server/root_password password hive_{{ site_name }}"
+shell: executable=/bin/bash debconf-set-selections <<< "mysql-server-5.5 mysql-server/root_password password hive_{{ site_name }}"

 - name: predefine mysql root passwd confirm
-shell: executable=/bin/bash
-create=/root/.mysql-root-passwd-again-configured
-debconf-set-selections <<< "mysql-server-5.5 mysql-server/root_password_again password hive_{{ site_name }}"
+shell: executable=/bin/bash debconf-set-selections <<< "mysql-server-5.5 mysql-server/root_password_again password hive_{{ site_name }}"

 - name: install MySQL and related-package via apt
 apt: name={{ item }}
@@ -20,8 +16,7 @@
 - python-mysqldb

 - name: service mysql restart
-service: name=mysql
-state=restarted
+service: name=mysql state=restarted

 - name: create mysql db 'metastore'
 mysql_db: name=metastore
12 changes: 9 additions & 3 deletions roles/presto_common/tasks/main.yml
@@ -22,20 +22,26 @@
 with_items:
 - node.properties
 - jvm.config
-tags: presto
+tags:
+- configuration
+- presto

 - name: configure presto hive catalog in /usr/lib/presto/etc/catalog
 template: src={{ item }} dest=/usr/lib/presto/etc/catalog/{{ item }} owner=root group=root mode=0644
 with_items:
 - jmx.properties
 - hive.properties
 - tpch.properties
-tags: presto
+tags:
+- configuration
+- presto

 - name: install presto command line client
 shell: chdir=/usr/bin curl -o presto http://central.maven.org/maven2/com/facebook/presto/presto-cli/0.60/presto-cli-0.60-executable.jar && chmod +x presto
 tags: presto

 - name: install upstart config for presto
 template: src=upstart.conf dest=/etc/init/presto.conf owner=root group=root mode=0644
-tags: presto
+tags:
+- configuration
+- presto
2 changes: 1 addition & 1 deletion roles/presto_common/templates/hive.properties
@@ -1,2 +1,2 @@
 connector.name=hive-cdh4
-hive.metastore.uri=thrift://{{ hostvars[groups["hive_metastore"][0]]["ansible_hostname"] }}:9083
+hive.metastore.uri=thrift://{{ hostvars[groups["hive_metastore"][0]]["ansible_fqdn"] }}:9083
1 change: 0 additions & 1 deletion roles/presto_common/templates/upstart.conf
@@ -1,5 +1,4 @@
 description "Facebook Presto"
-author "Mathias Bogaert"

 start on started hive-metastore
 stop on stopped hive-metastore
4 changes: 3 additions & 1 deletion roles/presto_coordinator/tasks/main.yml
@@ -7,4 +7,6 @@
 - config.properties
 notify:
 - restart presto
-tags: presto
+tags:
+- configuration
+- presto
4 changes: 2 additions & 2 deletions roles/presto_coordinator/templates/config.properties
@@ -1,8 +1,8 @@
 coordinator=true
 datasources=jmx
-http-server.http.port=8080
+http-server.http.port=8081
 presto-metastore.db.type=h2
 presto-metastore.db.filename=var/db/MetaStore
 task.max-memory=1GB
 discovery-server.enabled=true
-discovery.uri=http://{{ hostvars[groups["presto_coordinators"][0]]["ansible_hostname"] }}:8080
+discovery.uri=http://{{ hostvars[groups["presto_coordinators"][0]]["ansible_fqdn"] }}:8081
4 changes: 3 additions & 1 deletion roles/presto_worker/tasks/main.yml
@@ -7,4 +7,6 @@
 - config.properties
 notify:
 - restart presto
-tags: presto
+tags:
+- configuration
+- presto
4 changes: 2 additions & 2 deletions roles/presto_worker/templates/config.properties
@@ -1,7 +1,7 @@
 coordinator=false
 datasources=jmx,hive,tpch
-http-server.http.port=8080
+http-server.http.port=8081
 presto-metastore.db.type=h2
 presto-metastore.db.filename=var/db/MetaStore
 task.max-memory=1GB
-discovery.uri=http://{{ hostvars[groups["presto_coordinators"][0]]["ansible_hostname"] }}:8080
+discovery.uri=http://{{ hostvars[groups["presto_coordinators"][0]]["ansible_fqdn"] }}:8081
4 changes: 2 additions & 2 deletions site.sh
@@ -9,8 +9,8 @@ export ANSIBLE_SSH_ARGS="-o ForwardAgent=yes"
 if [ $# -gt 0 ]
 then
 echo 'Running ansible-playbook -i hosts site.yml --tags' "$*"
-ansible-playbook -i hosts --extra-vars "accelerate=true" site.yml --tags "$*"
+ansible-playbook -i hosts site.yml --tags "$*"
 else
 echo 'Running ansible-playbook -i hosts site.yml'
-ansible-playbook -i hosts --extra-vars "accelerate=true" site.yml
+ansible-playbook -i hosts site.yml
 fi
2 changes: 1 addition & 1 deletion site.yml
@@ -199,7 +199,7 @@
 tags:
 - hadoop
 - name: make sure the /user/history directory has the correct owner
-shell: creates=/data/dfs/.historydirchowned sleep 5 && hadoop fs -chown yarn /user/history && touch /data/dfs/.historydirchowned
+shell: creates=/data/dfs/.historydirchowned sleep 5 && hadoop fs -chown yarn:mapred /user/history && touch /data/dfs/.historydirchowned
 tags:
 - hadoop
 - name: create the /hbase directory on the cluster
