
Usage Guide

Chaffelson edited this page Sep 24, 2018 · 5 revisions

Overview

This thin Orchestrator (whoville/mayor.py) uses a bootstrap file (Profile.yml) and a set of opinionated defaults to procedurally deploy the Hortonworks Platform. It holds the current state of the process in a singleton configuration object (horton) and steps through a series of create-if-not-exist stages to start from a simple set of credentials and complete full demo deployments.
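A hypothetical minimal Profile.yml sketch is shown below. Only the keys referenced elsewhere on this page (resources, namespace, objectstor) are taken from the text; their shapes and values here are assumptions for illustration — check the example Profile bundled with the project for the authoritative schema.

```yaml
# Illustrative sketch only -- key shapes and values are assumptions
namespace: mydemo-          # prefix used to select (and purge) this deployment's components
resources:
  - ./resources             # resource root containing definition folders (e.g. hdp3/)
objectstor: my-demo-bucket  # cloud storage referenced by cloudstor settings
```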

The Orchestrator follows these high-level steps (summarised from the Python code):

step_1_init_service
Validate Profile File
Load Resources from Github or Local files, as defined in Profile, into horton.resources
Parse Yaml Definition files, as found in Resources, into horton.defs
step_2_init_infra
Create-If-Not-Exists Cloudbreak service in the target Hosting Platform, as defined in Profile
Connect and Authenticate to Cloudbreak, presented in horton.cbd
Validate and Create supplied Deployment Credential, presented in horton.cred
step_3_sequencing
Examine each Definition in horton.defs; if a Priority is set, load the defined Sequence of deployment steps into horton.seq, keyed by Priority
step_4_build
Sort horton.seq by Priority key, begin processing
For each step in the current Sequence, determine and execute the defined action
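The steps above can be sketched roughly as follows. The Horton class and the step bodies here are illustrative stand-ins for the real logic in whoville/mayor.py, not its actual API:

```python
# Sketch of the four-stage, create-if-not-exist flow (illustrative only).

class Horton:
    """Singleton-style state object holding progress between stages."""
    def __init__(self):
        self.resources = {}  # raw files loaded from GitHub or disk
        self.defs = {}       # parsed YAML definitions
        self.cbd = None      # Cloudbreak session
        self.cred = None     # deployment credential
        self.seq = {}        # priority -> sequence of deployment steps

horton = Horton()

def step_1_init_service(profile):
    # load resources named in the Profile, then parse them into definitions
    horton.resources = profile.get("resources", {})
    horton.defs = dict(horton.resources)

def step_2_init_infra():
    # create-if-not-exists: only build what is missing
    if horton.cbd is None:
        horton.cbd = "cloudbreak-session"
    if horton.cred is None:
        horton.cred = "deployment-credential"

def step_3_sequencing():
    # only definitions with a priority are queued for deployment
    for name, definition in horton.defs.items():
        if "priority" in definition:
            horton.seq[definition["priority"]] = definition.get("seq", [])

def step_4_build(actions):
    # process sequences in priority order; each step is a one-key mapping
    for priority in sorted(horton.seq):
        for step in horton.seq[priority]:
            (verb, args), = step.items()
            actions[verb](*args)
```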

Actions and Definitions are described in more detail further down this document

Resource Definitions

Infrastructure
A Cloudbreak 'Stack' consisting of one or more servers deployed via Ambari.
Services
A combination of one or more Infrastructure components, usually with additional steps and controls to bind them together
Both types of resource definition share a common Yaml format, with the selection of included fields dictating the behavior.
Under a resource root defined in your Profile (profile:resources), the Orchestrator expects to find one or more folders named for the Service Definition they contain (e.g. hdp3), and each of those folders must contain a YAML file of the same name defining the service (e.g. hdp3/hdp3.yaml).
The folder should also contain all the additional blueprints, scripts, etc. that the Definition calls for (e.g. hdp3/blueprint.json, hdp3/deploy.sh):
/resources
/resources/hdp3
/resources/hdp3/hdp3.yaml
/resources/hdp3/blueprint.json
/resources/hdp3/recipe.sh
/resources/hdp3/... etc.
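The naming convention above can be checked mechanically. A hypothetical helper (not whoville's actual loader) that discovers definition folders following the convention:

```python
import os

def find_definitions(resource_root):
    # A folder counts as a definition only if it contains a same-named
    # YAML file (e.g. hdp3/hdp3.yaml), per the convention described above.
    found = {}
    for name in sorted(os.listdir(resource_root)):
        folder = os.path.join(resource_root, name)
        spec = os.path.join(folder, name + ".yaml")
        if os.path.isdir(folder) and os.path.isfile(spec):
            found[name] = spec
    return found
```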

Infrastructure Definition

The Infrastructure Definition declares the parameters to be passed to Cloudbreak in order to create a particular Stack.
It is deliberately structured to closely mirror the Cloudbreak JSON definition, making both parsing and overriding settings straightforward.
Note that while many fields and settings are supported, some are not. Please raise an Issue if you need something added or changed.
Here is an example:
deploywait: 1800
purge: false
input:
  yarn.vcores: 6
infra:
  disktypes: [gp2, standard]
  os: redhat7
  ambarirepo:
    version: 2.7.0.0
    baseUrl: http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.0.0
    gpgKeyUrl: http://public-repo-1.hortonworks.com/ambari/centos7/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
  stackrepo:
    ver: 3.0.0.0-1634
    url: http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.0.0.0/HDP-3.0.0.0-1634.xml
  cloudstor:
  - value: /apps/hive/warehouse
    propfile: hive-site
    propname: hive.metastore.warehouse.dir
blueprint:
  name: blueprint.json
recipe:
- name: remount-storage.sh
  typ: PRE_AMBARI_START
group:
- name: master
  recipe: [remount-storage]
  type: GATEWAY
  machine: m5.2xlarge
  nodes: 1
mpack:
- name: hdf320
  desc: HDF 3.2.0
  url: http://public-repo-1.hortonworks.com/HDF/centos7/3.x/updates/3.2.0.0/tars/hdf_ambari_mp/hdf-ambari-mpack-3.2.0.0-520.tar.gz
  purge_on_install: false
deploywait
How long, in seconds, the Orchestrator should wait before timing out on deploying this definition
purge
Controls whether any existing components should be deleted during a deployment run
Relies on the Namespace in your Profile to select the correct set of components
input
Any key:value inputs to be passed into Cloudbreak for the Ambari blueprint deployment process
infra
Parameters for Cloudbreak's infrastructure selection process
disktypes: List of Strings of disk types to select from on the target Hosting Platform
os: String of the OS type to select from the available images
ambarirepo / stackrepo: overrides for the defaults
cloudstor: params to be used when configuring Cloudbreak to mount Cloud Storage for a particular service, matching your Profile objectstor settings
blueprint
The name of the file containing the Ambari Blueprint for Cloudbreak to use
recipe
List of Cloudbreak deployment recipes and the stage they should be run at
group
List of infrastructure groups matching your Blueprint
mpack
List of mpack definitions to pass into Cloudbreak for installing on the Ambari node
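A definition parsed from YAML can be sanity-checked against the fields described above before it is handed to Cloudbreak. The helper below is illustrative, not part of whoville; which fields are required, and the assumption that a group references a recipe by its file name minus the extension (remount-storage.sh → remount-storage), are inferred from the example definition:

```python
def check_infra_def(d):
    # Illustrative validation of an Infrastructure Definition dict
    # (field names follow the example above; requiredness is assumed).
    problems = []
    if not isinstance(d.get("deploywait"), int):
        problems.append("deploywait should be a timeout in seconds")
    if "name" not in d.get("blueprint", {}):
        problems.append("blueprint.name (the blueprint file) is required")
    for grp in d.get("group", []):
        if not {"name", "type", "machine", "nodes"} <= set(grp):
            problems.append("each group needs name, type, machine and nodes")
    # every recipe a group references should be declared in the recipe list
    declared = {r["name"].rsplit(".", 1)[0] for r in d.get("recipe", [])}
    for grp in d.get("group", []):
        for r in grp.get("recipe", []):
            if r not in declared:
                problems.append(
                    "group %s references undeclared recipe %s" % (grp.get("name"), r))
    return problems
```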

Service Definition

The Service Definition declares the controls and steps required to chain together one or more Infrastructure Definitions for a deployment.
Here is an example:
purge: false
priority: 1
seq:
- openport: [-1, 1, 65534, '10.20.30.40/32']
- openport: [tcp, 22, 22, '0.0.0.0/0']
- prepdeps: [hdp3, hdp3]
- prepspec: [hdp3, hdp3]
- deploy: [hdp3]
- wait: [hdp3, hdp3, event_type, BILLING_STARTED]
- writecache: [hdp3, public_ip, PUBLICIP]
- replace: [demo, deploy.sh, PUBLICIP]
purge
Controls whether any existing components should be deleted during a deployment run
Relies on the Namespace in your Profile to select the correct set of components
Note that this will also delete any reused components from other definitions with the same name
priority
If this field is populated, the Orchestrator will attempt to deploy this definition
If multiple definitions are present, the Orchestrator will attempt to alpha-sort the priority field to determine the desired sequence, so using 1, 2, 3, 4, etc. is recommended
seq
The list of steps under seq will be executed in order written until the process completes or an error is encountered
The steps should be constructed such that the process can be run repeatedly without adverse effects
The list supports a limited number of primitive instructions as defined in the Orchestrator

Sequence Primitives

Sequence primitives may be used to declare the action in each step of the seq field of a definition yaml

openport

[Protocol, StartPort, EndPort, CIDR]

Protocol: tcp, udp, or -1 (all); e.g. tcp
StartPort: any port 1-65534; e.g. 443
EndPort: any port 1-65534; e.g. 443
CIDR: any IP / mask; e.g. 192.168.0.1/16
prepdeps

[DefinitionName, TargetName]

the Name Key of the Definition; e.g. hdp3
the Cluster Name to append to the Profile Namespace; e.g. my-hdp3-cluster
prepspec

[DefinitionName, TargetName]

the Name Key of the Definition; e.g. hdp3
the Cluster Name to append to the Profile Namespace; e.g. my-hdp3-cluster
deploy

[TargetName, ...]

List of TargetNames to deploy in this Step; e.g. [hdp265-a, hdp265-b]
wait

[DefinitionName, TargetName, EventField, EventString]

the Name Key of the Definition; e.g. hdp3
the Cluster Name to append to the Profile Namespace; e.g. my-hdp3-cluster
the String of the Event Field to filter on; e.g. event_type
the String in the Event Field you are waiting for; e.g. BILLING_STARTED
writecache

[TargetName, Field, CacheKey]

The Name of the Cluster to retrieve the value from; e.g. my-hdp3-cluster
The name of the supported Field to retrieve from; e.g. public_ip
The Key to use when writing the value to Horton's Cache; e.g. DPSPUBLICIP
replace

[DefinitionName, ResourceFile, CacheKey]

The Name Key of the Definition to modify; e.g. hdp3
The filename of the Resource within the Named Definition to modify; e.g. deploy.sh
The Key of the value to replace in the file, should match the string in the file to be replaced; e.g. PUBLICIP
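How writecache and replace cooperate can be sketched as follows: writecache stores a runtime value (e.g. a cluster's public IP) under a key, and replace substitutes every occurrence of that key string inside a resource file. The cache and helpers below are illustrative stand-ins, not whoville's actual internals:

```python
cache = {}

def writecache(target_name, field, cache_key, clusters):
    # look up a supported field (e.g. public_ip) on a deployed cluster
    # and store it in the cache under the given key
    cache[cache_key] = clusters[target_name][field]

def replace(resources, def_name, resource_file, cache_key):
    # swap every occurrence of the key string for its cached value
    text = resources[def_name][resource_file]
    resources[def_name][resource_file] = text.replace(cache_key, cache[cache_key])
```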