DSOS-2093: cloudwatch-agent plus some general improvements (#364)
* add OL8.5 group vars

* simplify cloudwatch agent installation

* Fix for RHEL6

* Update README

* readme

* don't install cloudwatch agent in startup

* update amazon-cloudwatch-agent

* fix
drobinson-moj authored Oct 16, 2023
1 parent 98188f7 commit 88d682c
Showing 16 changed files with 148 additions and 105 deletions.
31 changes: 31 additions & 0 deletions ansible/README.md
@@ -13,6 +13,16 @@ Use `user_data` to provide a cloud init or shell script which runs
ansible. See nomis ansible template scripts in [modernisation-platform-environments](https://github.com/ministryofjustice/modernisation-platform-environments/tree/main/terraform/environments/nomis/templates/) for an example. This relies on
tags to identify which roles to run.

## Running ansible locally on a linux EC2 instance

The `ansible-script` role installs a wrapper script ansible.sh in the /root/ directory.
Use this to run ansible within a virtual environment pulling in appropriate group_vars.
For example:

```
/root/ansible.sh site.yml --tags ec2patch
```

## Installing on Mac

Ensure you have python3.6+ installed on your local mac.
@@ -84,6 +94,8 @@ A generic [site.yml](/ansible/site.yml) is provided with dynamic inventories
under [hosts/](/ansible/hosts/) folder. This creates groups based on the following
tags:

- ami
- os-type
- environment-name
- server-type

@@ -133,3 +145,22 @@ ansible-playbook site.yml -e "role=amazon-cloudwatch-agent"
# Run locally (the comma after localhost is important)
ansible-playbook site.yml --connection=local -i localhost, -e "target=localhost" -e "@group_vars/server_type_nomis_db.yml" --check
```

## Gotchas for RHEL6

The ansible.builtin.yum task misbehaves when running from local MacOS on a RHEL6 server.
Run ansible locally on the server instead. Example error message when running on MacOS:

```
TASK [amazon-cloudwatch-agent : Install amazon-cloudwatch-agent] **********************************************************************************************
fatal: [xxx]: FAILED! => {"changed": false, "msg": "ansible-core requires a minimum of Python2 version 2.7 or Python3 version 3.5. Current version: 2.6.6 (r266:84292, May 31 2023, 09:01:24) [GCC 4.4.7 20120313 (Red Hat 4.4.7-23)]"}
```

Recent updates to `galaxy.ansible.com` have broken collection installation on RHEL6.
Use requirements.rhel6.yml instead. Example error:

```
# [WARNING]: Skipping Galaxy server https://galaxy.ansible.com/api/. Got an unexpected error when getting available versions of collection amazon.aws:
# '/api/v3/plugin/ansible/content/published/collections/index/amazon/aws/versions/'
# ERROR! Unexpected Exception, this is probably a bug: '/api/v3/plugin/ansible/content/published/collections/index/amazon/aws/versions/'
```
12 changes: 12 additions & 0 deletions ansible/group_vars/server_type_base_ol85.yml
@@ -0,0 +1,12 @@
---
ansible_python_interpreter: python3.9

server_type_roles_list:
- autoscale-group-hooks
- get-ec2-facts
- set-ec2-hostname
- domain-search
- ansible-script
- autoscale-group-hooks-state

roles_list: "{{ (ami_roles_list | default([]) | difference(server_type_roles_list | default([]))) + (server_type_roles_list | default([])) }}"
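The `roles_list` expression above can be read as "AMI roles that the server type does not already list, followed by the server-type roles". A minimal Python sketch of that merge follows. The function name `merge_roles` and the AMI role `packages` are hypothetical, and the sketch assumes the `difference` filter keeps the order of the first list and drops repeats, which may not hold on every Ansible version.

```python
def merge_roles(ami_roles, server_type_roles):
    """Return AMI-only roles followed by the server-type roles."""
    ami_only = []
    for role in ami_roles:
        # Keep AMI roles the server type does not list; skip repeats.
        if role not in server_type_roles and role not in ami_only:
            ami_only.append(role)
    return ami_only + list(server_type_roles)


# Hypothetical AMI role list; the server-type list is the one from this file.
ami_roles_list = ["get-ec2-facts", "packages"]
server_type_roles_list = [
    "autoscale-group-hooks",
    "get-ec2-facts",
    "set-ec2-hostname",
    "domain-search",
    "ansible-script",
    "autoscale-group-hooks-state",
]

roles_list = merge_roles(ami_roles_list, server_type_roles_list)
print(roles_list)
```

Under these assumptions a role named in both lists runs once, in the position the server type gives it, and AMI-only roles run first.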
1 change: 0 additions & 1 deletion ansible/group_vars/server_type_base_rhel610.yml
@@ -5,7 +5,6 @@ server_type_roles_list:
- autoscale-group-hooks
- set-ec2-hostname
- domain-search
- amazon-cloudwatch-agent
- autoscale-group-hooks-state
- ansible-script

1 change: 0 additions & 1 deletion ansible/group_vars/server_type_base_rhel79.yml
@@ -5,7 +5,6 @@ server_type_roles_list:
- autoscale-group-hooks
- set-ec2-hostname
- domain-search
- amazon-cloudwatch-agent
- ansible-script
- autoscale-group-hooks-state

1 change: 0 additions & 1 deletion ansible/group_vars/server_type_base_rhel85.yml
@@ -6,7 +6,6 @@ server_type_roles_list:
- get-ec2-facts
- set-ec2-hostname
- domain-search
- amazon-cloudwatch-agent
- ansible-script
- autoscale-group-hooks-state

27 changes: 27 additions & 0 deletions ansible/requirements.rhel6.yml
@@ -0,0 +1,27 @@
# https://galaxy.ansible.com broken on RHEL6 for some collections with following error:
# [WARNING]: Skipping Galaxy server https://galaxy.ansible.com/api/. Got an unexpected error when getting available versions of collection amazon.aws:
# '/api/v3/plugin/ansible/content/published/collections/index/amazon/aws/versions/'
# ERROR! Unexpected Exception, this is probably a bug: '/api/v3/plugin/ansible/content/published/collections/index/amazon/aws/versions/'

# from Galaxy
roles:

collections:
- name: community.general
source: https://old-galaxy.ansible.com
version: 6.3.0
- name: amazon.aws
source: https://old-galaxy.ansible.com
version: 3.2.0
- name: community.aws
source: https://old-galaxy.ansible.com
version: 3.2.1
- name: ansible.posix
source: https://galaxy.ansible.com
version: 1.4.0
- name: ansible.windows
source: https://galaxy.ansible.com
version: 1.13.0
- name: community.windows
source: https://galaxy.ansible.com
version: 1.12.0
22 changes: 1 addition & 21 deletions ansible/roles/amazon-cloudwatch-agent/README.md
@@ -1,28 +1,8 @@
# FIXME: this page needs an extensive re-write

# Cloudwatch Agent Role

This role installs the Cloudwatch Agent on a Linux host and configures it to send metrics to Cloudwatch.

If the group_vars for a host has the variable `cloudwatch_agent_configs` defined then this will deploy additional cloudwatch agent config files to the host. See files in /templates for examples.

Amazon Cloudwatch Agent config execution and start order is:

1. ansible_system == 'linux' (the default ansible_system i.e. linux) via `/templates/linux.json.j2`
2. loops through values of `cloudwatch_agent_configs` in group_vars and deploys them to the host

e.g. if you have a group_vars entry like this:

```
cloudwatch_agent_configs:
- nomis-db
```

then the file `templates/nomis-db.json.j2` will be deployed to the host.

# Cloudwatch Agent

Metrics sent to Cloudwatch will all appear in the default CWAgent namespace

## Debugging on Linux

ssm onto the machine/instance and run the following command to find out the running status of the agent:
6 changes: 6 additions & 0 deletions ansible/roles/amazon-cloudwatch-agent/defaults/main.yml
@@ -0,0 +1,6 @@
---
amazon_cloudwatch_agent_config_name: linux.json.j2
amazon_cloudwatch_agent_gpg: https://amazoncloudwatch-agent-eu-west-2.s3.eu-west-2.amazonaws.com/assets/amazon-cloudwatch-agent.gpg
amazon_cloudwatch_agent_package: https://amazoncloudwatch-agent-eu-west-2.s3.eu-west-2.amazonaws.com/redhat/amd64/latest/amazon-cloudwatch-agent.rpm
amazon_cloudwatch_agent_config_file: amazon-cloudwatch-agent.json
amazon_cloudwatch_agent_config_path: /opt/aws/amazon-cloudwatch-agent/etc
5 changes: 5 additions & 0 deletions ansible/roles/amazon-cloudwatch-agent/handlers/main.yml
@@ -0,0 +1,5 @@
---
- name: restart amazon-cloudwatch-agent
ansible.builtin.service:
name: amazon-cloudwatch-agent
state: restarted
30 changes: 7 additions & 23 deletions ansible/roles/amazon-cloudwatch-agent/tasks/configure.yml
@@ -7,32 +7,16 @@
group: root
mode: 0755

- name: Get tag values for use in config file
set_fact:
name: "{{ ec2.tags.Name }}"
server_type: "{{ ec2.tags['server-type'] }}"
instance_id: "{{ ansible_ec2_instance_id }}"
- name: Fail if tags not defined
fail:
msg: "Please ensure Name tag is defined"
when: ec2.tags['Name'] is not defined or ansible_ec2_instance_id is not defined

# default cloudwatch-agent config file based on OS type (linux or windows)
# NOTE: windows default config doesn't exist, this is being handled in the Windows instance user-data in terraform
- name: Create OS specific amazon-cloudwatch-agent config file
- name: Create amazon-cloudwatch-agent config file
ansible.builtin.template:
src: "{{ ansible_system|lower }}.json.j2"
src: "{{ amazon_cloudwatch_agent_config_name }}"
dest: "{{ amazon_cloudwatch_agent_config_path }}/{{ amazon_cloudwatch_agent_config_file }}"
owner: root
group: root
mode: 0755

# additional settings for EC2 instances if they exist
- name: template config files
ansible.builtin.template:
src: "{{ item|replace('-', '_') }}.json.j2"
dest: "{{ amazon_cloudwatch_agent_config_path }}/{{ item|replace('-', '_') }}.json"
owner: root
group: root
mode: 0755
loop: "{{ cloudwatch_agent_configs }}"
loop_control:
label: item
when: cloudwatch_agent_configs is defined # currently none defined

notify: restart amazon-cloudwatch-agent
30 changes: 7 additions & 23 deletions ansible/roles/amazon-cloudwatch-agent/tasks/install.yml
@@ -1,29 +1,13 @@
---
- name: check if amazon-cloudwatch-agent installed
ansible.builtin.shell: |
check_installed() {
check="$(amazon-cloudwatch-agent-ctl -m ec2 -a status)"
if [[ $check ]]
then
return 0
else
return 1
fi
}
check_installed
ignore_errors: true
register: agent_installed
# Not bothering to install RPM key as RPM doesn't seem to be signed
# Plus it probably won't work if selinux enabled
# - name: Import amazon cloudwatch agent RPM key
# ansible.builtin.rpm_key:
# state: present
# key: "{{ amazon_cloudwatch_agent_gpg }}"

- name: Install amazon-cloudwatch-agent
ansible.builtin.yum:
name: "{{ amazon_cloudwatch_agent_package }}"
state: present
disable_gpg_check: true
when: agent_installed.rc == 1 and ansible_distribution_major_version != '6'

- name: Install amazon-cloudwatch-agent on Rhel 6
ansible.builtin.shell: |
wget https://s3.amazonaws.com/amazoncloudwatch-agent/redhat/amd64/latest/amazon-cloudwatch-agent.rpm
rpm -U ./amazon-cloudwatch-agent.rpm
become: true
when: agent_installed.rc == 1 and ansible_distribution_major_version == '6'
disable_gpg_check: true # RPM doesn't appear to be signed even though a GPG key is provided
9 changes: 9 additions & 0 deletions ansible/roles/amazon-cloudwatch-agent/tasks/main.yml
@@ -1,18 +1,27 @@
---
- import_tasks: "install.yml"
tags:
- amazon-cloudwatch-agent-install
- ec2provision
- ec2patch
when: ansible_distribution in ['RedHat', 'OracleLinux']

- import_tasks: "configure.yml"
tags:
- amazon-cloudwatch-agent-configure
- ec2provision
- ec2patch
when: ansible_distribution in ['RedHat', 'OracleLinux']

# Ensure any restarts done prior to start
- name: Flush handlers
meta: flush_handlers
tags:
- always

- import_tasks: "start.yml"
tags:
- amazon-cloudwatch-agent-start
- ec2provision
- ec2patch
when: ansible_distribution in ['RedHat', 'OracleLinux']
19 changes: 5 additions & 14 deletions ansible/roles/amazon-cloudwatch-agent/tasks/start.yml
@@ -1,15 +1,6 @@
---
# covers cloudwatch agent start for linux (common)
- name: Start amazon-cloudwatch-agent service
ansible.builtin.shell: |
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:"{{ amazon_cloudwatch_agent_config_path }}/{{ amazon_cloudwatch_agent_config_file }}"
# additional settings for EC2 instances if they exist
- name: Append settings for amazon-cloudwatch-agent for other EC2 instances
ansible.builtin.shell: |
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a append-config -m ec2 -s -c file:"{{ amazon_cloudwatch_agent_config_path }}/{{ item }}.json"
loop: "{{ cloudwatch_agent_configs|replace('-', '_') }}"
loop_control:
label: item
when: cloudwatch_agent_configs is defined # currently none defined
notify: restart amazon-cloudwatch-agent
- name: Start and enable amazon-cloudwatch-agent
ansible.builtin.service:
name: amazon-cloudwatch-agent
state: started
enabled: yes
48 changes: 34 additions & 14 deletions ansible/roles/amazon-cloudwatch-agent/templates/linux.json.j2
@@ -24,8 +24,12 @@
"metrics_collection_interval": 60,
"totalcpu": false,
"append_dimensions": {
"name": "{{ name }}",
"server_type": "{{ server_type }}"
{% if ec2.tags['server-type'] is defined %}
"name": "{{ ec2.tags['Name'] }}",
"server_type": "{{ ec2.tags['server-type'] }}"
{% else %}
"name": "{{ ec2.tags['Name'] }}"
{% endif %}
}
},
"disk": {
@@ -38,8 +42,12 @@
"*"
],
"append_dimensions": {
"name": "{{ name }}",
"server_type": "{{ server_type }}"
{% if ec2.tags['server-type'] is defined %}
"name": "{{ ec2.tags['Name'] }}",
"server_type": "{{ ec2.tags['server-type'] }}"
{% else %}
"name": "{{ ec2.tags['Name'] }}"
{% endif %}
}
},
"diskio": {
@@ -51,8 +59,12 @@
"*"
],
"append_dimensions": {
"name": "{{ name }}",
"server_type": "{{ server_type }}"
{% if ec2.tags['server-type'] is defined %}
"name": "{{ ec2.tags['Name'] }}",
"server_type": "{{ ec2.tags['server-type'] }}"
{% else %}
"name": "{{ ec2.tags['Name'] }}"
{% endif %}
}
},
"mem": {
@@ -64,8 +76,12 @@
"*"
],
"append_dimensions": {
"name": "{{ name }}",
"server_type": "{{ server_type }}"
{% if ec2.tags['server-type'] is defined %}
"name": "{{ ec2.tags['Name'] }}",
"server_type": "{{ ec2.tags['server-type'] }}"
{% else %}
"name": "{{ ec2.tags['Name'] }}"
{% endif %}
}
},
"swap": {
@@ -77,11 +93,15 @@
"*"
],
"append_dimensions": {
"name": "{{ name }}",
"server_type": "{{ server_type }}"
{% if ec2.tags['server-type'] is defined %}
"name": "{{ ec2.tags['Name'] }}",
"server_type": "{{ ec2.tags['server-type'] }}"
{% else %}
"name": "{{ ec2.tags['Name'] }}"
{% endif %}
}
}
}
}
},
"logs": {
"logs_collected": {
@@ -90,15 +110,15 @@
{
"file_path": "/var/log/messages",
"log_group_name": "cwagent-var-log-messages",
"log_stream_name": "{{ instance_id }}"
"log_stream_name": "{{ ansible_ec2_instance_id }}"
},
{
"file_path": "/var/log/secure",
"log_group_name": "cwagent-var-log-secure",
"log_stream_name": "{{ instance_id }}"
"log_stream_name": "{{ ansible_ec2_instance_id }}"
}
]
}
}
}
}
}
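The repeated `{% if ec2.tags['server-type'] is defined %}` blocks in this template all build the same `append_dimensions` object: `name` is always emitted, and `server_type` only when the tag exists. A rough Python sketch of that branching follows. The helper name `append_dimensions` and the example tag values are hypothetical, and the sketch illustrates the logic only, not how Ansible renders the template.

```python
import json


def append_dimensions(tags):
    """Build the dimensions object the template emits for one metric section."""
    # The Name tag is required; a missing tag raises KeyError here.
    dimensions = {"name": tags["Name"]}
    # server_type is added only when the instance carries a server-type tag.
    if "server-type" in tags:
        dimensions["server_type"] = tags["server-type"]
    return dimensions


# Hypothetical tag sets, with and without the optional server-type tag.
print(json.dumps(append_dimensions({"Name": "example-host"})))
print(json.dumps(append_dimensions({"Name": "example-host", "server-type": "example-db"})))
```

Branching on the whole block, rather than on the `server_type` line alone, keeps a trailing comma out of the rendered JSON when the tag is absent.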
5 changes: 0 additions & 5 deletions ansible/roles/amazon-cloudwatch-agent/vars/main.yml

This file was deleted.
