Metric verification job #144
base: master
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
--- | ||
- name: Run telemetry tests to verify metrics on osp18 | ||
hosts: "{{ cifmw_target_hook_host | default('localhost') }}" | ||
gather_facts: true | ||
environment: | ||
KUBECONFIG: "{{ cifmw_openshift_kubeconfig }}" | ||
PATH: "{{ cifmw_path }}" | ||
vars_files: | ||
- vars/common.yml | ||
- vars/osp18_env.yml | ||
tasks: | ||
- name: Include vars from the extra_vars files | ||
ansible.builtin.include_vars: | ||
dir: "{{ cifmw_basedir }}/artifacts/parameters" | ||
|
||
- name: Patch observabilityclient into openstackclient | ||
ansible.builtin.shell: | ||
cmd: | | ||
oc exec openstackclient -- python3 -m ensurepip --upgrade | ||
oc exec openstackclient -- python3 -m pip install --upgrade aodhclient | ||
oc exec openstackclient -- python3 -m pip install --upgrade python-observabilityclient | ||
when: patch_observabilityclient | bool | ||
tags: | ||
- setup | ||
|
||
- name: "Run Telemetry Verify Metrics tests" | ||
ansible.builtin.import_role: | ||
name: telemetry_verify_metrics |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
--- | ||
post_deploy_00_fvt_verify_metrics: | ||
source: "{{ ansible_user_dir }}/{{ zuul.projects['github.com/infrawatch/feature-verification-tests'].src_dir }}/ci/run_verify_metrics_osp18.yml" | ||
type: playbook | ||
config_file: "{{ ansible_user_dir }}/{{ zuul.projects['github.com/infrawatch/feature-verification-tests'].src_dir }}/ci/ansible.cfg" | ||
post_deploy_99_collect_results: | ||
source: "{{ ansible_user_dir }}/{{ zuul.projects['github.com/infrawatch/feature-verification-tests'].src_dir }}/ci/report_result.yml" | ||
type: playbook |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
- name: Verify that a CR exists {{ common_cr_test_id }} | ||
ansible.builtin.command: | ||
cmd: | | ||
oc get {{ item.kind }} {{ item.name }} | ||
register: result | ||
changed_when: false | ||
failed_when: | ||
- result.rc != 0 | ||
|
||
- name: Verify that a CR is ready {{ common_cr_ready_test_id }} | ||
ansible.builtin.command: | ||
cmd: | | ||
oc get {{ item.kind }} {{ item.name }} -o jsonpath='{.status.conditions[?(@.type=="{{ item.condition_type }}")].status}{"\n"}' | ||
register: result | ||
changed_when: false | ||
failed_when: | ||
- result.stdout != "True" | ||
when: | ||
- common_cr_ready_test_id is defined | ||
- item.condition_type is defined | ||
Review comment: Nicely done
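The readiness check above fails immediately if the condition is not yet `True`. If a CR may still be reconciling when the test starts, one option is to poll instead. The following is a minimal sketch, not part of this PR: it reuses the `item.kind`, `item.name` and `item.condition_type` fields from the diff, and borrows the `retries: 10` / `delay: 30` values used elsewhere in this PR as an assumed default.

```yaml
# Sketch only: poll the CR condition instead of checking it once.
# The retry count and delay are assumptions, not values chosen for this task in the PR.
- name: Wait until a CR reports the expected condition
  ansible.builtin.command:
    cmd: >-
      oc get {{ item.kind }} {{ item.name }}
      -o jsonpath='{.status.conditions[?(@.type=="{{ item.condition_type }}")].status}'
  register: result
  changed_when: false
  retries: 10
  delay: 30
  # The task fails only after all retries are exhausted without seeing "True".
  until: result.stdout == "True"
  when:
    - common_cr_ready_test_id is defined
    - item.condition_type is defined
```

With `until`, no explicit `failed_when` is needed: the task is marked failed once the retries run out.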
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -35,10 +35,18 @@ | |
ansible.builtin.include_tasks: "crd_tests.yml" | ||
loop: "{{ common_crd_list }}" | ||
|
||
- name: "Run CR tests" | ||
when: | ||
- common_cr_test_id is defined | ||
- common_cr_list is defined | ||
ansible.builtin.include_tasks: "cr_tests.yml" | ||
loop: "{{ common_cr_list }}" | ||
|
||
- name: "Verify container tests" | ||
when: | ||
- common_container_list is defined | ||
- common_container_test_id is defined | ||
ansible.builtin.include_tasks: "container_test.yml" | ||
loop: "{{ common_container_list }}" | ||
|
||
loop_control: | ||
loop_var: container_name | ||
Review comment: Very nice, it is a stylistic change, but enhances readability
Reply: Wasn't my idea. Ansible complained about "item" being redefined 🤣
vyzigold marked this conversation as resolved.
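The `loop_var` rename discussed in this thread matters whenever a looped `include_tasks` pulls in a file that also loops: both loops default to the variable `item`, and Ansible warns that the loop variable is already in use. A minimal sketch of the pattern follows; the contents shown for `container_test.yml` are hypothetical, since that file is not part of this diff.

```yaml
# Outer task (as in the diff): rename the loop variable so that
# tasks inside the included file remain free to use `item`.
- name: Verify container tests
  ansible.builtin.include_tasks: container_test.yml
  loop: "{{ common_container_list }}"
  loop_control:
    loop_var: container_name

# container_test.yml (hypothetical contents, for illustration only)
- name: Show which container is being checked
  ansible.builtin.debug:
    msg: "Checking container {{ container_name }}"
```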
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
telemetry_verify_metrics | ||
========= | ||
|
||
Test that expected metrics appear in Prometheus | ||
|
||
Requirements | ||
------------ | ||
OpenStack deployed with the following enabled: | ||
- telemetry | ||
- metricstorage | ||
- ceilometer | ||
- rabbitmq | ||
|
||
Tests: | ||
------ | ||
- Verify OpenStack is deployed correctly | ||
- Verify telemetry is ready | ||
- Verify metricstorage is ready | ||
- Verify ceilometer is ready | ||
- Verify rabbitmq is ready | ||
- Verify RabbitMQ metrics are being exposed and stored | ||
- Check the rabbitmq metrics endpoint | ||
- Use openstack observabilityclient to verify RabbitMQ metrics are stored in Prometheus | ||
- Verify Ceilometer metrics are being exposed and stored | ||
- Use openstack observabilityclient to verify Ceilometer central metrics are stored in Prometheus | ||
- Use openstack observabilityclient to verify Ceilometer compute metrics are stored in Prometheus | ||
- Verify NodeExporter metrics are being exposed and stored | ||
- Use openstack observabilityclient to verify NodeExporter metrics are stored in Prometheus | ||
|
||
Role Variables | ||
-------------- | ||
openstack\_cmd - command to access openstack cli. For example: "oc rsh openstackclient openstack" | ||
vyzigold marked this conversation as resolved.
|
||
telemetry\_verify\_metrics\_metric\_sources\_to\_test - List of sources to test. Current set of possible sources: ceilometer\_compute\_agent, ceilometer\_central\_agent, node\_exporter, rabbitmq | ||
|
||
Example Playbook | ||
---------------- | ||
- name: Run telemetry tests to verify metrics on osp18 | ||
hosts: "{{ cifmw\_target\_hook\_host | default('localhost') }}" | ||
gather\_facts: true | ||
environment: | ||
KUBECONFIG: "path to kubeconfig" | ||
PATH: "PATH variable contents" | ||
tasks: | ||
- name: "Run Telemetry Verify Metrics tests" | ||
ansible.builtin.import_role: | ||
name: telemetry_verify_metrics | ||
|
||
License | ||
------- | ||
|
||
Apache 2 |
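The two role variables documented in this README could be overridden at play level. The sketch below is an illustration rather than part of the PR: the play name and `hosts: localhost` are assumptions, the `openstack_cmd` value is the example given in the README, and the source names come from the README's list of possible sources.

```yaml
# Sketch only: restrict the role to two metric sources.
- name: Run only the RabbitMQ and Ceilometer central checks
  hosts: localhost            # assumed target; the PR's playbook uses cifmw_target_hook_host
  gather_facts: true
  vars:
    openstack_cmd: "oc rsh openstackclient openstack"
    telemetry_verify_metrics_metric_sources_to_test:
      - rabbitmq
      - ceilometer_central_agent
  tasks:
    - name: Run Telemetry Verify Metrics tests
      ansible.builtin.import_role:
        name: telemetry_verify_metrics
```

Sources left out of the list are skipped, because each `include_tasks` in the role's main task file is guarded by a `when` that tests membership in this variable.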
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
--- | ||
telemetry_verify_metrics_metric_sources_to_test: | ||
- ceilometer_compute_agent | ||
- ceilometer_central_agent | ||
# Disable node exporter testing until OSPRH-11059 is fixed | ||
# - node_exporter | ||
- rabbitmq |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
galaxy_info: | ||
author: Jaromir Wysoglad | ||
description: Test that metrics from all sources are stored in Prometheus | ||
company: Red Hat | ||
|
||
# If the issue tracker for your role is not on github, uncomment the | ||
# next line and provide a value | ||
# issue_tracker_url: http://example.com/issue/tracker | ||
|
||
# Choose a valid license ID from https://spdx.org - some suggested licenses: | ||
# - BSD-3-Clause (default) | ||
# - MIT | ||
# - GPL-2.0-or-later | ||
# - GPL-3.0-only | ||
# - Apache-2.0 | ||
# - CC-BY-4.0 | ||
license: Apache-2.0 | ||
|
||
min_ansible_version: "2.1" | ||
|
||
# If this is a Container Enabled role, provide the minimum Ansible Container version. | ||
# min_ansible_container_version: | ||
|
||
# | ||
# Provide a list of supported platforms, and for each platform a list of versions. | ||
# If you don't wish to enumerate all versions for a particular platform, use 'all'. | ||
# To view available platforms and versions (or releases), visit: | ||
# https://galaxy.ansible.com/api/v1/platforms/ | ||
# | ||
# platforms: | ||
# - name: Fedora | ||
# versions: | ||
# - all | ||
# - 25 | ||
# - name: SomePlatform | ||
# versions: | ||
# - all | ||
# - 1.0 | ||
# - 7 | ||
# - 99.99 | ||
|
||
galaxy_tags: [] | ||
# List tags for your role here, one per line. A tag is a keyword that describes | ||
# and categorizes the role. Users find roles by searching for tags. Be sure to | ||
# remove the '[]' above, if you add tags to this list. | ||
# | ||
# NOTE: A tag is limited to a single word comprised of alphanumeric characters. | ||
# Maximum 20 tags per role. | ||
|
||
dependencies: [] | ||
# List your role dependencies here, one per line. Be sure to remove the '[]' above, | ||
# if you add dependencies to this list. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
- delegate_to: "{{ compute_node }}" | ||
Review comment: I'm still not a fan of having this in the role, but I'm not going to block based on this. I hope you'll later consider whether this is better being included in the ci/run_verify_metrics_osp18.yml playbook, as a separate play for compute and controller node tests.
||
# The containers on compute nodes seem to run as the root user, so we need to connect as root
become: true | ||
block: | ||
- name: Check compute node containers are up for {{ compute_node }} | ||
ansible.builtin.include_role: | ||
name: common |
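The reviewer's alternative, a separate play for compute-node checks in `ci/run_verify_metrics_osp18.yml`, might look roughly like the sketch below. Several names here are assumptions and not taken from the PR: the inventory group `computes`, the container name `node_exporter`, and the placeholder test ID. Only `common_container_list` and `common_container_test_id` are variables that appear in this diff.

```yaml
# Sketch only: a second play that targets compute nodes directly,
# so the role no longer needs delegate_to.
- name: Check containers on compute nodes
  hosts: computes                       # assumed inventory group name
  become: true                          # containers appear to run as root on compute nodes
  tasks:
    - name: Check compute node containers are up
      ansible.builtin.include_role:
        name: common
      vars:
        common_container_test_id: EXAMPLE-ID   # placeholder, not a real test ID
        common_container_list:
          - node_exporter                      # assumed container name
```

The trade-off is that the container check then runs as its own play rather than immediately before the corresponding metric check inside the role.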
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
--- | ||
- name: Verify OpenStack is deployed correctly | ||
ansible.builtin.include_role: | ||
name: common | ||
vars: | ||
common_cr_test_id: RHOSO-1258 | ||
common_cr_ready_test_id: RHOSO-1259 | ||
common_cr_list: | ||
- kind: telemetry | ||
name: telemetry | ||
condition_type: Ready | ||
Review comment on lines +9 to +11: I like the use of dicts here, which makes the test configs more readable than using lists.
||
- kind: metricstorage | ||
name: metric-storage | ||
condition_type: Ready | ||
- kind: ceilometer | ||
name: ceilometer | ||
condition_type: Ready | ||
- kind: rabbitmq | ||
name: rabbitmq | ||
condition_type: ReconcileSuccess | ||
- kind: rabbitmq | ||
name: rabbitmq-cell1 | ||
condition_type: ReconcileSuccess | ||
tags: precheck | ||
|
||
- name: Verify RabbitMQ metrics are being exposed and stored | ||
ansible.builtin.include_tasks: | ||
file: verify_rabbitmq_metrics.yml | ||
tags: test | ||
when: '"rabbitmq" in telemetry_verify_metrics_metric_sources_to_test' | ||
|
||
- name: Verify Ceilometer compute metrics are being exposed and stored | ||
ansible.builtin.include_tasks: | ||
file: verify_ceilometer_compute_metrics.yml | ||
tags: test | ||
when: '"ceilometer_compute_agent" in telemetry_verify_metrics_metric_sources_to_test' | ||
|
||
- name: Verify Ceilometer central metrics are being exposed and stored | ||
ansible.builtin.include_tasks: | ||
file: verify_ceilometer_central_metrics.yml | ||
tags: test | ||
when: '"ceilometer_central_agent" in telemetry_verify_metrics_metric_sources_to_test' | ||
|
||
- name: Verify NodeExporter metrics are being exposed and stored | ||
ansible.builtin.include_tasks: | ||
file: verify_node_exporter_metrics.yml | ||
tags: test | ||
when: '"node_exporter" in telemetry_verify_metrics_metric_sources_to_test' |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
- name: Verify ceilometer scrapeconfig exists | ||
ansible.builtin.include_role: | ||
name: common | ||
vars: | ||
common_cr_test_id: RHOSO-1220 | ||
common_cr_list: | ||
- kind: scrapeconfigs.monitoring.rhobs | ||
name: telemetry-ceilometer | ||
|
||
- name: Verify ceilometer central agent is running | ||
ansible.builtin.include_role: | ||
name: common | ||
vars: | ||
common_pod_test_id: RHOSO-1240 | ||
common_pod_status_str: "Running" | ||
common_pod_nspace: openstack | ||
common_pod_list: | ||
- ceilometer-0 | ||
|
||
- block: | ||
- name: Create an image | ||
ansible.builtin.shell: | | ||
curl -L -# http://download.cirros-cloud.net/0.5.2/cirros-0.5.2-x86_64-disk.img > /tmp/fvt_testing_image.img | ||
{{ openstack_cmd }} image create --container-format bare --disk-format qcow2 fvt_central_testing_image < /tmp/fvt_testing_image.img | ||
register: result | ||
changed_when: result.rc == 0 | ||
failed_when: result.rc >= 1 | ||
|
||
- name: | | ||
TEST Use openstack observabilityclient to verify ceilometer central metrics are stored in prometheus | ||
RHOSO-1212 | ||
ansible.builtin.shell: | | ||
{{ openstack_cmd }} metric show ceilometer_image_size | ||
register: result | ||
delay: 30 | ||
retries: 10 | ||
until: result.rc == 0 and "ceilometer_image_size" in result.stdout | ||
changed_when: false | ||
|
||
always: | ||
- name: Delete the image | ||
ansible.builtin.shell: | | ||
{{ openstack_cmd }} image show fvt_central_testing_image && {{ openstack_cmd }} image delete fvt_central_testing_image | ||
register: result | ||
changed_when: result.rc == 0 | ||
failed_when: false |
Review comment: This whole file feels a little hacky after my changes. The problem is that the task for getting container status needs to run on the compute nodes. I'd like to run it from the telemetry_verify_metrics role, for example to verify that the node exporter container is healthy before checking whether we're getting metrics from it. I don't think I can set the hosts for a single task. Any ideas other than what I have here?
Reply: You can use `delegate_to`, or add the check to the verify metrics job using a second play in the playbook.
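For the `delegate_to` option, a single task can be sent to another host without changing the play's `hosts`. The sketch below assumes that `compute_node` holds an inventory hostname (as in this PR's delegated block), that the containers are managed by podman, and that the container is named `node_exporter`; the last two are assumptions, not facts from the diff.

```yaml
# Sketch only: run one check on a compute node from a play that targets another host.
- name: Check that the node exporter container exists on {{ compute_node }}
  ansible.builtin.command:
    cmd: podman container exists node_exporter   # assumed runtime and container name
  delegate_to: "{{ compute_node }}"
  become: true          # the containers appear to run as root on compute nodes
  changed_when: false   # read-only check; a non-zero exit code fails the task
```

`delegate_to` can also be set on a `block`, which is the form this PR ends up using to wrap the `common` role include.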