This repository contains a step-by-step demonstration of how to set up a monitoring stack for Amazon OpenSearch Service domains and Amazon OpenSearch Serverless collections across all specified regions. This example uses AWS CDK and Python.
- Context
- Prerequisites
- Deploy
- OpenSearch Subscription Filters
- Pre-built Monitoring Dashboards
- Pre-built Alerts
- Clean up
- Total Cost of Ownership
Amazon OpenSearch Service is a fully managed service that makes it easy for you to deploy, secure, and run OpenSearch cost-effectively at scale. Customers often find it difficult to manage and monitor multiple Amazon OpenSearch Service domains and OpenSearch Serverless collections, because their metrics and logs are not available in one central place for troubleshooting. This example helps you configure monitoring for Amazon OpenSearch Service domains and OpenSearch Serverless collections by fetching the CloudWatch metrics and CloudWatch logs from all domains and collections at a regular interval. It also comes with pre-built OpenSearch dashboards and alerts.
The following tools are required to deploy this Monitoring tool for Amazon OpenSearch Service.
AWS CDK - https://docs.aws.amazon.com/cdk/latest/guide/getting_started.html
AWS CLI - https://aws.amazon.com/cli/
Git - https://git-scm.com/downloads
Node.js - https://nodejs.org/en
Python (3.6 or later) - https://www.python.org/downloads/
Complete the following steps to set up the Amazon OpenSearch Service Monitor tool in your environment using CDK.
In a bash terminal session:
# clone the repo
$ git clone https://github.com/aws-samples/amazon-opensearch-service-monitor.git
# move to directory
$ cd amazon-opensearch-service-monitor
# bootstrap the remaining setup (assumes us-west-2)
# enter the e-mail address that will be used for sending alerts
# alternatively, you can change the e-mail address manually in opensearch/opensearch_monitor_stack.py
$ bash bootstrap.sh
# activate the virtual environment
$ source .env/bin/activate
Create the CDK configuration by bootstrapping the CDK (one-time activity for each region).
# bootstrap the cdk
(.env)$ cdk bootstrap aws://yourAccountID/yourRegion
Use the AWS CDK to deploy the opensearch stack for Amazon OpenSearch Service. The stack creates and deploys the following components:
- Create a VPC with 3 AZs
- Create and launch an Amazon OpenSearch Service cluster (version 2.3) with two t3.medium data nodes and a 100 GB EBS storage volume. The two nodes are spread across two different AZs
- Create a DynamoDB table for timestamp tracking
- Create a Lambda function to fetch CloudWatch metrics across all regions and all domains. By default it fetches the data every 5 minutes, which can be changed if needed
- Create and launch an EC2 instance that acts as an SSH tunnel to access the dashboards, because the whole setup is secured inside the VPC
- Create default OpenSearch dashboards to visualize metrics across all domains and collections
- Create and set up default e-mail alerts on the newly launched Amazon OpenSearch Service cluster
- Create an index template and an Index State Management (ISM) policy to delete indices older than 366 days (the retention period can be changed if needed)
- The monitoring stack has an option to enable UltraWarm (UW), which is disabled by default. Change the settings in opensearch_monitor_stack.py to enable UW
- Create a Lambda function to fetch CloudWatch metrics and CloudWatch logs across all regions
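The periodic metrics fetch described in the list above can be sketched roughly as follows. This is an illustrative sketch, not the repository's Lambda code: the function name `build_metric_request`, the example domain name, the account ID, and the chosen statistics are assumptions. It only builds the keyword arguments for the CloudWatch `get_metric_statistics` call in the `AWS/ES` namespace, covering one 5-minute fetch window.

```python
from datetime import datetime, timedelta, timezone

FETCH_INTERVAL_MINUTES = 5  # default fetch interval stated in the list above


def build_metric_request(domain_name, account_id, metric_name, end_time,
                         interval_minutes=FETCH_INTERVAL_MINUTES):
    """Build kwargs for CloudWatch get_metric_statistics for one domain metric."""
    start_time = end_time - timedelta(minutes=interval_minutes)
    return {
        "Namespace": "AWS/ES",
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": "DomainName", "Value": domain_name},
            {"Name": "ClientId", "Value": account_id},
        ],
        "StartTime": start_time,
        "EndTime": end_time,
        "Period": interval_minutes * 60,  # one datapoint per fetch window
        "Statistics": ["Average", "Maximum"],  # assumed choice of statistics
    }


# Hypothetical domain name and account ID, for illustration only.
request = build_metric_request(
    "my-domain", "123456789012", "CPUUtilization",
    datetime(2024, 1, 1, 0, 5, tzinfo=timezone.utc),
)
# With boto3 installed and credentials configured, the request could be sent per region:
#   boto3.client("cloudwatch", region_name="us-west-2").get_metric_statistics(**request)
```

Keeping the request builder separate from the AWS call makes it easy to loop over regions and domains, and to change the interval in one place.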
Note: The complete stack is set up with the pre-defined configuration in opensearch_monitor_stack.py. Review settings such as e-mail, instance type, username, and password before deploying. You can also enable UW and dedicated master nodes if needed.
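For reference, the 366-day retention rule from the component list above could be expressed as an ISM policy body along these lines. This is a minimal sketch and not the policy shipped in the repository: the helper name `build_retention_policy`, the state names, and the `domains-*` index pattern are assumptions made for illustration.

```python
import json


def build_retention_policy(index_pattern, retention_days=366):
    """Return an ISM policy body that deletes indices older than retention_days."""
    return {
        "policy": {
            "description": f"Delete indices older than {retention_days} days",
            "default_state": "hot",
            "states": [
                {
                    "name": "hot",
                    "actions": [],
                    "transitions": [
                        {
                            "state_name": "delete",
                            "conditions": {"min_index_age": f"{retention_days}d"},
                        }
                    ],
                },
                {"name": "delete", "actions": [{"delete": {}}], "transitions": []},
            ],
            # Apply the policy automatically to new indices matching the pattern.
            "ism_template": {"index_patterns": [index_pattern], "priority": 100},
        }
    }


policy = build_retention_policy("domains-*")  # "domains-*" is a hypothetical pattern
print(json.dumps(policy, indent=2))
```

A body like this would typically be sent to the `_plugins/_ism/policies/<policy_id>` endpoint of the monitoring domain. Passing a different `retention_days` value changes the retention period.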
Run the following command:
(.env)$ cdk deploy
The CDK will prompt you to approve the security changes; enter "y" for yes.
Once the app is deployed, you will get the Dashboards URL, user name, and password to access OpenSearch Dashboards. Once logged in, refer to the sections below to navigate the dashboards and alerts.
Note: After the stack is deployed, you will receive an e-mail asking you to confirm the subscription. Confirm it to start receiving alerts.
Once the stack is deployed successfully, you need to create subscription filters and assign them to the Lambda function. Run setupCWSubscriptionFilter.py to create the subscription filters. The script assumes that the CloudWatch log groups use the prefix /aws/aes/domains; if your prefix is different, update the script before running the command below.
(.env)$ python3 opensearch/setupCWSubscriptionFilter.py deploy
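Conceptually, creating the subscription filters amounts to listing the log groups under the prefix and attaching a filter to each one. The sketch below illustrates that idea under stated assumptions; it is not the code in setupCWSubscriptionFilter.py. The function name `create_subscription_filters` and the filter name `opensearch-monitor` are hypothetical, and `logs_client` is expected to behave like `boto3.client("logs")`.

```python
def create_subscription_filters(logs_client, destination_arn,
                                prefix="/aws/aes/domains",
                                filter_name="opensearch-monitor"):
    """Attach a subscription filter to every log group under prefix.

    Returns the names of the log groups that were updated.
    """
    updated = []
    paginator = logs_client.get_paginator("describe_log_groups")
    for page in paginator.paginate(logGroupNamePrefix=prefix):
        for group in page.get("logGroups", []):
            logs_client.put_subscription_filter(
                logGroupName=group["logGroupName"],
                filterName=filter_name,
                filterPattern="",  # an empty pattern forwards every log event
                destinationArn=destination_arn,  # ARN of the receiving Lambda
            )
            updated.append(group["logGroupName"])
    return updated
```

Passing the client in as a parameter keeps the function testable without AWS access. With a Lambda destination, CloudWatch Logs also needs permission to invoke the function.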
The monitoring domain comes with pre-built dashboards for OpenSearch Service and OpenSearch Serverless collection metrics. These dashboards can be accessed as follows:
- Log in to Dashboards: Access OpenSearch Dashboards using the IP address obtained after the deployment and log in.
- Once logged in, select the private tenant from the pop-up and then select Dashboard.
- After you click Dashboard, the list of default dashboards is displayed.
- Domain Metrics: Gives a 360-degree view of all Amazon OpenSearch Service domains across the regions.
- Domain Overview: Gives more detailed metrics for a particular domain, which helps when investigating issues in a specific domain.
- Serverless Collection Metrics: Gives a 360-degree view of all Amazon OpenSearch Serverless collections across the regions.
- Serverless Collection Overview: Gives more detailed metrics for a particular collection, which helps when investigating issues in a specific collection.
The monitoring domain comes with the pre-built alerts listed below, which send e-mail notifications for events such as cluster health, disk, memory, and JVM issues. These alerts are built on OpenSearch Service domain metrics.
Alert Type | Frequency |
---|---|
Cluster Health - Red | 5 Min |
Cluster Index Writes Blocked | 5 Min |
Automated Snapshot Failure | 5 Min |
JVM Memory Pressure > 80% | 5 Min |
CPU Utilization > 80% | 15 Min |
No Kibana Healthy Nodes | 15 Min |
No Dashboards Healthy Nodes | 15 Min |
Invalid Host Header Requests | 15 Min |
Cluster Health - Yellow | 30 Min |
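To make the table concrete, here is a rough sketch of how one row (JVM Memory Pressure > 80%, checked every 5 minutes) might look as an OpenSearch Alerting monitor body. It is not the monitor definition shipped with this stack: the helper name `build_monitor`, the index pattern `domains-*`, and the document fields `metric_name` and `value` are hypothetical, and the trigger has no notification action attached.

```python
def build_monitor(name, interval_minutes, index_pattern, query, trigger_source):
    """Return an Alerting monitor body that runs a query on a fixed schedule."""
    return {
        "type": "monitor",
        "name": name,
        "enabled": True,
        "schedule": {"period": {"interval": interval_minutes, "unit": "MINUTES"}},
        "inputs": [{"search": {"indices": [index_pattern], "query": query}}],
        "triggers": [
            {
                "name": f"{name} trigger",
                "severity": "1",
                "condition": {
                    "script": {"source": trigger_source, "lang": "painless"}
                },
                "actions": [],  # e-mail action omitted in this sketch
            }
        ],
    }


# Hypothetical index pattern and field names, for illustration only.
jvm_query = {
    "size": 0,
    "query": {
        "bool": {
            "filter": [
                {"term": {"metric_name": "JVMMemoryPressure"}},
                {"range": {"value": {"gt": 80}}},
            ]
        }
    },
}

jvm_monitor = build_monitor(
    "JVM Memory Pressure > 80%", 5, "domains-*", jvm_query,
    "ctx.results[0].hits.total.value > 0",
)
```

The frequency column of the table maps to the `interval` value in the schedule, so the 15-minute and 30-minute alerts would differ only in that number and in their query.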
To clean up, destroy the opensearch stack; all other stacks will be torn down because of their dependencies.
(.env)$ cdk destroy
To remove the CloudWatch Logs subscription filters, run the script as shown below. It traverses the Amazon OpenSearch Service CloudWatch log groups and deletes any filter that was created during the deploy.
(.env)$ python3 opensearch/setupCWSubscriptionFilter.py destroy
Running this solution will incur charges of less than $10 per day for one domain, with an additional $2 per day for each additional domain.
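As a quick way to apply these figures, the small helper below computes a rough upper bound on the daily cost for a given number of monitored domains. The function name `max_daily_cost_usd` is hypothetical, and the result is only as accurate as the two figures quoted above; actual charges depend on region and usage.

```python
def max_daily_cost_usd(domain_count):
    """Rough upper bound: $10 for the first domain, $2 for each additional one."""
    if domain_count < 1:
        raise ValueError("domain_count must be at least 1")
    return 10 + 2 * (domain_count - 1)


print(max_daily_cost_usd(5))  # 10 + 2 * 4 = 18
```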
If you encounter a bug, please create a new issue with as much detail as possible and steps for reproducing the bug. See the Contributing Guidelines for more details.
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.