Are you ready to take your AWS RDS monitoring to the next level? Say hello to prometheus-rds-exporter, your ultimate solution for comprehensive, real-time insights into your Amazon RDS instances!
Built by SREs, designed for production: developed by a team of Site Reliability Engineers with years of hands-on experience managing RDS production systems.
It collects key metrics about:
- Hardware resource usage
- Underlying EC2 instance's hard limits
- Pending AWS RDS maintenance operations
- Pending modifications
- Logs size
- RDS quota usage information
Tip
Prometheus RDS exporter is part of the Database Monitoring Framework which provides alerts, along with their handy runbooks for AWS RDS.
🥇 Advanced Metrics: Gain deep visibility with advanced metrics for AWS RDS. Monitor performance, query efficiency, and resource utilization like never before.
🧩 AWS Quotas Insights: Stay in control with real-time information about AWS quotas. Ensure you never hit limits unexpectedly.
💡 Hard Limits visibility: Know the hard limits of the EC2 instance used by RDS and manage your resources effectively.
🔔 Alerting at Your Fingertips: Easily set up Prometheus alerting rules to stay informed of critical events, ensuring you're always ahead of issues.
🛠️ Simple Setup: Getting started is a breeze! Our clear documentation and examples will have you up and running in no time.
📊 Dashboards: Prometheus RDS exporter adopts the USE methodology and provides well-designed, ready-to-use dashboards.
🌐 Community-Driven: Join a vibrant community of users and contributors. Collaborate, share knowledge, and shape the future of AWS RDS monitoring together.
🚀 When combined with prometheus-community/postgres_exporter, it provides a production-ready monitoring framework for RDS PostgreSQL.
Name | Labels | Description |
---|---|---|
rds_allocated_disk_iops_average | aws_account_id, aws_region, dbidentifier | Allocated disk IOPS |
rds_allocated_disk_throughput_bytes | aws_account_id, aws_region, dbidentifier | Allocated disk throughput |
rds_allocated_storage_bytes | aws_account_id, aws_region, dbidentifier | Allocated storage |
rds_api_call_total | api, aws_account_id, aws_region | Number of calls to the AWS API |
rds_backup_retention_period_seconds | aws_account_id, aws_region, dbidentifier | Automatic DB snapshots retention period |
rds_ca_certificate_valid_until | aws_account_id, aws_region, dbidentifier | Timestamp of the expiration of the instance certificate |
rds_cpu_usage_percent_average | aws_account_id, aws_region, dbidentifier | Instance CPU used |
rds_database_connections_average | aws_account_id, aws_region, dbidentifier | The number of client network connections to the database instance |
rds_dbload_average | aws_account_id, aws_region, dbidentifier | Number of active sessions for the DB engine |
rds_dbload_cpu_average | aws_account_id, aws_region, dbidentifier | Number of active sessions where the wait event type is CPU |
rds_dbload_noncpu_average | aws_account_id, aws_region, dbidentifier | Number of active sessions where the wait event type is not CPU |
rds_exporter_build_info | build_date, commit_sha, version | A metric with constant '1' value labeled by version from which exporter was built |
rds_exporter_errors_total | | Total number of errors encountered by the exporter |
rds_free_storage_bytes | aws_account_id, aws_region, dbidentifier | Free storage on the instance |
rds_freeable_memory_bytes | aws_account_id, aws_region, dbidentifier | Amount of available random access memory. For MariaDB, MySQL, Oracle, and PostgreSQL DB instances, this metric reports the value of the MemAvailable field of /proc/meminfo |
rds_instance_age_seconds | aws_account_id, aws_region, dbidentifier | Time since instance creation |
rds_instance_baseline_iops_average | aws_account_id, aws_region, instance_class | Baseline IOPS of underlying EC2 instance class |
rds_instance_baseline_throughput_bytes | aws_account_id, aws_region, instance_class | Baseline throughput of underlying EC2 instance class |
rds_instance_info | arn, aws_account_id, aws_region, dbi_resource_id, dbidentifier, deletion_protection, engine, engine_version, instance_class, multi_az, performance_insights_enabled, pending_maintenance, pending_modified_values, role, source_dbidentifier, storage_type, ca_certificate_identifier | RDS instance information |
rds_instance_log_files_size_bytes | aws_account_id, aws_region, dbidentifier | Total size of log files on the instance |
rds_instance_max_iops_average | aws_account_id, aws_region, instance_class | Maximum IOPS of underlying EC2 instance class |
rds_instance_max_throughput_bytes | aws_account_id, aws_region, instance_class | Maximum throughput of underlying EC2 instance class |
rds_instance_memory_bytes | aws_account_id, aws_region, instance_class | Instance class memory |
rds_instance_status | aws_account_id, aws_region, dbidentifier | Instance status (1: ok, 0: can't scrape metrics) |
rds_instance_tags | aws_account_id, aws_region, dbidentifier, tag_<AWS_TAG> ... | AWS tags attached to the instance |
rds_instance_vcpu_average | aws_account_id, aws_region, instance_class | Total vCPU for this instance class |
rds_max_allocated_storage_bytes | aws_account_id, aws_region, dbidentifier | Upper limit in gibibytes to which Amazon RDS can automatically scale the storage of the DB instance |
rds_max_disk_iops_average | aws_account_id, aws_region, dbidentifier | Max disk IOPS evaluated with disk IOPS and EC2 capacity |
rds_max_storage_throughput_bytes | aws_account_id, aws_region, dbidentifier | Max disk throughput evaluated with disk throughput and EC2 capacity |
rds_maximum_used_transaction_ids_average | aws_account_id, aws_region, dbidentifier | Maximum transaction IDs that have been used. Applies only to PostgreSQL |
rds_quota_max_dbinstances_average | aws_account_id, aws_region | Maximum number of RDS instances allowed in the AWS account |
rds_quota_maximum_db_instance_snapshots_average | aws_account_id, aws_region | Maximum number of manual DB instance snapshots |
rds_quota_total_storage_bytes | aws_account_id, aws_region | Maximum total storage for all DB instances |
rds_read_iops_average | aws_account_id, aws_region, dbidentifier | Average number of disk read I/O operations per second |
rds_read_throughput_bytes | aws_account_id, aws_region, dbidentifier | Average number of bytes read from disk per second |
rds_replica_lag_seconds | aws_account_id, aws_region, dbidentifier | For read replica configurations, the amount of time a read replica DB instance lags behind the source DB instance. Applies to MariaDB, Microsoft SQL Server, MySQL, Oracle, and PostgreSQL read replicas |
rds_replication_slot_disk_usage_bytes | aws_account_id, aws_region, dbidentifier | Disk space used by replication slot files. Applies to PostgreSQL |
rds_swap_usage_bytes | aws_account_id, aws_region, dbidentifier | Amount of swap space used on the DB instance. This metric is not available for SQL Server |
rds_transaction_logs_disk_usage_bytes | aws_account_id, aws_region, dbidentifier | Disk space used by transaction logs (only on PostgreSQL) |
rds_usage_allocated_storage_bytes | aws_account_id, aws_region | Total storage used by AWS RDS instances |
rds_usage_db_instances_average | aws_account_id, aws_region | AWS RDS instance count |
rds_usage_manual_snapshots_average | aws_account_id, aws_region | Manual snapshots count |
rds_write_iops_average | aws_account_id, aws_region, dbidentifier | Average number of disk write I/O operations per second |
rds_write_throughput_bytes | aws_account_id, aws_region, dbidentifier | Average number of bytes written to disk per second |
up | | Was the last scrape of RDS successful |
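The per-instance gauges in this table can be combined to derive ratios. The sketch below is not part of the exporter. It assumes the metrics are served in the standard Prometheus text format, and it uses a hypothetical helper, storage_used_percent, with made-up sample values to compute the share of allocated storage in use for one instance from rds_allocated_storage_bytes and rds_free_storage_bytes.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the exporter): reads Prometheus text
# format on stdin and prints the used-storage percentage for one dbidentifier.
storage_used_percent() {
  local db="$1"
  awk -v db="$db" '
    # Skip every line that does not carry the requested dbidentifier label.
    index($0, "dbidentifier=\"" db "\"") == 0 { next }
    index($1, "rds_allocated_storage_bytes{") == 1 { allocated = $NF }
    index($1, "rds_free_storage_bytes{") == 1 { free = $NF }
    END {
      if (allocated > 0) {
        printf "%.1f\n", (allocated - free) * 100 / allocated
      } else {
        exit 1
      }
    }
  '
}

# Made-up sample in the exposition format, for illustration only.
sample='rds_allocated_storage_bytes{aws_account_id="123456789012",aws_region="eu-west-3",dbidentifier="db1"} 107374182400
rds_free_storage_bytes{aws_account_id="123456789012",aws_region="eu-west-3",dbidentifier="db1"} 80530636800'

printf '%s\n' "$sample" | storage_used_percent db1
```

Against a running exporter, the same helper could read from curl --silent http://localhost:9043/metrics instead of the sample. In practice this kind of ratio is usually computed in Prometheus itself; the shell version only illustrates how the two metrics relate.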
Standard Go and Prometheus metrics are also available
Name | Labels | Description |
---|---|---|
go_gc_duration_seconds | quantile | A summary of the pause duration of garbage collection cycles. |
go_goroutines | | Number of goroutines that currently exist. |
go_info | version | Information about the Go environment. |
go_memstats_alloc_bytes | | Number of bytes allocated and still in use. |
go_memstats_alloc_bytes_total | | Total number of bytes allocated, even if freed. |
go_memstats_buck_hash_sys_bytes | | Number of bytes used by the profiling bucket hash table. |
go_memstats_frees_total | | Total number of frees. |
go_memstats_gc_sys_bytes | | Number of bytes used for garbage collection system metadata. |
go_memstats_heap_alloc_bytes | | Number of heap bytes allocated and still in use. |
go_memstats_heap_idle_bytes | | Number of heap bytes waiting to be used. |
go_memstats_heap_inuse_bytes | | Number of heap bytes that are in use. |
go_memstats_heap_objects | | Number of allocated objects. |
go_memstats_heap_released_bytes | | Number of heap bytes released to OS. |
go_memstats_heap_sys_bytes | | Number of heap bytes obtained from system. |
go_memstats_last_gc_time_seconds | | Number of seconds since 1970 of last garbage collection. |
go_memstats_lookups_total | | Total number of pointer lookups. |
go_memstats_mallocs_total | | Total number of mallocs. |
go_memstats_mcache_inuse_bytes | | Number of bytes in use by mcache structures. |
go_memstats_mcache_sys_bytes | | Number of bytes used for mcache structures obtained from system. |
go_memstats_mspan_inuse_bytes | | Number of bytes in use by mspan structures. |
go_memstats_mspan_sys_bytes | | Number of bytes used for mspan structures obtained from system. |
go_memstats_next_gc_bytes | | Number of heap bytes when next garbage collection will take place. |
go_memstats_other_sys_bytes | | Number of bytes used for other system allocations. |
go_memstats_stack_inuse_bytes | | Number of bytes in use by the stack allocator. |
go_memstats_stack_sys_bytes | | Number of bytes obtained from system for stack allocator. |
go_memstats_sys_bytes | | Number of bytes obtained from system. |
go_threads | | Number of OS threads created. |
promhttp_metric_handler_requests_in_flight | | Current number of scrapes being served. |
promhttp_metric_handler_requests_total | code | Total number of scrapes by HTTP status code. |
Tip
If you deploy the Grafana operator in your Kubernetes cluster, dashboards can be deployed automatically and kept up to date.
Set dashboards.enabled: true in your Helm deployment to deploy dashboards as GrafanaDashboard custom resources.
Why are we recommending Grafana operator?
We are committed to providing you with the most efficient and user-friendly experience possible. Therefore, we continuously enhance our dashboards and the metrics produced by our exporters to ensure you have access to the most accurate and relevant data.
To ensure an optimal user experience, it's vital to keep your dashboards up to date. This practice guarantees that you are always working with the latest features and improvements, enabling you to make the most out of the data presented to you. However, maintaining multiple versions of dashboards can be challenging and is not desirable. It introduces complexity and can lead to inconsistencies between what you see and the actual data.
By leveraging the Grafana Operator, you can rest assured that the version of your dashboard will always match the metrics presented by your exporter. This synchronization between your dashboards and the underlying data ensures a seamless and accurate monitoring experience. This move towards operator-based deployment is designed to streamline your monitoring process, ensuring accuracy and efficiency in your data visualization efforts.
Kubernetes operators aim to simplify deployments, and as part of this evolution, we will eventually stop publishing dashboards on Grafana Labs.
For convenience, dashboards are also available in the configs/grafana/public/ folder and on Grafana Labs.
Configuration can be defined in prometheus-rds-exporter.yaml or through environment variables (format: PROMETHEUS_RDS_EXPORTER_<PARAMETER_NAME>).
Parameter | Description | Default |
---|---|---|
aws-assume-role-arn | AWS IAM ARN role to assume to fetch metrics | |
aws-assume-role-session | AWS assume role session name | prometheus-rds-exporter |
collect-instance-metrics | Collect AWS instances metrics (AWS Cloudwatch API) | true |
collect-instance-tags | Collect AWS RDS tags | true |
collect-instance-types | Collect AWS instance types information (AWS EC2 API) | true |
collect-logs-size | Collect AWS instances logs size (AWS RDS API) | true |
collect-maintenances | Collect AWS instances maintenances (AWS RDS API) | true |
collect-quotas | Collect AWS RDS quotas (AWS quotas API) | true |
collect-usages | Collect AWS RDS usages (AWS Cloudwatch API) | true |
debug | Enable debug mode | |
enable-otel-traces | Enable OpenTelemetry traces. See configuration | false |
listen-address | Address to listen on for web interface | :9043 |
log-format | Log format (text or json) | json |
metrics-path | Path under which to expose metrics | /metrics |
tls-cert-path | Path to TLS certificate | |
tls-key-path | Path to private key for TLS | |
Configuration parameters priorities:
- $HOME/prometheus-rds-exporter.yaml file
- prometheus-rds-exporter.yaml file
- Environment variables
- Command line flags
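The environment variable naming rule can be illustrated with a small helper. This is a sketch, not exporter code: param_to_env is a hypothetical function, and the conversion of hyphens to underscores is an assumption extrapolated from the PROMETHEUS_RDS_EXPORTER_DEBUG example used elsewhere in this README. Verify it before relying on it for hyphenated parameters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the environment variable name for a parameter.
# Assumption: the name is upper-cased and hyphens become underscores.
param_to_env() {
  printf 'PROMETHEUS_RDS_EXPORTER_%s\n' "$(printf '%s' "$1" | tr 'a-z-' 'A-Z_')"
}

param_to_env debug
param_to_env collect-quotas
```

Under that assumption, disabling quota collection through the environment would look like export PROMETHEUS_RDS_EXPORTER_COLLECT_QUOTAS=false.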
Prometheus RDS exporter needs read-only AWS IAM permissions to fetch metrics from the AWS RDS, CloudWatch, EC2, and Service Quotas APIs.
Standard AWS authentication methods (AWS credentials, SSO and assumed role) are supported, see https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html.
If you are running on AWS EKS, we strongly recommend using IRSA (IAM Roles for Service Accounts).
Minimal required IAM permissions
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowInstanceAndLogDescriptions",
      "Effect": "Allow",
      "Action": [
        "rds:DescribeDBInstances",
        "rds:DescribeDBLogFiles"
      ],
      "Resource": [
        "arn:aws:rds:*:*:db:*"
      ]
    },
    {
      "Sid": "AllowMaintenanceDescriptions",
      "Effect": "Allow",
      "Action": [
        "rds:DescribePendingMaintenanceActions"
      ],
      "Resource": "*"
    },
    {
      "Sid": "AllowGettingCloudWatchMetrics",
      "Effect": "Allow",
      "Action": [
        "cloudwatch:GetMetricData"
      ],
      "Resource": "*"
    },
    {
      "Sid": "AllowRDSUsageDescriptions",
      "Effect": "Allow",
      "Action": [
        "rds:DescribeAccountAttributes"
      ],
      "Resource": "*"
    },
    {
      "Sid": "AllowQuotaDescriptions",
      "Effect": "Allow",
      "Action": [
        "servicequotas:GetServiceQuota"
      ],
      "Resource": "*"
    },
    {
      "Sid": "AllowInstanceTypeDescriptions",
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstanceTypes"
      ],
      "Resource": "*"
    }
  ]
}
For convenience, you can download it using:
curl \
--fail \
--silent \
--write-out "Response code: %{response_code}\n" \
https://raw.githubusercontent.com/qonto/prometheus-rds-exporter/main/configs/aws/policy.json \
-o /tmp/prometheus-rds-exporter.policy.json
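After downloading the policy, it can be useful to review which API actions it grants before creating it in IAM. The following sketch relies only on grep, tr, and sort. list_policy_actions is a hypothetical helper, and the inline file is a shortened stand-in for the real policy, used here so the example runs without network access.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the unique "service:Action" entries of an IAM
# policy file, one per line, without requiring jq.
list_policy_actions() {
  grep -oE '"[a-z0-9]+:[A-Za-z]+"' "$1" | tr -d '"' | sort -u
}

# Shortened stand-in for /tmp/prometheus-rds-exporter.policy.json.
policy_file="$(mktemp)"
cat > "$policy_file" << 'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["rds:DescribeDBInstances", "rds:DescribeDBLogFiles"],
      "Resource": ["arn:aws:rds:*:*:db:*"]
    },
    {
      "Effect": "Allow",
      "Action": ["cloudwatch:GetMetricData"],
      "Resource": "*"
    }
  ]
}
EOF

list_policy_actions "$policy_file"
```

With the real file, the call would be list_policy_actions /tmp/prometheus-rds-exporter.policy.json. The pattern deliberately ignores resource ARNs, which contain more than one colon.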
Terraform users can use the example Terraform code in configs/terraform/.
We recommend deploying with Helm.
See all available configuration parameters in configs/helm/values.yaml
See the Development environment to start the Prometheus RDS exporter, Prometheus, and Grafana with dashboards in a minute.
Note
If you use Istio and run Prometheus in the istio-system namespace, complete the Istio-specific steps described later in this document first.
Recommended method to deploy on AWS EKS using IRSA and Helm.
Important
You need a Prometheus Operator already installed in your cluster.
- Create an IAM policy

IAM_POLICY_NAME=prometheus-rds-exporter

# Download policy payload
curl --fail --silent --write-out "Response code: %{response_code}\n" https://raw.githubusercontent.com/qonto/prometheus-rds-exporter/main/configs/aws/policy.json -o /tmp/prometheus-rds-exporter.policy.json

# Create IAM policy
aws iam create-policy --policy-name ${IAM_POLICY_NAME} --policy-document file:///tmp/prometheus-rds-exporter.policy.json

- Create and attach an IAM role to your EKS cluster

eksctl will create an IAM role and a Kubernetes service account.

EKS_CLUSTER_NAME=default # Replace with your EKS cluster name
KUBERNETES_NAMESPACE=monitoring # Replace with namespace of your choice
IAM_ROLE_NAME=prometheus-rds-exporter
KUBERNETES_SERVICE_ACCOUNT_NAME=prometheus-rds-exporter
AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)

eksctl \
  create iamserviceaccount \
  --cluster ${EKS_CLUSTER_NAME} \
  --namespace ${KUBERNETES_NAMESPACE} \
  --name ${KUBERNETES_SERVICE_ACCOUNT_NAME} \
  --role-name ${IAM_ROLE_NAME} \
  --attach-policy-arn arn:aws:iam::${AWS_ACCOUNT_ID}:policy/${IAM_POLICY_NAME} \
  --approve

- Deploy the exporter

PROMETHEUS_RDS_EXPORTER_VERSION=0.3.0 # Replace with latest version
SERVICE_ACCOUNT_ANNOTATION="arn:aws:iam::${AWS_ACCOUNT_ID}:role/${IAM_ROLE_NAME}"

helm upgrade \
  prometheus-rds-exporter \
  oci://public.ecr.aws/qonto/prometheus-rds-exporter-chart \
  --version ${PROMETHEUS_RDS_EXPORTER_VERSION} \
  --install \
  --namespace ${KUBERNETES_NAMESPACE} \
  --set serviceAccount.annotations."eks\.amazonaws\.com\/role-arn"="${SERVICE_ACCOUNT_ANNOTATION}" \
  --set serviceAccount.name="${IAM_ROLE_NAME}"
- Optional. Customize Prometheus exporter settings

Download Helm chart default values:

helm show values oci://public.ecr.aws/qonto/prometheus-rds-exporter-chart --version ${PROMETHEUS_RDS_EXPORTER_VERSION} > values.yaml

Customize settings:

vim values.yaml

Example to enable debug via the PROMETHEUS_RDS_EXPORTER_DEBUG environment variable:

yq --inplace '.env += {"PROMETHEUS_RDS_EXPORTER_DEBUG": "true"}' values.yaml

Update Helm deployment:

helm upgrade \
  prometheus-rds-exporter \
  oci://public.ecr.aws/qonto/prometheus-rds-exporter-chart \
  --version ${PROMETHEUS_RDS_EXPORTER_VERSION} \
  --install \
  --namespace ${KUBERNETES_NAMESPACE} \
  --set serviceAccount.annotations."eks\.amazonaws\.com\/role-arn"="${SERVICE_ACCOUNT_ANNOTATION}" \
  --set serviceAccount.name="${IAM_ROLE_NAME}" \
  --values values.yaml
- Grant IAM permissions to the EC2 instance, using the following steps:

  - Create IAM role

IAM_ROLE_NAME=prometheus-rds-exporter

cat > ec2-role-trust-policy.json << EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com"},
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

aws iam create-role --role-name ${IAM_ROLE_NAME} --assume-role-policy-document file://ec2-role-trust-policy.json

  - Create IAM policy

IAM_POLICY_NAME=prometheus-rds-exporter

# Download Prometheus RDS exporter required IAM permissions
curl --fail --silent --write-out "Response code: %{response_code}\n" https://raw.githubusercontent.com/qonto/prometheus-rds-exporter/main/configs/aws/policy.json -o prometheus-rds-exporter.policy.json

# Create IAM policy
aws iam create-policy --policy-name ${IAM_POLICY_NAME} --policy-document file://prometheus-rds-exporter.policy.json

# Attach IAM policy to IAM role
AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)
IAM_POLICY_ARN=arn:aws:iam::${AWS_ACCOUNT_ID}:policy/${IAM_POLICY_NAME}
aws iam attach-role-policy --role-name ${IAM_ROLE_NAME} --policy-arn ${IAM_POLICY_ARN}

  - Create an IAM instance profile

EC2_INSTANCE_PROFILE_NAME="prometheus-rds-exporter"

# Create IAM instance profile
aws iam create-instance-profile --instance-profile-name ${EC2_INSTANCE_PROFILE_NAME}

# Attach IAM role to IAM instance profile
aws iam add-role-to-instance-profile --instance-profile-name ${EC2_INSTANCE_PROFILE_NAME} --role-name ${IAM_ROLE_NAME}

  - Attach the IAM instance profile to the EC2 instance

EC2_INSTANCE_ID="i-1234567890abcdef0" # Replace with your AWS instance ID

aws ec2 associate-iam-instance-profile \
  --instance-id ${EC2_INSTANCE_ID} \
  --iam-instance-profile Name="${EC2_INSTANCE_PROFILE_NAME}"
- Download the Debian package

PROMETHEUS_RDS_EXPORTER_VERSION=0.3.0 # Replace with latest version
PACKAGE_NAME=prometheus-rds-exporter_${PROMETHEUS_RDS_EXPORTER_VERSION}_$(uname -m).deb
wget https://github.com/qonto/prometheus-rds-exporter/releases/download/${PROMETHEUS_RDS_EXPORTER_VERSION}/${PACKAGE_NAME}

- Install package

Prometheus RDS exporter will be automatically started as a service.

dpkg -i ${PACKAGE_NAME}

- Optional, customize configuration

# Copy configuration template
cp /usr/share/prometheus-rds-exporter/prometheus-rds-exporter.yaml.sample /var/lib/prometheus-rds-exporter/prometheus-rds-exporter.yaml

# Edit configuration
vim /var/lib/prometheus-rds-exporter/prometheus-rds-exporter.yaml

# Restart service
systemctl restart prometheus-rds-exporter
- Binary

PROMETHEUS_RDS_EXPORTER_VERSION=0.3.0 # Replace with latest version
TARBALL_NAME=prometheus-rds-exporter_Linux_$(uname -m).tar.gz
wget https://github.com/qonto/prometheus-rds-exporter/releases/download/${PROMETHEUS_RDS_EXPORTER_VERSION}/${TARBALL_NAME}
tar xvzf ${TARBALL_NAME}

- Optional, customize configuration

vim prometheus-rds-exporter.yaml

- Start the exporter

./prometheus-rds-exporter
- Connect to AWS with any supported method

aws configure

- Start the application

docker run -p 9043:9043 -e AWS_PROFILE=${AWS_PROFILE} -v $HOME/.aws:/app/.aws public.ecr.aws/qonto/prometheus-rds-exporter:latest
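Once the container is running, a quick way to confirm that the exporter can reach AWS is to look at the up metric on the metrics endpoint. The helper below is an illustrative sketch (exporter_is_up is a hypothetical name). It assumes the metric is exposed without labels, as listed in the metrics table, and it reads the Prometheus text format from standard input so it can be tried without a running exporter.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the scraped text contains "up 1".
exporter_is_up() {
  awk '
    $1 == "up" { found = 1; ok = ($2 == 1) }
    END { exit (found && ok) ? 0 : 1 }
  '
}

# Made-up sample output, for illustration only.
printf '# TYPE up gauge\nup 1\n' | exporter_is_up && echo "exporter is up"
```

Against a live exporter on the default port and path, the check becomes curl --fail --silent http://localhost:9043/metrics | exporter_is_up.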
If you use Istio and run Prometheus in the istio-system namespace, make the following changes before following the installation instructions.

- Get the values.yaml for your currently deployed Prometheus system (ex: helm get values RELEASE_NAME [flags])

- Edit the values: under additionalScrapeConfigs, insert an additional job_name:

- job_name: prometheus-rds-exporter
  kubernetes_sd_configs:
    - namespaces:
        names:
          - monitoring
      role: endpoints

- Apply the edited values (ex: helm upgrade prometheus prometheus-community/kube-prometheus-stack -n istio-system -f values.yaml --version 62.2.1). Add the repo to Helm if you haven't already, and change the repo if you're using another version.
percona/rds_exporter and mtanda/rds_enhanced_monitoring_exporter are great alternatives.
prometheus/cloudwatch_exporter can be used to collect additional CloudWatch metrics.
See CONTRIBUTING.md.
To report a security issue, please visit SECURITY.md
You can start a simple development environment using the Docker Compose configuration in /scripts/prometheus.
It will start Grafana (with the dashboards), Prometheus, and the RDS exporter:
- Connect to AWS using the AWS CLI

- Launch the development stack

cd scripts/prometheus
docker compose up --build

- Connect to the services
  - Grafana: http://localhost:3000 (credentials: admin/hackme)
  - Prometheus: http://localhost:9090
  - Prometheus RDS exporter: http://localhost:9043
Execute Go tests:
make test
Execute Helm chart tests:
make helm-test # Helm unit test
make kubeconform # Kubernetes manifest validation
make checkcov # Check misconfigurations
Prometheus RDS Exporter includes an OpenTelemetry trace exporter to facilitate troubleshooting.
Traces can be forwarded to any OpenTelemetry server using the gRPC protocol.

- Export the OTEL_EXPORTER_OTLP_ENDPOINT variable

export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317

See OTEL SDK configuration and OpenTelemetry environment variables for all options.

- Start the exporter with OpenTelemetry enabled

prometheus-rds-exporter --enable-otel-traces
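When tracing is enabled from a wrapper script, it can help to fail early if no endpoint was configured. The pre-flight check below is only a sketch. require_otel_endpoint is a hypothetical helper and not something the exporter requires, and treating an unset variable as an error is a design choice made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: return non-zero when no OTLP endpoint is set.
require_otel_endpoint() {
  if [ -z "${OTEL_EXPORTER_OTLP_ENDPOINT:-}" ]; then
    echo "OTEL_EXPORTER_OTLP_ENDPOINT is not set" >&2
    return 1
  fi
  echo "Sending traces to ${OTEL_EXPORTER_OTLP_ENDPOINT}"
}

export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
require_otel_endpoint
```

A wrapper could then chain the two commands: require_otel_endpoint && prometheus-rds-exporter --enable-otel-traces.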