This repository has been archived by the owner on Sep 21, 2021. It is now read-only.

awslabs/amazon-ec2-instance-qualifier

Amazon EC2 Instance Qualifier

A CLI tool that automates benchmarking on a range of EC2 instance types.


Summary

How do users know which EC2 instance types are compatible with their application? Currently, there exists no tooling or baselining of any kind provided by AWS. If a user wants to see which of the 250+ different instance types are acceptable, then the user must spin up each instance type individually and test their application’s performance. Spot users often find themselves asking this question when they are told to utilize as many different instance types as possible in order to reduce the chance of spot interruptions. Still, most users will only ever choose a small subset of what could be acceptable due to the pain and cost of manual testing.

The instance qualifier is an open source command line tool that automates benchmarking on a range of EC2 instance types. The user provides a test suite and a list of EC2 instance types through the CLI. Instance qualifier then runs the test suite on all designated types, tests against multiple metrics, and outputs the results in a user-friendly format. In this way, instance qualifier automates testing across instance types and addresses a severe pain point for spot users and EC2 users looking to venture into other instance type families.

Major Features

  • Executes test suite on a range of EC2 instance types in parallel and persists test results and execution times
  • Installs and configures CloudWatch Agent on each instance type for capturing benchmark data
    • Instance-Qualifier uses the following for benchmarking: cpu_usage_active and mem_used_percent
    • More information on these metrics can be found here
  • Provides an ingress point for users to add their own logic to be executed in instance user data via --custom-script flag
  • Supports asynchronous functionality, which means users can exit the CLI after tests begin and resume the session at a later time to fetch the results
  • Uses AWS CloudFormation to manage all resources
  • Creates an S3 bucket to store test results, instance logs, user configuration and CloudFormation template
  • Implements mechanisms to ensure infrastructure deletion for various edge cases

Impact to AWS Account

  • The CLI creates a CloudFormation stack with a series of resources during the run and deletes the stack at the end by default. Resources include:
    • A VPC + Subnet + Internet Gateway: used to launch instances. Note that they are only created if you don't specify vpc/subnet flags or provide invalid ones
    • A Security Group: same as the default security group when you create one using AWS Console. It has an inbound rule that opens all ports for all traffic and all protocols, but only when the source is within the same security group. With this rule, the instances can access the bucket, but won't be affected by any traffic coming from outside the security group
    • An IAM Role: attached with AmazonS3FullAccess and CloudWatchAgentServerPolicy policies to allow instances to access the bucket and emit CloudWatch metrics, respectively
    • Launch Templates: used to launch auto scaling group and instances
    • An Auto Scaling Group: all instances are managed through an auto scaling group so that a one-time action can be scheduled to terminate every instance in the group after the timeout, which ensures the user is not excessively charged
    • EC2 Instances
  • An S3 bucket containing the raw data of an Instance-Qualifier run is also created; however, this artifact is persisted by default
  • A sample of this CloudFormation stack can be found here
  • If a fatal error occurs or the user presses Ctrl-C during the run, the CLI deletes the resources appropriately. Note that if the CLI is interrupted after the tests have begun on all instances, it assumes the user may resume the session later and therefore does not delete any resources
  • No impact to any original resources or settings of the AWS account

Disclaimer: All associated costs are the user's responsibility.

Configuration

To execute the CLI, you will need AWS credentials configured. Take a look at the AWS CLI configuration documentation for details on the various ways to configure credentials. An easy way to try out the ec2-instance-qualifier CLI is to populate the following environment variables with your AWS API credentials.

export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."

If you already have an AWS CLI profile set up, you can pass it directly to ec2-instance-qualifier:

$ ./ec2-instance-qualifier --instance-types=m4.large --test-suite=test-folder --cpu-threshold=30 --mem-threshold=30 --profile=my-aws-cli-profile --region=us-east-1

You can set the AWS_REGION environment variable if you don't want to pass in --region on each run.

export AWS_REGION="us-east-1"

Examples

Note: the working directory where you execute ec2-instance-qualifier must contain the agent binary file

All CLI Options

$ ./ec2-instance-qualifier --help
ec2-instance-qualifier is a CLI tool that automates testing on a range of EC2 instance types.
Provided with a test suite and a list of EC2 instance types, ec2-instance-qualifier will then
run the input on all designated types, test against multiple metrics, and output the results
in a user friendly format

Usage:
  ec2-instance-qualifier [flags]

Examples:
./ec2-instance-qualifier --instance-types=m4.large,c5.large,m4.xlarge --test-suite=path/to/test-folder --cpu-threshold=30 --mem-threshold=30 --vpc=vpc-294b9542 --subnet=subnet-4879bf23 --timeout=2400
./ec2-instance-qualifier --instance-types=m4.xlarge,c1.large,c5.large --test-suite=path/to/test-folder --cpu-threshold=30 --mem-threshold=30 --profile=default
./ec2-instance-qualifier --bucket=qualifier-Bucket-123456789abcdef

Flags:
  -ami string
        [OPTIONAL] ami id
  -bucket string
        [OPTIONAL] the name of the Bucket created in the last run. When provided with this flag, the CLI won't create new resources, but try to grab test results from the Bucket. If you provide this flag, you don't need to specify any required flags
  -config-file string
        [OPTIONAL] path to config file for cli input parameters in JSON
  -cpu-threshold int
        [REQUIRED] % cpu utilization that should not be exceeded measured by cpu_usage_active. ex: 30 means instances using 30% or less CPU SUCCEED
  -custom-script string
        [OPTIONAL] path to Bash script to be executed on instance-types BEFORE agent runs test-suite and monitoring
  -instance-types string
        [REQUIRED] comma-separated list of instance-types to test
  -mem-threshold int
        [REQUIRED] % of memory used that should not be exceeded measured by mem_used_percent. ex: 30 means instances using 30% or less MEM SUCCEED
  -persist
        [OPTIONAL] set to true if you'd like the tool to keep the CloudFormation stack after the run. Default is deleting the stack
  -profile string
        [OPTIONAL] AWS CLI Profile to use for credentials and config
  -region string
        [OPTIONAL] AWS Region to use for API requests
  -subnet string
        [OPTIONAL] subnet id
  -test-suite string
        [REQUIRED] folder containing test files to execute
  -timeout int
        [OPTIONAL] max seconds for test-suite execution on instances (default 3600)
  -vpc string
        [OPTIONAL] vpc id
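
The help text above marks four flags as [REQUIRED] and notes that they can be omitted when the bucket flag is given. A minimal sketch of that rule follows, written as a wrapper that assembles an argument list. `build_args` and its dictionary input are hypothetical and not part of ec2-instance-qualifier; only flag names shown in the help output are used, and treating `persist` as a bare boolean switch is an assumption.

```python
# Hypothetical helper (not part of ec2-instance-qualifier): assembles an
# argument list using only flag names shown in the --help output.
REQUIRED = ("instance-types", "test-suite", "cpu-threshold", "mem-threshold")


def build_args(options, binary="./ec2-instance-qualifier"):
    """Return an argv list for the CLI, enforcing the [REQUIRED] flags.

    Per the help text, required flags may be omitted when bucket is given.
    """
    if not options.get("bucket"):
        missing = [name for name in REQUIRED if name not in options]
        if missing:
            raise ValueError("missing required flags: " + ", ".join(missing))
    args = [binary]
    for name, value in options.items():
        if isinstance(value, bool):
            # Assumption: boolean flags such as persist are bare switches,
            # so they are emitted only when true.
            if value:
                args.append("--" + name)
        else:
            args.append("--{}={}".format(name, value))
    return args


if __name__ == "__main__":
    print(" ".join(build_args({
        "instance-types": "m4.large,c5.large",
        "test-suite": "path/to/test-folder",
        "cpu-threshold": 30,
        "mem-threshold": 30,
    })))
```

The returned list can be passed to `subprocess.run` as-is, which avoids shell quoting issues with paths in `test-suite` or `custom-script`.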

Example 1: Test against m4.large and m4.xlarge with CPU and memory thresholds of 80% and 10%, respectively, in an existing VPC and subnet (logs not included in output below)

$ ./ec2-instance-qualifier --instance-types=m4.large,m4.xlarge --test-suite=test-folder --cpu-threshold=80 --mem-threshold=10 --timeout=3600 --vpc=vpc-294b9542 --subnet=subnet-4879bf23
Region Used: us-east-2
Test Run ID: opcfxoss0uyxym4
Bucket Created: qualifier-bucket-opcfxoss0uyxym4
Stack Created: qualifier-stack-opcfxoss0uyxym4
The execution of test suite has been kicked off on all instances. You may quit now and later run the CLI again with the bucket name flag to get the result
+---------------+---------+------------------+---------------+------------------+---------------+-----------------+----------------------------+
| INSTANCE TYPE | STATUS  | CPU_USAGE_ACTIVE | CPU_THRESHOLD | MEM_USED_PERCENT | MEM_THRESHOLD | ALL TESTS PASS? | TOTAL EXECUTION TIME (sec) |
+---------------+---------+------------------+---------------+------------------+---------------+-----------------+----------------------------+
|   m4.xlarge   | SUCCESS |      50.11       |     80.00     |       0.82       |     10.00     |      true       |           130.71           |
+---------------+---------+------------------+---------------+------------------+---------------+-----------------+----------------------------+
|   m4.large    |  FAIL   |      100.00      |     80.00     |       1.44       |     10.00     |      true       |           130.70           |
+---------------+---------+------------------+---------------+------------------+---------------+-----------------+----------------------------+



Detailed test results can be found in s3://qualifier-bucket-opcfxoss0uyxym4/Instance-Qualifier-Run-opcfxoss0uyxym4
User configuration and CloudFormation template are stored in the root directory of the bucket. You may check them if you want
The process of cleaning up stack resources has started. You can quit now
Completed!

A unique ID is assigned to each test run, and the bucket and stack names both contain that ID. The results show that all instances ran the whole test suite successfully, but only m4.xlarge stayed within both the CPU and memory thresholds.
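
When the CLI output is captured by a script, the results table can be turned into records for filtering. The following is a rough sketch that assumes the table layout shown in the example output (pipe-separated cells, border lines starting with `+`); `parse_results` and `passing_types` are hypothetical names, not part of the tool.

```python
def parse_results(output):
    """Parse the ASCII results table into a list of dicts keyed by header."""
    rows = []
    header = None
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines (+---+) and ordinary log lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first pipe-delimited line is the header row
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def passing_types(rows):
    """Instance types that met both thresholds and passed every test."""
    return [
        row["INSTANCE TYPE"]
        for row in rows
        if row["STATUS"] == "SUCCESS" and row["ALL TESTS PASS?"] == "true"
    ]
```

All cell values are kept as strings; convert columns such as CPU_USAGE_ACTIVE with `float()` if numeric comparison is needed.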

Example 2: Same as Example 1, but using a config file instead of CLI args

$ cat iq-config.json
{
	"instance-types": "m4.large,m4.xlarge",
	"test-suite": "test-folder",
	"cpu-threshold": 80,
	"mem-threshold": 10,
	"vpc": "vpc-294b9542",
	"subnet": "subnet-4879bf23",
	"ami": "",
	"timeout": 3600,
	"persist": false,
	"profile": "",
	"region": "us-east-2",
	"bucket": "",
	"custom-script": ""
}


$ ./ec2-instance-qualifier --config-file=iq-config.json
(Same output as Example 1)
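
A config file of this shape can also be generated from a script. The sketch below is illustrative only: `write_config` is a hypothetical helper, the key names and defaults are copied from the example file above, and because it is not confirmed here whether the CLI accepts a file with optional keys omitted, every key is always written.

```python
import json

# Keys and default values copied from the example iq-config.json.
DEFAULTS = {
    "instance-types": "",
    "test-suite": "",
    "cpu-threshold": 0,
    "mem-threshold": 0,
    "vpc": "",
    "subnet": "",
    "ami": "",
    "timeout": 3600,
    "persist": False,
    "profile": "",
    "region": "",
    "bucket": "",
    "custom-script": "",
}


def write_config(path, overrides):
    """Write a config file with every known key; return the dict written."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        # Reject typos early instead of writing a key the CLI may ignore.
        raise KeyError("unknown config keys: " + ", ".join(sorted(unknown)))
    config = dict(DEFAULTS)
    config.update(overrides)
    with open(path, "w") as handle:
        json.dump(config, handle, indent="\t")
        handle.write("\n")
    return config
```

For instance, `write_config("iq-config.json", {"instance-types": "m4.large,m4.xlarge", "test-suite": "test-folder", "cpu-threshold": 80, "mem-threshold": 10})` produces a file that can then be passed with the config-file flag.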

Example 3: Prompt due to an instance-type not supporting AMI

$ ./ec2-instance-qualifier --instance-types=m4.xlarge,a1.large --test-suite=test-folder --cpu-threshold=95 --mem-threshold=30
Region Used: us-east-2
Test Run ID: n3lytbolzfaq3np
Bucket Created: qualifier-bucket-n3lytbolzfaq3np
Instance types [a1.large] are not supported due to AMI or Availability Zone. Do you want to proceed with the rest instance types [m4.xlarge] ? y/N
y
Stack Created: qualifier-stack-n3lytbolzfaq3np
The execution of test suite has been kicked off on all instances. You may quit now and later run the CLI again with the bucket name flag to get the result

The default AMI (Amazon Linux 2) is not compatible with a1.large architecture; therefore, the CLI prompts the user whether to continue the instance-qualifier run with compatible instance types only.

Example 3.5: Exit CLI after stack creation, then resume

(...continued from above)
The execution of test suite has been kicked off on all instances. You may quit now and later run the CLI again with the bucket name flag to get the result
^C

$ ./ec2-instance-qualifier --bucket=qualifier-bucket-n3lytbolzfaq3np
Region Used: us-east-2
Test Run ID: n3lytbolzfaq3np
Bucket Used: qualifier-bucket-n3lytbolzfaq3np
+---------------+---------+------------------+---------------+------------------+---------------+-----------------+----------------------------+
| INSTANCE TYPE | STATUS  | CPU_USAGE_ACTIVE | CPU_THRESHOLD | MEM_USED_PERCENT | MEM_THRESHOLD | ALL TESTS PASS? | TOTAL EXECUTION TIME (sec) |
+---------------+---------+------------------+---------------+------------------+---------------+-----------------+----------------------------+
|   m4.xlarge   | SUCCESS |      50.11       |     95.00     |       0.82       |     30.00     |      true       |           130.71           |
+---------------+---------+------------------+---------------+------------------+---------------+-----------------+----------------------------+


Detailed test results can be found in s3://qualifier-bucket-n3lytbolzfaq3np/Instance-Qualifier-Run-n3lytbolzfaq3np
User configuration and CloudFormation template are stored in the root directory of the bucket. You may check them if you want
The process of cleaning up stack resources has started. You can quit now
^C

The CLI is interrupted after the tests began executing on the instances, then resumed by providing the bucket flag. Quitting before the "You may quit now" message appears results in both the CloudFormation stack and the S3 bucket being deleted.
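
To resume a session from a script, only the bucket name needs to be remembered, since the example output suggests the other names follow from the run ID. The sketch below assumes the naming pattern seen in the sample output (`qualifier-bucket-<id>`, `qualifier-stack-<id>`, `Instance-Qualifier-Run-<id>`); that pattern is inferred from the examples rather than documented, and both function names are hypothetical.

```python
BUCKET_PREFIX = "qualifier-bucket-"
STACK_PREFIX = "qualifier-stack-"


def run_artifacts(run_id):
    """Derive bucket, stack, and results location from a test run ID.

    Assumption: names follow the pattern visible in the example output.
    """
    bucket = BUCKET_PREFIX + run_id
    return {
        "bucket": bucket,
        "stack": STACK_PREFIX + run_id,
        "results": "s3://{}/Instance-Qualifier-Run-{}".format(bucket, run_id),
    }


def run_id_from_bucket(bucket):
    """Recover the run ID from a bucket name created by an earlier run."""
    if not bucket.startswith(BUCKET_PREFIX) or len(bucket) == len(BUCKET_PREFIX):
        raise ValueError("not a qualifier bucket name: " + bucket)
    return bucket[len(BUCKET_PREFIX):]
```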

Interpreting Results

Table Headers

  • INSTANCE TYPE: instance type
  • STATUS: SUCCESS if max CPU and max MEM do not exceed their respective thresholds; FAIL otherwise
  • CPU_USAGE_ACTIVE: max cpu_usage_active recorded (p100) over the duration of instance-qualifier run
  • CPU_THRESHOLD: cpu threshold set by user
  • MEM_USED_PERCENT: max mem_used_percent recorded (p100) over the duration of instance-qualifier run
  • MEM_THRESHOLD: mem threshold set by user
  • ALL TESTS PASS?: true if all tests execute successfully (without an error code); false otherwise
  • TOTAL EXECUTION TIME: how long it took the instance to execute all tests in seconds
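
The relationship between the metric columns and STATUS can be sketched as a small function. This is an illustration of the rule described in the list, not the tool's actual code: `evaluate` is a hypothetical name, the inputs are assumed to be lists of percentage samples, and the comparison is inclusive because the flag help states that usage of "30% or less" succeeds for a threshold of 30.

```python
def evaluate(cpu_samples, mem_samples, cpu_threshold, mem_threshold):
    """Return (status, max_cpu, max_mem) for one instance type.

    The maximum (p100) of each metric over the run is compared with its
    threshold; both must be at or below the threshold for SUCCESS.
    """
    max_cpu = max(cpu_samples)
    max_mem = max(mem_samples)
    within = max_cpu <= cpu_threshold and max_mem <= mem_threshold
    return ("SUCCESS" if within else "FAIL", max_cpu, max_mem)
```

Note that STATUS is independent of ALL TESTS PASS?: an instance type can run every test without error and still exceed a threshold.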

Building

For build instructions please consult BUILD.md.

Communication

If you've run into a bug or have a new feature request, please open an issue.

Contributing

Contributions are welcome! Please read our guidelines and our Code of Conduct.

License

This project is licensed under the Apache-2.0 License.
