Example AWS infrastructure for deploying the load balancing exporter Collector and tail sampling processor Collector.
To learn about sampling and the different sampling options available in OpenTelemetry, see: https://opentelemetry.io/docs/concepts/sampling/
This example focuses on tail (a.k.a. tail-aware) sampling.
Head sampling should be applied transparently further up the signal chain, close to the application and its instrumentation.
# install pyenv
curl https://pyenv.run | bash
# in this repository
# set the local python version in use
pyenv local $(head -1 .python-version)
# create a virtual environment
python -m venv venv
# activate the environment
source venv/bin/activate
# upgrade pip
python -m pip install --upgrade pip
# install dependencies
python -m pip install -r requirements/development.txt
# install cfn-lint
python -m pip install cfn-lint
# install checkov
python -m pip install checkov
# install aws cli
python -m pip install awscli
# configure aws cli
# (AWS access key ID and secret access key required)
aws configure
# ensure that URLs passed as parameter values are used as literal strings and not fetched by the CLI
aws configure set cli_follow_urlparam false
# AWS permission policies required for the user
* EC2
* ECS
* CloudFormation
* CloudMap
* S3
* SSM
* IAM (Read-Only)
# run linting checks on the template
cfn-lint --template ./stack/cloudformation/main.yaml --region eu-west-1
# run security and compliance checks on the template
checkov --config-file .checkov.yaml -d ./stack/cloudformation
# create the bucket for the stacks
STACK_NAME=StackBucket \
STACK_FILE_NAME=stackbucket.yaml \
./scripts/create_stack.bash
# publish telemetry backend API key, endpoint, user and token as secure parameters
PARAMETER_KEY_NAME=/otel/collector/configuration/telemetry-backend-api-key \
PARAMETER_VALUE="${TELEMETRY_API_KEY}" \
SECURE_STRING=1 \
./scripts/create_update_system_variable.bash
PARAMETER_KEY_NAME=/otel/collector/configuration/telemetry-backend-endpoint \
PARAMETER_VALUE="${TELEMETRY_BACKEND_ENDPOINT}" \
SECURE_STRING=1 \
./scripts/create_update_system_variable.bash
PARAMETER_KEY_NAME=/otel/collector/configuration/telemetry-backend-user \
PARAMETER_VALUE="${TELEMETRY_BACKEND_USER}" \
SECURE_STRING=1 \
./scripts/create_update_system_variable.bash
PARAMETER_KEY_NAME=/otel/collector/configuration/telemetry-backend-token \
PARAMETER_VALUE="${TELEMETRY_BACKEND_TOKEN}" \
SECURE_STRING=1 \
./scripts/create_update_system_variable.bash
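The secure-parameter commands above read their values from four environment variables. If one of them is unset, an empty value is passed to the script. A minimal pre-flight sketch follows; `require_env` is a hypothetical helper that is not part of this repository, and it only checks that each named variable is set and non-empty.

```shell
# Hypothetical helper (not part of this repository):
# report every named variable that is unset or empty.
require_env() {
  local name value missing
  missing=0
  for name in "$@"; do
    # Indirect lookup by name; only pass trusted variable names here.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage, before publishing the secure parameters:
#   require_env TELEMETRY_API_KEY TELEMETRY_BACKEND_ENDPOINT \
#     TELEMETRY_BACKEND_USER TELEMETRY_BACKEND_TOKEN || exit 1
```

The function lists every missing variable before returning a non-zero status, so all gaps can be fixed in one pass.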
# publish Collector configuration as parameters
PARAMETER_KEY_NAME=/otel/collector/configuration/loadbalancing-collector-conf-map-type \
PARAMETER_VALUE="env:COLLECTOR_CONFIGURATION" \
./scripts/create_update_system_variable.bash
PARAMETER_KEY_NAME=/otel/collector/configuration/loadbalancing-collector-configuration \
PARAMETER_VALUE="$(FILE_NAME=loadbalancing-collector-configuration.yaml \
./scripts/convert_file_content_to_string.bash)" \
./scripts/create_update_system_variable.bash
PARAMETER_KEY_NAME=/otel/collector/configuration/tailaware-collector-conf-map-type \
PARAMETER_VALUE="env:COLLECTOR_CONFIGURATION" \
./scripts/create_update_system_variable.bash
PARAMETER_KEY_NAME=/otel/collector/configuration/tailaware-collector-configuration \
PARAMETER_VALUE="$(FILE_NAME=tailaware-collector-configuration.yaml \
./scripts/convert_file_content_to_string.bash)" \
./scripts/create_update_system_variable.bash
# create a parameter-overrides string
STACK_BUCKET_NAME=stack-bucket \
PARAMETER_OVERRIDES_STRING="$(STACK_BUCKET_NAME=${STACK_BUCKET_NAME} \
./scripts/generate_parameter_overrides_string.bash \
main_parameters.yaml \
dev)"
# create the main stack
STACK_BUCKET_NAME=stack-bucket \
STACK_NAME=Main \
STACK_FILE_NAME=main.yaml \
PARAMETER_OVERRIDES_STRING="${PARAMETER_OVERRIDES_STRING}" \
./scripts/create_stack.bash
# check collector functionality (replace ALB_ADDRESS with the address of the load balancer)
curl -X POST -H "Content-Type: application/json" -d @collector/payload/traces.json -i ALB_ADDRESS/v1/traces
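The check above prints the full response and leaves interpretation to the reader. As a sketch, the same request can be wrapped so that only the HTTP status code decides the outcome. The function names below are hypothetical and not part of this repository. The sketch assumes a plain HTTP listener when no scheme is given (which is also curl's default), and it assumes that any 2xx status means the Collector accepted the payload.

```shell
# Hypothetical helpers (not part of this repository).

# Build the OTLP/HTTP traces URL from a load balancer address.
otlp_traces_url() {
  local address
  address="${1%/}"                      # drop one trailing slash, if any
  case "$address" in
    http://*|https://*) ;;              # keep an explicit scheme
    *) address="http://${address}" ;;   # assumption: plain HTTP listener
  esac
  printf '%s/v1/traces\n' "$address"
}

# Succeed only when the HTTP status code is in the 2xx range.
is_success_status() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Post the sample payload and report the outcome (needs network access).
check_collector() {
  local url status
  url="$(otlp_traces_url "$1")"
  status="$(curl -s -o /dev/null -w '%{http_code}' -X POST \
    -H 'Content-Type: application/json' \
    -d @collector/payload/traces.json "$url")"
  if is_success_status "$status"; then
    echo "collector accepted the payload (HTTP ${status})"
  else
    echo "collector check failed (HTTP ${status})" >&2
    return 1
  fi
}

# Usage:
#   check_collector "${ALB_ADDRESS}"
```

When the connection itself fails, curl reports the status as `000`, which the helper treats as a failure.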
# install CloudFormation to Terraform transformer
# https://github.com/DontShaveTheYak/cf2tf
python -m pip install cf2tf
# convert cf to tf
cf2tf ./stack/cloudformation/main.yaml -o ./stack/terraform/main
# close the environment when done
deactivate
OpenTelemetry (OTEL) is a complete solution for describing systems via data. It is modular, extensible, powerful, and adheres to industry standards. The tooling provided by OpenTelemetry, as well as the specification, helps system designers deliver on the ideal of an 'observable' system.
The choice of telemetry back-end is entirely up to user preference. Several managed, all-in-one systems store telemetry data, provide a query engine for it, and visualise it (aimed at various levels of user maturity):
- Grafana Cloud is the most complete telemetry solution on the market, fully geared towards OTEL and aimed at mature users
- New Relic is a huge product, aimed at novice users, which is not geared towards OTEL
- Datadog is a huge product, aimed at novice users, which is not geared towards OTEL
A fully self-managed solution can be obtained by leveraging:
- Jaeger: query engine over span and metric storage, and telemetry visualisation
- NOTE: Grafana is also - in its basic form - a visualisation tool
- Cassandra: span storage
- Prometheus: metric storage
A fantastic way forward is to get a Grafana Cloud account and create an access policy for it that allows ingesting OTLP spans, metrics and logs. That access policy can then be used as the authentication mechanism for exporting OTLP from a deployed Collector solution (like this one).
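As an illustration of how such credentials are typically presented, the sketch below builds an HTTP Basic `Authorization` header from a user and a token. It assumes the backend accepts Basic authentication and that the values correspond to `TELEMETRY_BACKEND_USER` and `TELEMETRY_BACKEND_TOKEN` as used earlier. `basic_auth_header` is a hypothetical helper, not part of this repository, and the Collector configuration here may handle authentication differently.

```shell
# Hypothetical helper (not part of this repository):
# print an HTTP Basic Authorization header for a user and a token.
basic_auth_header() {
  local encoded
  # base64 may wrap long output, so strip any newlines it adds.
  encoded="$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')"
  printf 'Authorization: Basic %s\n' "$encoded"
}

# Usage:
#   basic_auth_header "${TELEMETRY_BACKEND_USER}" "${TELEMETRY_BACKEND_TOKEN}"
```

The header value contains the token in a reversible encoding, so it should be treated as a secret in the same way as the token itself.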