Improve SLO Computer for DevOps automation #6

Open
wants to merge 1 commit into
base: master
29 changes: 29 additions & 0 deletions Dockerfile
@@ -0,0 +1,29 @@
FROM golang:1.16-alpine AS builder

WORKDIR /app

# Copy go mod and sum files
COPY go.mod go.sum ./

# Download dependencies
RUN go mod download

# Copy source code
COPY . .

# Build the application
RUN CGO_ENABLED=0 GOOS=linux go build -o slo-computer

# Use a minimal alpine image for the final image
FROM alpine:3.14

WORKDIR /app

# Copy the binary from the builder stage
COPY --from=builder /app/slo-computer /app/slo-computer

# Create a non-root user to run the application
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser

ENTRYPOINT ["/app/slo-computer"]
24 changes: 23 additions & 1 deletion Makefile
@@ -10,13 +10,15 @@ GOMOD=$(GOCMD) mod
BINARY_NAME=slo-computer
GO111MODULE=on
GOFLAGS=-mod=vendor
DOCKER_IMAGE=last9/slo-computer

.PHONY: all build clean test run deps vendor tidy help
.PHONY: all build clean test run deps vendor tidy help docker docker-push docker-run

all: deps build

build:
	@echo "Building SLO Computer..."
	GO111MODULE=$(GO111MODULE) $(GOMOD) tidy
	GO111MODULE=$(GO111MODULE) $(GOBUILD) -o $(BINARY_NAME) -v

clean:
@@ -44,6 +46,18 @@ tidy:
	@echo "Tidying dependencies..."
	GO111MODULE=$(GO111MODULE) $(GOMOD) tidy

docker:
	@echo "Building Docker image..."
	docker build -t $(DOCKER_IMAGE):latest .

docker-push: docker
	@echo "Pushing Docker image..."
	docker push $(DOCKER_IMAGE):latest

docker-run: docker
	@echo "Running Docker container..."
	docker run --rm $(DOCKER_IMAGE):latest

# Example targets for common commands
example-service:
	@echo "Running service SLO example..."
@@ -53,6 +67,10 @@ example-cpu:
	@echo "Running CPU burst example..."
	./$(BINARY_NAME) cpu-suggest --instance=t3a.xlarge --utilization=15

example-json:
	@echo "Running service SLO example with JSON output..."
	./$(BINARY_NAME) suggest --throughput=4200 --slo=99.9 --duration=720 --output=json

# Help command
help:
	@echo "SLO Computer Makefile"
@@ -66,8 +84,12 @@ help:
	@echo " make deps Ensure dependencies are downloaded"
	@echo " make vendor Create vendor directory"
	@echo " make tidy Tidy go.mod file"
	@echo " make docker Build Docker image"
	@echo " make docker-push Push Docker image to registry"
	@echo " make docker-run Run Docker container"
	@echo " make example-service Run an example service SLO calculation"
	@echo " make example-cpu Run an example CPU burst calculation"
	@echo " make example-json Run an example with JSON output"
	@echo ""
	@echo "Environment variables:"
	@echo " GO111MODULE Controls Go modules behavior (default: on)"
14 changes: 14 additions & 0 deletions OPEN_ISSUES.md
@@ -9,6 +9,8 @@ This document tracks potential improvements and issues for the SLO Computer proj
- [ ] Improve function and variable naming for clarity
- [ ] Add more comprehensive documentation to exported functions
- [ ] Refactor long functions into smaller, more focused ones
- [ ] Implement structured output formatters (JSON/YAML)
- [ ] Add configuration file support

## Error Handling

@@ -17,6 +19,7 @@ This document tracks potential improvements and issues for the SLO Computer proj
- [ ] Add validation for all user inputs
- [ ] Implement proper error wrapping with context
- [ ] Add recovery mechanisms for panics in initialization code
- [ ] Ensure machine-readable error formats for automation

## Testing

@@ -25,6 +28,8 @@ This document tracks potential improvements and issues for the SLO Computer proj
- [ ] Create test fixtures for common scenarios
- [ ] Add benchmarks for performance-critical code
- [ ] Implement test coverage reporting
- [ ] Add tests for configuration file parsing
- [ ] Add tests for output formatters

## User Experience

@@ -55,6 +60,10 @@ This document tracks potential improvements and issues for the SLO Computer proj
- [ ] Add export functionality for alerting systems (Prometheus, Datadog, etc.)
- [ ] Support for multi-window, multi-burn-rate alerting policies
- [ ] Add historical data analysis for SLO recommendation
- [ ] Create a Dockerfile for containerization
- [ ] Develop GitHub Actions integration
- [ ] Implement Prometheus integration for metrics analysis
- [ ] Create a lightweight API server mode

## Documentation

@@ -65,6 +74,9 @@ This document tracks potential improvements and issues for the SLO Computer proj
- [ ] Create contributor guidelines
- [ ] Develop a visual guide explaining multi-window, multi-burn-rate alerting
- [ ] Add troubleshooting section for common alert implementation issues
- [ ] Document configuration file format
- [ ] Add examples for CI/CD integration
- [ ] Create API documentation

## Dependencies and Build

@@ -73,6 +85,8 @@ This document tracks potential improvements and issues for the SLO Computer proj
- [ ] Modernize GitHub Actions workflow
- [ ] Add Dependabot for automated dependency updates
- [ ] Implement module versioning strategy
- [ ] Add Docker build process
- [ ] Create release automation

## Configuration

131 changes: 125 additions & 6 deletions README.md
@@ -23,8 +23,24 @@ This toolkit helps SREs and DevOps engineers:
### Prerequisites
- Go 1.16 or later

### Building from Source
### Installation Options

#### Using Go
```bash
# Install directly using Go
go install github.com/last9/slo-computer@latest
```

#### Using Docker
```bash
# Pull the Docker image
docker pull last9/slo-computer:latest

# Run using Docker
docker run last9/slo-computer:latest --help
```

#### Building from Source
```bash
# Clone the repository
git clone https://github.com/last9/slo-computer.git
@@ -63,8 +79,10 @@ usage: slo [<flags>] <command> [<args> ...]
Last9 SLO toolkit

Flags:
--help Show context-sensitive help (also try --help-long and --help-man).
--version Show application version.
--help Show context-sensitive help (also try --help-long and --help-man).
--version Show application version.
--config=CONFIG Path to configuration file
--output=FORMAT Output format (text, json, yaml)

Commands:
help [<command>...]
@@ -88,10 +106,89 @@ Commands:
- `--instance`: AWS instance type (e.g., t3.micro, t3a.xlarge)
- `--utilization`: Average CPU utilization percentage (0-100)

The goal of these commands is to factor in some "bare minimum" input to:
### Using Configuration Files

You can define your services and configurations in YAML or JSON files:

```yaml
# slo-config.yaml
services:
  api-gateway:
    throughput: 4200
    slo: 99.9
    duration: 720

  background-processor:
    throughput: 100
    slo: 99.5
    duration: 168

cpus:
  web-server:
    instance: t3a.xlarge
    utilization: 15
```

Then use it with:

```bash
# For a specific service
./slo-computer suggest --config=slo-config.yaml --service=api-gateway

# For a specific CPU
./slo-computer cpu-suggest --config=slo-config.yaml --service=web-server
```
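
As a rough illustration of how such a file might be read, the sketch below decodes a JSON equivalent of the config above using only the Go standard library. The `Config`, `Service`, and `CPU` types and the `loadConfig` function are hypothetical names chosen for this example and are not taken from the SLO Computer source; a YAML file would need a third-party YAML decoder instead.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Service holds the per-service inputs that `suggest` otherwise takes as flags.
// Hypothetical type: field names simply mirror the keys in the sample config.
type Service struct {
	Throughput float64 `json:"throughput"`
	SLO        float64 `json:"slo"`
	Duration   int     `json:"duration"`
}

// CPU holds the inputs that `cpu-suggest` otherwise takes as flags.
type CPU struct {
	Instance    string  `json:"instance"`
	Utilization float64 `json:"utilization"`
}

// Config is the assumed top-level shape of the configuration file.
type Config struct {
	Services map[string]Service `json:"services"`
	CPUs     map[string]CPU     `json:"cpus"`
}

// sample is a JSON rendering of part of the YAML example above.
const sample = `{
  "services": {
    "api-gateway": {"throughput": 4200, "slo": 99.9, "duration": 720}
  },
  "cpus": {
    "web-server": {"instance": "t3a.xlarge", "utilization": 15}
  }
}`

// loadConfig decodes raw JSON and rejects keys the structs do not know about,
// so a typo in the file fails loudly instead of being silently ignored.
func loadConfig(raw string) (Config, error) {
	var cfg Config
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&cfg); err != nil {
		return Config{}, fmt.Errorf("parse config: %w", err)
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(sample)
	if err != nil {
		fmt.Println(err)
		return
	}
	svc, ok := cfg.Services["api-gateway"]
	if !ok {
		fmt.Println("service not found in config")
		return
	}
	fmt.Printf("throughput=%v slo=%v duration=%dh\n", svc.Throughput, svc.SLO, svc.Duration)
}
```

Looking a name up in a map keeps the "unknown service" case explicit, which is where a `--service` flag would most plausibly report an error.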

### Output Formats

- Determine if this is a low traffic service where an SLO approach makes little sense
- Compute the _actual_ alert values and conditions to set alerts on
SLO Computer supports multiple output formats:

```bash
# Default text output
./slo-computer suggest --throughput=4200 --slo=99.9 --duration=720

# JSON output
./slo-computer suggest --throughput=4200 --slo=99.9 --duration=720 --output=json

# YAML output
./slo-computer suggest --throughput=4200 --slo=99.9 --duration=720 --output=yaml
```
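
For automation, the JSON form is the easiest to consume. The sketch below assumes the array shape shown in the JSON example under Examples in this README; the `Alert` struct and `parseAlerts` function are hypothetical names for illustration, not part of the tool.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Alert mirrors one element of the JSON array in the README example.
// Durations are kept as strings because the example encodes them as "1h0m0s".
type Alert struct {
	Type           string  `json:"type"`
	ErrorRate      float64 `json:"error_rate"`
	LongWindow     string  `json:"long_window"`
	ShortWindow    string  `json:"short_window"`
	BudgetConsumed float64 `json:"budget_consumed"`
	TimeRemaining  string  `json:"time_remaining"`
}

// sampleOutput is copied from the README's JSON example.
const sampleOutput = `[
  {"type": "slow_burn", "error_rate": 0.002, "long_window": "24h0m0s",
   "short_window": "2h0m0s", "budget_consumed": 0.0667, "time_remaining": "360h0m0s"},
  {"type": "fast_burn", "error_rate": 0.01, "long_window": "1h0m0s",
   "short_window": "5m0s", "budget_consumed": 0.0139, "time_remaining": "72h0m0s"}
]`

// parseAlerts decodes the output and checks that every duration field parses,
// so a malformed result fails the pipeline step early.
func parseAlerts(data []byte) ([]Alert, error) {
	var alerts []Alert
	if err := json.Unmarshal(data, &alerts); err != nil {
		return nil, fmt.Errorf("decode alerts: %w", err)
	}
	for _, a := range alerts {
		for _, d := range []string{a.LongWindow, a.ShortWindow, a.TimeRemaining} {
			if _, err := time.ParseDuration(d); err != nil {
				return nil, fmt.Errorf("alert %q: %w", a.Type, err)
			}
		}
	}
	return alerts, nil
}

func main() {
	alerts, err := parseAlerts([]byte(sampleOutput))
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, a := range alerts {
		fmt.Printf("%s: alert when error rate exceeds %v over %s and %s\n",
			a.Type, a.ErrorRate, a.LongWindow, a.ShortWindow)
	}
}
```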

## CI/CD Integration

### GitHub Actions

You can use SLO Computer in your GitHub Actions workflows:

```yaml
name: SLO Analysis

on:
  schedule:
    - cron: '0 0 * * 1'  # Weekly on Monday
  workflow_dispatch:  # Manual trigger

jobs:
  analyze-slos:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Run SLO Computer
        uses: last9/slo-computer-action@v1
        with:
          command: suggest
          config-file: .github/slo-config.yaml
          service-name: api-gateway
          output-format: json
```

### Docker Integration

```bash
# Mount your config file and run
docker run -v "$(pwd)/slo-config.yaml:/config.yaml" last9/slo-computer:latest suggest --config=/config.yaml --service=api-gateway
```

## Examples

@@ -115,6 +212,28 @@ This alert will trigger once 1.39% of error budget is consumed,
and leaves 72h0m0s before the SLO is defeated.
```

JSON Output:
```json
[
  {
    "type": "slow_burn",
    "error_rate": 0.002,
    "long_window": "24h0m0s",
    "short_window": "2h0m0s",
    "budget_consumed": 0.0667,
    "time_remaining": "360h0m0s"
  },
  {
    "type": "fast_burn",
    "error_rate": 0.01,
    "long_window": "1h0m0s",
    "short_window": "5m0s",
    "budget_consumed": 0.0139,
    "time_remaining": "72h0m0s"
  }
]
```
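
The figures in this output are consistent with standard burn-rate arithmetic for a 99.9% SLO over 720 hours: the burn rate is the error rate divided by the error budget, budget consumed is the burn rate times the long window divided by the SLO duration, and time remaining is the SLO duration divided by the burn rate. The sketch below works through that arithmetic; it is an assumption about how the numbers relate, not the tool's actual implementation, and `deriveAlert` and `Derived` are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// Derived holds the values computed from an alert's error rate (hypothetical type).
type Derived struct {
	BurnRate       float64       // multiple of the sustainable error rate
	BudgetConsumed float64       // fraction of the budget used over the long window
	TimeRemaining  time.Duration // time until the budget is exhausted at this rate
}

// deriveAlert applies the burn-rate relations described in the lead-in.
// sloPercent is e.g. 99.9; window and duration are given in hours.
func deriveAlert(sloPercent, errorRate, longWindowHours, sloDurationHours float64) Derived {
	budget := 1 - sloPercent/100 // allowed error fraction, 0.001 for 99.9%
	burnRate := errorRate / budget
	remaining := time.Duration(sloDurationHours / burnRate * float64(time.Hour))
	return Derived{
		BurnRate:       burnRate,
		BudgetConsumed: burnRate * longWindowHours / sloDurationHours,
		// Round away sub-second floating-point noise before formatting.
		TimeRemaining: remaining.Round(time.Second),
	}
}

func main() {
	fast := deriveAlert(99.9, 0.01, 1, 720)
	slow := deriveAlert(99.9, 0.002, 24, 720)
	fmt.Printf("fast_burn: consumed=%.4f remaining=%s\n", fast.BudgetConsumed, fast.TimeRemaining)
	fmt.Printf("slow_burn: consumed=%.4f remaining=%s\n", slow.BudgetConsumed, slow.TimeRemaining)
}
```

With the inputs from the JSON above, the fast-burn case works out to roughly 1.39% of the budget consumed with 72 hours remaining, and the slow-burn case to roughly 6.67% with 360 hours remaining.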

**Q: What about a low-traffic service?**

```bash
51 changes: 51 additions & 0 deletions action.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
name: 'SLO Computer'
description: 'Calculate SLO-based alert thresholds for services and AWS burstable instances'
author: 'Last9'
inputs:
  command:
    description: 'Command to run (suggest or cpu-suggest)'
    required: true
  config-file:
    description: 'Path to configuration file'
    required: false
  service-name:
    description: 'Service name to use from config file'
    required: false
  throughput:
    description: 'Service throughput (requests per minute)'
    required: false
  slo:
    description: 'Desired SLO percentage'
    required: false
  duration:
    description: 'SLO duration in hours'
    required: false
  instance:
    description: 'AWS instance type'
    required: false
  utilization:
    description: 'CPU utilization percentage'
    required: false
  output-format:
    description: 'Output format (text, json, yaml)'
    required: false
    default: 'json'
outputs:
  result:
    description: 'SLO calculation result'
runs:
  using: 'docker'
  image: 'Dockerfile'
  args:
    - ${{ inputs.command }}
    - ${{ inputs.config-file != '' && format('--config={0}', inputs.config-file) || '' }}
    - ${{ inputs.service-name != '' && format('--service={0}', inputs.service-name) || '' }}
    - ${{ inputs.throughput != '' && format('--throughput={0}', inputs.throughput) || '' }}
    - ${{ inputs.slo != '' && format('--slo={0}', inputs.slo) || '' }}
    - ${{ inputs.duration != '' && format('--duration={0}', inputs.duration) || '' }}
    - ${{ inputs.instance != '' && format('--instance={0}', inputs.instance) || '' }}
    - ${{ inputs.utilization != '' && format('--utilization={0}', inputs.utilization) || '' }}
    - ${{ format('--output={0}', inputs.output-format) }}
branding:
  icon: 'alert-circle'
  color: 'green'
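
Each optional entry in `args` evaluates to an empty string when its input is unset. Assuming those empty strings reach the container as separate arguments (worth verifying against the runner's behaviour), the CLI may need to discard them before flag parsing. A minimal sketch of that filtering step follows; `dropEmpty` is a hypothetical helper, not a function from the SLO Computer source.

```go
package main

import (
	"fmt"
	"os"
)

// dropEmpty returns args with empty strings removed, preserving order.
func dropEmpty(args []string) []string {
	kept := make([]string, 0, len(args))
	for _, a := range args {
		if a != "" {
			kept = append(kept, a)
		}
	}
	return kept
}

func main() {
	// os.Args[0] is the program name; only the remaining arguments are filtered.
	args := dropEmpty(os.Args[1:])
	// A real entry point would hand args to the flag parser here.
	fmt.Println(args)
}
```

Filtering inside the binary keeps the image free of a wrapper script; an entrypoint shell script that rebuilds the argument list would be an alternative.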