Re-enables auto benchmarking / acceptance testing #78
Merged: merschformann merged 31 commits into develop from merschformann/reenable-auto-benchmarks on Dec 19, 2024
Commits
32c1e29 Adding basic acceptance testing script (merschformann)
6312791 Preparing test entrypoint for benchmarking (app.yaml, input embedded … (merschformann)
937be91 Completing auto benchmark (merschformann)
5c6ebfa Fixing version tag (merschformann)
ca1ce8e Fixing version test (merschformann)
df58314 Cleaning code (merschformann)
1adc5f9 Fixing version test (merschformann)
f2fab6a Reorganizing prints (merschformann)
dca29b0 Adding required name parameter (merschformann)
46f0463 Adding missing param (merschformann)
6e366f1 Removing obsolete print (merschformann)
8937c46 Merge remote-tracking branch 'origin/develop' into merschformann/reen… (merschformann)
183e41a Adding auto benchmark workflow (merschformann)
47091dd Adding branch name for pacing slack notifications (merschformann)
738a6c2 Pass secrets (merschformann)
71d1a2b Using output summary, improved slack message (merschformann)
c68feb2 More specific slack message (merschformann)
dd6b523 Bump nextmv version (merschformann)
a5fe09b Fixing api key (merschformann)
5bc5cad Forward account id (merschformann)
92bc042 Moving account id to env vars (merschformann)
bd8cf40 Renaming job (merschformann)
037545b Fixing use of variable (merschformann)
3c4a84a Use official nextmv-py release (merschformann)
6990c5a Upgrade baseline when running on main branch (merschformann)
87984e5 Move env var to step that needs it (merschformann)
2256778 Moving some env vars to GH env section (merschformann)
d54638f Using shifted geometric mean instead (merschformann)
38a70a5 Fixing id and tag handling (merschformann)
b673876 Pre-output url (merschformann)
632b196 Flushing url (merschformann)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
name: auto benchmark | ||
on: [push] | ||
|
||
env: | ||
GO_VERSION: "1.23" | ||
PYTHON_VERSION: "3.12" | ||
|
||
jobs: | ||
auto-benchmark: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- name: git clone | ||
uses: actions/checkout@v4 | ||
|
||
- name: set up Go | ||
uses: actions/setup-go@v5 | ||
with: | ||
go-version: ${{ env.GO_VERSION }} | ||
|
||
- name: set up Python ${{ env.PYTHON_VERSION }} | ||
uses: actions/setup-python@v5 | ||
with: | ||
python-version: ${{ env.PYTHON_VERSION }} | ||
|
||
- name: install dependencies | ||
run: | | ||
python -m pip install --upgrade pip | ||
pip install -r .nextmv/benchmark.requirements.txt | ||
|
||
- name: run acceptance test | ||
env: | ||
BENCHMARK_ACCOUNT_ID: ${{ vars.BENCHMARK_ACCOUNT_ID }} | ||
BENCHMARK_API_KEY_PROD: ${{ secrets.BENCHMARK_API_KEY_PROD }} | ||
SLACK_URL_DEV_SCIENCE: ${{ secrets.SLACK_URL_DEV_SCIENCE }} | ||
run: | | ||
export BRANCH_NAME="${GITHUB_REF_NAME}" | ||
python .nextmv/benchmark.py |
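The benchmark script only posts to Slack and upgrades the baseline when BRANCH_NAME equals "develop", so the branch name has to be derived from the pushed ref without losing anything. A full ref such as refs/heads/merschformann/reenable-auto-benchmarks carries slashes inside the branch name, so taking a single `/`-separated field keeps only its first segment. A minimal Python sketch of a slash-safe derivation follows; `branch_from_ref` is a hypothetical helper that is not part of this PR, and it assumes the ref follows the refs/heads/<branch> layout.

```python
from typing import Optional


def branch_from_ref(ref: str) -> Optional[str]:
    """Return the branch name for a refs/heads/... ref, or None for other refs."""
    prefix = "refs/heads/"
    if not ref.startswith(prefix):
        # Tags (refs/tags/...) and pull request refs are not branches.
        return None
    # Keep everything after the prefix, including any further slashes.
    return ref[len(prefix):]


if __name__ == "__main__":
    examples = (
        "refs/heads/develop",
        "refs/heads/merschformann/reenable-auto-benchmarks",
        "refs/tags/v1.0.0",
    )
    for ref in examples:
        print(ref, "->", branch_from_ref(ref))
```

Returning None for non-branch refs means a tag push can never be mistaken for a push to develop.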
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,219 @@ | ||
# Description: | ||
# This script does the following: | ||
# - Makes sure the working directory is clean. | ||
# - Pushes a new version of the app (if it does not already exist; uses git sha as version). | ||
# - Updates the candidate instance to use the new version. | ||
# - Runs an acceptance test between the candidate and baseline instances. | ||
# - Waits for the test to complete. | ||
# - Posts the result to Slack (if requested). | ||
|
||
import os | ||
import subprocess | ||
from datetime import datetime, timezone | ||
|
||
import requests | ||
from nextmv import cloud | ||
|
||
APP_ID = "nextroute-bench" | ||
API_KEY = os.environ["BENCHMARK_API_KEY_PROD"] | ||
SLACK_WEBHOOK = os.getenv("SLACK_URL_DEV_SCIENCE", None) | ||
ACCOUNT_ID = os.getenv("BENCHMARK_ACCOUNT_ID", None) | ||
BRANCH_NAME = os.getenv("BRANCH_NAME", None) | ||
|
||
|
||
METRICS = [ | ||
cloud.Metric( | ||
field="result.value", | ||
metric_type=cloud.MetricType.direct_comparison, | ||
params=cloud.MetricParams( | ||
tolerance=cloud.MetricTolerance( | ||
value=0.05, | ||
type=cloud.ToleranceType.relative, | ||
), | ||
operator=cloud.Comparison.less_than_or_equal_to, | ||
), | ||
statistic=cloud.StatisticType.shifted_geometric_mean, | ||
) | ||
] | ||
|
||
|
||
def ensure_clean_working_directory(): | ||
""" | ||
Ensure the working directory is clean by throwing an exception if it is not. | ||
""" | ||
if os.system("git diff --quiet") != 0 or os.system("git diff --cached --quiet") != 0: | ||
raise Exception("Working directory is not clean") | ||
|
||
|
||
def get_id(app: cloud.Application) -> tuple[str, str]: | ||
""" | ||
Get the ID for the new version (and just the tag). | ||
If the version already exists, we append a timestamp to the ID. | ||
""" | ||
# Create ID based on git sha. | ||
tag = subprocess.check_output(["git", "rev-parse", "HEAD"]).decode().strip()[0:8] | ||
version_id = f"auto-{tag}" | ||
# If the version already exists, we append a timestamp to the ID. | ||
exists = False | ||
try: | ||
app.version(version_id) | ||
exists = True | ||
except Exception: | ||
pass | ||
if exists: | ||
ts = ( | ||
datetime.now(timezone.utc) | ||
.replace(microsecond=0) | ||
.isoformat() | ||
.replace("+00:00", "Z") | ||
.replace(":", "") | ||
.replace("-", "") | ||
) | ||
version_id = f"{version_id}-{ts}" | ||
tag = f"{tag}-{ts}" | ||
# Otherwise, we just use the git sha. | ||
return version_id, tag | ||
|
||
|
||
def push_new_version(app: cloud.Application, tag: str) -> None: | ||
""" | ||
Push a new version of the app and update the candidate instance to use it. | ||
""" | ||
app.push(app_dir=".") | ||
app.new_version( | ||
id=tag, | ||
name=f"Auto version {tag}", | ||
description=f"Automatically generated version {tag}", | ||
) | ||
instance = app.instance("candidate") | ||
app.update_instance( | ||
id="candidate", | ||
version_id=tag, | ||
name=instance.name, # Name is required, but we don't want to change it | ||
) | ||
|
||
|
||
def upgrade_baseline(app: cloud.Application, version_id: str) -> None: | ||
""" | ||
Upgrade the baseline instance to use the new version. | ||
""" | ||
instance = app.instance("baseline") | ||
app.update_instance( | ||
id="baseline", | ||
version_id=version_id, | ||
name=instance.name, # Name is required, but we don't want to change it | ||
) | ||
|
||
|
||
def run_acceptance_test( | ||
app: cloud.Application, | ||
id: str, | ||
tag: str, | ||
) -> cloud.AcceptanceTest: | ||
""" | ||
Run an acceptance test between the candidate and baseline instances. | ||
""" | ||
result = app.new_acceptance_test_with_result( | ||
candidate_instance_id="candidate", | ||
baseline_instance_id="baseline", | ||
id=id, | ||
metrics=METRICS, | ||
name=f"Auto-test {tag}", | ||
description=f"Automated test for {tag}", | ||
input_set_id="nextroute-bench-v20", | ||
polling_options=cloud.PollingOptions( | ||
max_duration=600, # 10 minutes | ||
max_tries=1000, # basically forever - we'll stop by duration | ||
), | ||
) | ||
return result | ||
|
||
|
||
def create_test_url(result_id: str) -> str: | ||
""" | ||
Create a URL to the acceptance test result. | ||
""" | ||
if ACCOUNT_ID: | ||
return f"https://cloud.nextmv.io/acc/{ACCOUNT_ID}/app/nextroute-bench/experiment/acceptance/{result_id}" | ||
return "unavailable" | ||
|
||
|
||
def write_to_summary(content): | ||
"""Appends content to the GitHub Actions step summary (if available).""" | ||
summary_file = os.getenv("GITHUB_STEP_SUMMARY") | ||
if not summary_file: | ||
return | ||
|
||
# Write content to the summary file | ||
with open(summary_file, "a") as f: | ||
f.write(content + "\n") | ||
|
||
|
||
def main(): | ||
""" | ||
Main function that runs the benchmark. | ||
""" | ||
# Change to the directory of the app (cmd, a sibling of this script's directory) | ||
os.chdir(os.path.join(os.path.dirname(__file__), "..", "cmd")) | ||
|
||
print("Making sure the working directory is clean...") | ||
ensure_clean_working_directory() | ||
|
||
client = cloud.Client(api_key=API_KEY) | ||
app = cloud.Application(client=client, id=APP_ID) | ||
|
||
id, tag = get_id(app) # id is used as version and acceptance test ID | ||
|
||
print(f"Pushing new version with ID: {id}") | ||
push_new_version(app, id) | ||
|
||
write_to_summary("# Acceptance Test Report") | ||
write_to_summary("") | ||
write_to_summary(f"ID: {id}") | ||
url = create_test_url(id) | ||
write_to_summary(f"Link: [link]({url})") | ||
print(f"::notice::Acceptance test URL: {url}", flush=True) | ||
|
||
print(f"Running acceptance test with ID: {id}") | ||
print("Waiting for it to complete...") | ||
result = run_acceptance_test(app, id, tag) | ||
passed = "unknown" | ||
if result and result.results: | ||
passed = "passed" if result.results.passed else "failed" | ||
print(f"Acceptance test completed with status: {passed}") | ||
|
||
if SLACK_WEBHOOK and BRANCH_NAME == "develop": | ||
print("Posting to Slack...") | ||
response = requests.post( | ||
SLACK_WEBHOOK, | ||
json={ | ||
"text": f"nextroute acceptance test {result and result.id} completed with status: {passed}" | ||
+ f" (<{create_test_url(result and result.id)}|View results>)", | ||
}, | ||
timeout=10, # assumed value; keeps the job from hanging if Slack is unreachable | ||
) | ||
|
||
if response.status_code != 200: | ||
print(f"Failed to send notification to Slack: {response.text}") | ||
else: | ||
print("Notification sent to Slack") | ||
|
||
write_to_summary("") | ||
write_to_summary(f"Result: {passed}") | ||
if result and result.results: | ||
if result.results.error: | ||
write_to_summary(f"Error: {result.results.error}") | ||
else: | ||
write_to_summary("Metrics:") | ||
write_to_summary("") | ||
for metric in result.results.metric_results: | ||
write_to_summary(f"- {metric.metric.field}: {metric.passed}") | ||
|
||
if BRANCH_NAME == "develop": | ||
print("Upgrading baseline instance to use the new version...") | ||
upgrade_baseline(app, id) | ||
|
||
print("Done") | ||
|
||
|
||
if __name__ == "__main__": | ||
main() |
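Two details of the script are easy to misread in diff form: the timestamp suffix that get_id appends when a version already exists, and the shifted_geometric_mean statistic selected in METRICS. The sketch below reproduces the suffix logic for a fixed time and shows the textbook shifted geometric mean, exp(mean(log(x + s))) - s. The helper names, the shift of 10, and the reading of the 5% relative tolerance as "candidate may exceed baseline by at most 5%" are assumptions for illustration; the PR does not state how Nextmv Cloud computes either.

```python
import math
from datetime import datetime, timezone


def timestamp_suffix(now: datetime) -> str:
    """Apply the same chain of replacements as get_id to a given UTC time."""
    return (
        now.replace(microsecond=0)
        .isoformat()
        .replace("+00:00", "Z")
        .replace(":", "")
        .replace("-", "")
    )


def shifted_geometric_mean(values: list[float], shift: float = 10.0) -> float:
    """Textbook definition: exp(mean(log(v + shift))) - shift."""
    if not values:
        raise ValueError("values must not be empty")
    if any(v + shift <= 0 for v in values):
        raise ValueError("every value plus shift must be positive")
    log_sum = sum(math.log(v + shift) for v in values)
    return math.exp(log_sum / len(values)) - shift


def within_relative_tolerance(candidate: float, baseline: float, tolerance: float = 0.05) -> bool:
    """Assumed reading: candidate <= baseline plus 5% of |baseline|."""
    return candidate <= baseline + tolerance * abs(baseline)


if __name__ == "__main__":
    fixed = datetime(2024, 12, 19, 10, 30, 45, 123456, tzinfo=timezone.utc)
    print(timestamp_suffix(fixed))  # 20241219T103045Z

    # Hypothetical per-input values of result.value for the two instances.
    baseline_values = [100.0, 250.0, 40.0]
    candidate_values = [102.0, 245.0, 41.0]
    baseline = shifted_geometric_mean(baseline_values)
    candidate = shifted_geometric_mean(candidate_values)
    print(within_relative_tolerance(candidate, baseline))
```

With a shift of 0 this reduces to the plain geometric mean; a positive shift damps the influence of values close to zero.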
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
nextmv>=0.14.2 | ||
requests>=2.32.3 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
# This manifest holds the information the app needs to run on the Nextmv Cloud. | ||
type: go | ||
runtime: ghcr.io/nextmv-io/runtime/default:latest | ||
build: | ||
command: go build -o main . | ||
environment: | ||
GOOS: linux | ||
GOARCH: arm64 | ||
files: | ||
- main |
Could have used nextmv.Options for this 👻
Haha, fair. I suppose I should get used to it more. 😊