Commit b422ee3

docs: add page and script for tenant migration (#62)
1 parent 1aee207 commit b422ee3

4 files changed: +171 -7 lines changed

Lines changed: 9 additions & 6 deletions
@@ -1,19 +1,22 @@
 # Forge Deployment Scenarios
 
-- [**First Tenant**](./forge_tenant.md)
+* [**First Tenant**](./forge_tenant.md)
 *Minimal multi-tenant Forge deployment example. Shows how to provision Forge runners and onboard your first tenant.*
 
-- [**New Tenant Guide**](./new_tenant.md)
+* [**New Tenant Guide**](./new_tenant.md)
 *Step-by-step checklist for adding a new tenant to an existing Forge deployment, including config files, secrets, and GitHub App setup.*
 
-- [**Forge EKS**](./forge_eks.md)
+* [**Tenant Migration Guide**](./tenant_migration.md)
+*Detailed instructions and automation for migrating a tenant safely between EKS clusters using ForgeMT and ARC.*
+
+* [**Forge EKS**](./forge_eks.md)
 *Deploys an EKS cluster with Calico and Karpenter, suitable for running Forge ARC runners and other workloads.*
 
-- [**Forge Integrations**](./forge_integrations.md)
+* [**Forge Integrations**](./forge_integrations.md)
 *Example for deploying Forge with Splunk, Observability, and other integrations. Includes required modules and configuration tips.*
 
-- [**Splunk Deployment**](./splunk_deployment.md)
+* [**Splunk Deployment**](./splunk_deployment.md)
 *Complete example for deploying Splunk integrations, including secrets, data manager, and observability modules.*
 
-- [**Extras Deployments**](./forge_extras.md)
+* [**Extras Deployments**](./forge_extras.md)
 *Deploys supporting infrastructure such as Cloud Custodian, CloudFormation permissions, ECR repositories, Forge subscription, and S3 storage.*
Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
+# Migrating a Tenant Between EKS Clusters with ForgeMT + ARC + Terragrunt
+
+## Tenant Migration Steps
+
+1. **Identify Current and Target Clusters**
+
+   * Read the current cluster name from the tenant config file (for example, `arc_cluster_name: srea-forge-euw1-prod-green`).
+   * Determine the target cluster by switching the suffix from `-green` to `-blue`, or vice versa.
+
+2. **Scale Down Runner Sets in the Source Cluster**
+
+   * List all runner sets defined under `arc_runner_specs`.
+   * For each runner set, set both the minimum and maximum runner counts to zero on the source cluster so that all active runner pods stop.
+
+3. **Disable ARC on the Source Cluster**
+
+   * Update the tenant’s config by setting a migration flag (e.g., `migrate_arc_cluster: true`), which disables ARC resources on the source cluster.
+   * Apply this config change so the Terraform/Terragrunt deployment removes ARC for this tenant in the source cluster.
+
+4. **Enable ARC on the Target Cluster**
+
+   * Change the migration flag back to false (`migrate_arc_cluster: false`).
+   * Update the `arc_cluster_name` to the target cluster (e.g., switching from `srea-forge-euw1-prod-green` to `srea-forge-euw1-prod-blue`).
+   * Deploy ARC resources in the target cluster with these config changes.
+
+5. **Wait for Runner Pods to Stabilize**
+
+   * Verify that runner pods have fully terminated on the source cluster.
+   * Confirm runner pods are healthy and running on the target cluster.
+
+---
+
+## Automation Script for Tenant Migration
+
+To simplify and standardize the migration, an automation script performs the steps described above:
+
+* **Detects the current cluster** from the tenant configuration.
+* **Determines the target cluster** by toggling the blue/green suffix.
+* **Scales down runner sets** in the source cluster gracefully.
+* **Updates the migration flag and cluster name** in the tenant config.
+* **Applies Terraform/Terragrunt changes** to disable ARC on the old cluster and enable it on the new one.
+* **Waits for runner pods to terminate** before switching, ensuring a clean handoff.
+* **Logs each step** for easy monitoring.
+
+### Usage Example
+
+Run the script with the tenant’s Terragrunt directory (its base name is used as the tenant namespace) and the Kubernetes context alias of the source cluster:
+
+```
+./scripts/migrate-tenant.sh --tf-dir /full/path/to/tenant_dir --context <kube-context-alias>
+```
+
+The script handles the remaining steps, which reduces manual error and speeds up the migration.
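
Step 5 of the guide asks for confirmation on the target cluster, while `migrate-tenant.sh` as committed takes a single `--context` and only polls the source cluster. A minimal sketch of a post-migration check follows, assuming a hypothetical helper named `check_runner_sets` and a hypothetical context alias for the target cluster. It only verifies that each `AutoscalingRunnerSet` exists on the target, which is a weaker signal than healthy runner pods.

```shell
# Hypothetical helper (not part of this commit): check that each named
# runner set exists in the tenant namespace on the target cluster.
# Usage: check_runner_sets <target_context> <namespace> <runner_set>...
# Assumes kubectl is on PATH and the context alias is valid.
check_runner_sets() {
    local ctx="$1"
    local ns="$2"
    local missing=0
    local name
    shift 2
    for name in "$@"; do
        # Same resource type the migration script patches on the source cluster.
        if kubectl --context "$ctx" get autoscalingrunnersets.actions.github.com \
            -n "$ns" "$name" >/dev/null 2>&1; then
            echo "ok: $name"
        else
            echo "missing: $name"
            missing=1
        fi
    done
    # Non-zero exit status if at least one runner set was not found.
    return "$missing"
}
```

The runner set names could come from the same `yq` query the script uses, for example `check_runner_sets "$TO_CTX" "$TENANT" $(yq -r '.arc_runner_specs | keys | .[]' "$CONFIG_FILE")`, where `TO_CTX` is a placeholder for the target cluster's context alias.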

scripts/migrate-tenant.sh

Lines changed: 108 additions & 0 deletions
@@ -0,0 +1,108 @@
+#!/bin/bash
+set -euo pipefail
+
+usage() {
+    echo "Usage: $0 --tf-dir <terragrunt directory> --context <k8s context alias>"
+    exit 1
+}
+
+parse_args() {
+    while [[ $# -gt 0 ]]; do
+        case "$1" in
+        --tf-dir)
+            TF_DIR="${2:-}"
+            shift 2 || usage
+            ;;
+        --context)
+            FROM_CTX="${2:-}"
+            shift 2 || usage
+            ;;
+        *)
+            echo "Unknown arg: $1"
+            usage
+            ;;
+        esac
+    done
+
+    [[ -z "${TF_DIR:-}" || -z "${FROM_CTX:-}" ]] && usage
+
+    TENANT=$(basename "$TF_DIR")
+    [[ -z "${TENANT:-}" ]] && {
+        echo "Error: Could not determine tenant from directory '$TF_DIR'"
+        exit 1
+    }
+    CONFIG_FILE="${TF_DIR}/config.yml"
+}
+
+detect_clusters() {
+    CURRENT_CLUSTER=$(yq e '.arc_cluster_name' "$CONFIG_FILE")
+
+    if [[ "$CURRENT_CLUSTER" == *"-green" ]]; then
+        FROM="$CURRENT_CLUSTER"
+        TO="${CURRENT_CLUSTER%-green}-blue"
+    elif [[ "$CURRENT_CLUSTER" == *"-blue" ]]; then
+        FROM="$CURRENT_CLUSTER"
+        TO="${CURRENT_CLUSTER%-blue}-green"
+    else
+        echo "Cannot detect blue/green suffix in arc_cluster_name: $CURRENT_CLUSTER"
+        exit 1
+    fi
+}
+
+scale_down_runners() {
+    echo "🧯 Scaling down runners in namespace: $TENANT"
+    for key in $(yq -r '.arc_runner_specs | keys | .[]' "$CONFIG_FILE"); do
+        echo "Checking runner set: $key"
+        if kubectl --context "$FROM_CTX" get autoscalingrunnersets.actions.github.com -n "$TENANT" "$key" &>/dev/null; then
+            echo "Scaling down runner set: $key"
+            kubectl --context "$FROM_CTX" patch autoscalingrunnersets.actions.github.com -n "$TENANT" "$key" --type merge -p '{"spec":{"minRunners":0,"maxRunners":0}}'
+        else
+            echo "Runner set $key not found, skipping."
+        fi
+    done
+
+    echo "⏳ Waiting for runner pods to terminate..."
+    while kubectl --context "$FROM_CTX" get pods -n "$TENANT" | grep -q runner; do
+        sleep 5
+    done
+}
+
+terragrunt_apply() {
+    local target="$1"
+    echo "🔧 Applying Terragrunt target: $target"
+    terragrunt apply --target "$target" --working-dir "$TF_DIR" --non-interactive -auto-approve
+}
+
+update_config() {
+    local migrate_flag="$1"
+    local cluster_name="$2"
+
+    yq e -i ".migrate_arc_cluster = $migrate_flag" "$CONFIG_FILE"
+    yq e -i ".arc_cluster_name = \"$cluster_name\"" "$CONFIG_FILE"
+}
+
+main() {
+    parse_args "$@"
+    detect_clusters
+
+    echo "🔄 Migrating tenant '$TENANT' from '$FROM' to '$TO'..."
+    echo "📄 Editing config: $CONFIG_FILE"
+    echo "🔍 Using Kubernetes context: $FROM_CTX"
+
+    # Step 1: scale runner sets to zero on the source cluster and wait for pods to terminate
+    scale_down_runners
+
+    # Step 2: set the migration flag while still pointing at the source cluster, then apply
+    echo "🛑 Disabling ARC for tenant on old cluster '$FROM'"
+    update_config true "$FROM"
+    terragrunt_apply 'module.arc_runners'
+
+    # Step 3: clear the migration flag, point the config at the target cluster, then apply
+    echo "🚀 Enabling ARC for tenant on new cluster '$TO'"
+    update_config false "$TO"
+    terragrunt_apply 'module.arc_runners'
+
+    echo "✅ Migration complete. Tenant '$TENANT' is now on '$TO'"
+}
+
+main "$@"
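
The wait loop in `scale_down_runners` polls every 5 seconds with no upper bound, so a pod that never terminates would block the migration indefinitely. A minimal sketch of a bounded variant follows, assuming two hypothetical helpers, `wait_until` and `no_runner_pods`, neither of which is part of this commit.

```shell
# Hypothetical helper: run a command repeatedly until it succeeds,
# giving up once the timeout has elapsed.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command> [args...]
wait_until() {
    local timeout="$1"
    local interval="$2"
    local deadline
    shift 2
    deadline=$(( $(date +%s) + $timeout ))
    until "$@"; do
        if [ "$(date +%s)" -ge "$deadline" ]; then
            echo "Timed out after ${timeout}s waiting for: $*" >&2
            return 1
        fi
        sleep "$interval"
    done
}

# Hypothetical helper: succeeds when no pod name in the namespace
# contains "runner". A kubectl failure counts as "not done yet",
# so it is retried instead of being mistaken for an empty namespace.
# Assumes kubectl is on PATH and the context alias is valid.
no_runner_pods() {
    local ctx="$1"
    local ns="$2"
    local pods
    pods=$(kubectl --context "$ctx" get pods -n "$ns" -o name) || return 1
    case "$pods" in
        *runner*) return 1 ;;
        *) return 0 ;;
    esac
}
```

The loop in `scale_down_runners` could then be written as `wait_until 600 5 no_runner_pods "$FROM_CTX" "$TENANT"`, where the 600-second limit is an arbitrary placeholder. Capturing the `kubectl` output before matching also avoids piping into `grep -q` under `pipefail`, where an early `grep` exit can make the pipeline report failure.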

scripts/update-github-app-secrets.sh

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ validate_pem_file() {
 get_secret_name() {
     local terragrunt_dir="$1"
     local type="$2"
-    terragrunt output -json --terragrunt-working-dir "$terragrunt_dir" | jq -r --arg t "$type" '
+    terragrunt output -json --working-dir "$terragrunt_dir" | jq -r --arg t "$type" '
         .tenant.value as $tenant |
         "/cicd/common/\($tenant.name)/\($tenant.vpc_alias)/github_actions_runners_app_" + $t
     '
