This repository contains configuration files for the testing and automation needs of the Gardener project.
This is currently under construction / in evaluation phase.
Gardener uses a prow instance at prow.gardener.cloud to handle CI and automation for parts of the project.
Everyone can participate in a self-service PR-based workflow, where changes are automatically deployed after they have been reviewed and merged.
All job configs are located in config/jobs.
The results of prow jobs can be visualized in TestGrid in dashboards at testgrid.k8s.io/gardener. We don't run our own TestGrid installation, but include our dashboards into the TestGrid installation of Kubernetes.
We have configured dashboards for each of our repositories in which we run tests with prow. You can find them in config/testgrids/config.yaml.
Once the desired dashboard is defined, you can add your prow job to it by annotating the job as in the example below.
annotations:
  testgrid-dashboards: dashboard-name # [Required] A dashboard already defined in gardener-testgrid.yaml.
  testgrid-tab-name: some-short-name # [Optional] A shorter name for the tab. If omitted, just uses the job name.
  testgrid-alert-email: [email protected] # [Optional] An alert email that will be applied to the tab created in the first dashboard specified in testgrid-dashboards.
  description: Words about your job. # [Optional] A description of your job. If omitted, only the job name is used.
  testgrid-num-columns-recent: "10" # [Optional] The number of runs in a row that can be omitted before the run is considered stale. The default value is 10.
  testgrid-num-failures-to-alert: "3" # [Optional] The number of continuous failures before sending an email. The default value is 3.
  testgrid-days-of-results: "15" # [Optional] The number of days for which the results are visible. The default value is 15.
  testgrid-alert-stale-results-hours: "12" # [Optional] The number of hours that pass with no results after which the email is sent. The default value is 12.
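For orientation, the annotations sit directly on the job definition. The following is a minimal sketch, assuming a hypothetical periodic job named ci-example-unit with a placeholder interval, image and command, and the dashboard-name placeholder from the example above; none of these names are taken from this repository. It writes the job to a temporary file and checks that the required testgrid-dashboards annotation is set.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical job definition: name, interval, image and command are placeholders.
job_file="$(mktemp)"
cat > "$job_file" <<'EOF'
periodics:
- name: ci-example-unit
  interval: 4h
  decorate: true
  annotations:
    testgrid-dashboards: dashboard-name
    testgrid-tab-name: unit
    testgrid-num-failures-to-alert: "3"
  spec:
    containers:
    - image: golang:1.22
      command:
      - make
      args:
      - test
EOF

# Succeeds if the file sets a non-empty testgrid-dashboards annotation.
has_testgrid_dashboard() {
  grep -Eq '^[[:space:]]*testgrid-dashboards:[[:space:]]*[^[:space:]#]' "$1"
}

if has_testgrid_dashboard "$job_file"; then
  echo "testgrid-dashboards is set"
else
  echo "testgrid-dashboards is missing" >&2
fi
```

The grep-based check is only a quick local sanity test; it does not verify that the dashboard actually exists in the TestGrid configuration.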
For postsubmit and periodic prow jobs, a test-group is created automatically. If you don't want to add them to TestGrid, use the annotation below to disable creation of the test-group. For presubmit prow jobs, no test-group is created unless you annotate them as in the previous example.
annotations:
  testgrid-create-test-group: "false"
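The rules above can be summarized as a small decision function. This is a sketch of the behavior as described in this section, not the logic of the actual TestGrid configurator; the function name creates_test_group and its arguments are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Usage: creates_test_group <job-type> <has-dashboards> <create-annotation>
#   job-type:          presubmit | postsubmit | periodic
#   has-dashboards:    "true" if the job sets testgrid-dashboards
#   create-annotation: value of testgrid-create-test-group, or "" if unset
# Succeeds if a test-group would be created according to the rules above.
creates_test_group() {
  local job_type="$1" has_dashboards="$2" create_annotation="$3"
  case "$job_type" in
    postsubmit|periodic)
      # Created automatically unless explicitly disabled.
      [ "$create_annotation" != "false" ]
      ;;
    presubmit)
      # Created only if the job is annotated with a dashboard.
      [ "$has_dashboards" = "true" ]
      ;;
    *)
      echo "unknown job type: $job_type" >&2
      return 2
      ;;
  esac
}

# Jobs without any TestGrid annotations.
for job_type in presubmit postsubmit periodic; do
  if creates_test_group "$job_type" "false" ""; then
    echo "$job_type: test-group created"
  else
    echo "$job_type: no test-group"
  fi
done
```

For jobs without any annotations, the loop reports that only the postsubmit and periodic jobs get a test-group.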
You can test your TestGrid configuration locally with the ./hack/check-testgrid-config.sh script. Please open a PR against the ci-infra repository for your new configuration. Once it is merged, the new configuration is pushed to gs://gardener-prow/testgrid/config automatically, and your jobs will become visible at testgrid.k8s.io/gardener soon afterwards.
The scripts from this repository rely on a combined kubeconfig. It contains two contexts for the prow clusters, gardener-prow-trusted and gardener-prow-build, and one for the Gardener project the clusters are created in.
Please set up your local kubeconfig file by using the hack/setup-prow-kubeconfig.sh script. Afterwards, you can find it here:
export KUBECONFIG=~/.gardener-prow/kubeconfig/kubeconfig--gardener--prow-combined.yaml
The kubeconfig contains absolute paths, so it will stop working if you move it to a different location.
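If the kubeconfig stops working after files have been moved, a quick way to find the broken references is to list every absolute path in it that no longer exists. The following is a minimal sketch; it assumes the paths appear unquoted under the standard kubeconfig keys certificate-authority, client-certificate, client-key and tokenFile. Which keys the generated kubeconfig actually uses is not stated here, and the function name missing_kubeconfig_paths is hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print every absolute path referenced by the kubeconfig that does not exist.
# Assumption: paths are unquoted values of the keys matched below.
missing_kubeconfig_paths() {
  local kubeconfig="$1" path
  { grep -E '^[[:space:]]*(certificate-authority|client-certificate|client-key|tokenFile):[[:space:]]*/' "$kubeconfig" || true; } \
    | sed -E 's/^[^:]*:[[:space:]]*//; s/[[:space:]]+#.*$//; s/[[:space:]]+$//' \
    | while IFS= read -r path; do
        [ -e "$path" ] || printf '%s\n' "$path"
      done
}

# Default to the location shown above when KUBECONFIG is not set.
kubeconfig="${KUBECONFIG:-${HOME:-}/.gardener-prow/kubeconfig/kubeconfig--gardener--prow-combined.yaml}"
if [ -f "$kubeconfig" ]; then
  missing_kubeconfig_paths "$kubeconfig"
else
  echo "kubeconfig not found: $kubeconfig" >&2
fi
```

Empty output means all referenced files were found. If paths are reported, re-running hack/setup-prow-kubeconfig.sh is probably simpler than fixing them by hand.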
The following commands assume you are using the combined kubeconfig generated in the previous section. When you create new clusters, the configuration of the gardener-prow-trusted and gardener-prow-build contexts will be incomplete at first. They are completed in step 2, once the clusters have been created.
- Create the prow cluster and prow workload cluster.
  Please copy the cluster specs from the prow config GCS bucket to your /tmp folder and run these commands:
    kubectl config use-context garden-cluster
    kubectl apply -f /tmp/clusters/prow-trusted.yaml
    kubectl apply -f /tmp/clusters/prow-build.yaml
- Complete your combined kubeconfig with the data of the clusters created in the previous step.
- Create the prow namespace in the prow cluster:
    kubectl config use-context gardener-prow-trusted
    kubectl apply --server-side=true -f config/prow/cluster/prow_namespace.yaml
- Create the test-pods namespace in the workload/build cluster:
    kubectl config use-context gardener-prow-build
    kubectl apply --server-side=true -f config/prow/cluster/base/test-pods_namespace.yaml
- Create the required secrets (mainly in the prow cluster):
  - the secrets for GCP service accounts can be created by our credentials rotation script ./hack/rotate-secrets.sh. Please see the Rotate credentials section for more details.
  - github-app (according to test-infra guide)
  - github-token (Personal Access Token for @gardener-ci-robot with scopes public_repo, read:org, repo:status; needs to be present in the prow and test-pods namespace of the prow cluster)
  - github-oauth-config (according to test-infra guide)
  - hmac-token
      kubectl config use-context gardener-prow-trusted
      kubectl -n prow create secret generic hmac-token --from-literal=hmac=$(openssl rand -hex 20)
  - oauth-cookie-secret
      kubectl config use-context gardener-prow-trusted
      kubectl -n prow create secret generic oauth-cookie-secret --from-literal=secret=$(openssl rand -base64 32)
  - kubeconfig (ref test-infra guide, needs to be present in the prow and test-pods namespace of the prow-trusted cluster)
    - add two contexts: the prow cluster as gardener-prow-trusted and the build/workload cluster as gardener-prow-build
    - gardener-prow-trusted context should use the in-cluster ServiceAccount token and CA file, so that all Prow components are bound to their respective RBAC roles
    - gardener-prow-build needs to be bound to the cluster-admin role. The gencred utility can be used to easily create a ServiceAccount and ClusterRoleBinding and retrieve the ServiceAccount token.
    - Template:
        apiVersion: v1
        kind: Config
        current-context: gardener-prow-build # default cluster
        contexts:
        - name: gardener-prow-trusted
          context:
            cluster: gardener-prow-trusted
            user: gardener-prow-trusted-token
        - name: gardener-prow-build
          context:
            cluster: gardener-prow-build
            user: gardener-prow-build-token
        clusters:
        - name: gardener-prow-trusted
          cluster: # in-cluster config
            server: 'https://kubernetes.default.svc'
            certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        - name: gardener-prow-build
          cluster:
            server: <<workload-cluster-api-server-address>>
            certificate-authority-data: <<base64-encoded-CA-bundle>>
        users:
        - name: gardener-prow-trusted-token
          user:
            tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token # use in-cluster config
        - name: gardener-prow-build-token
          user:
            token: <<service-account-token-with-cluster-admin-permissions>> # generated via gencred
  - slack-token for the Gardener Prow Slack App in the Gardener Project workspace
    - follow the test-infra guide for setting up the Slack App
    - this is used by crier to report job status changes (e.g., test failures) to dedicated Slack channels
      - generally, failures/errors of periodic, postsubmit and batch jobs are reported to #test-failures
      - job status changes concerning the prow infrastructure itself (e.g., deploy and autobump jobs) are reported to #prow-alerts
    - this token is also used by hook (slack plugin) to post merge warnings to #prow-alerts
  - alertmanager-slack: URLs for incoming webhooks for the #gardener-prow-alerts channel in the SAP Slack workspace
    - alertmanager instances in both clusters use an incoming webhook to post monitoring alerts to Slack
    - different webhooks are used for the two instances
    - for both clusters (prow-trusted and prow-build) do the following:
      - follow https://api.slack.com/incoming-webhooks and set up a webhook for posting to #gardener-prow-alerts
      - create a secret in the monitoring namespace of the respective cluster with the Webhook URL under key api_url
  - grafana-admin (admin user password)
      kubectl config use-context gardener-prow-trusted
      kubectl -n monitoring create secret generic grafana-admin --from-literal=admin_password=$(openssl rand -base64 32)
      kubectl config use-context gardener-prow-build
      kubectl -n monitoring create secret generic grafana-admin --from-literal=admin_password=$(openssl rand -base64 32)
- Deploy Prow components. The initial deployment has to be done manually; later on, changes to the components are deployed automatically once merged into master.
    ./config/prow/deploy.sh
- Bootstrap Prow configuration/jobs. This initial configuration has to be done manually; later on, changes to configuration and jobs are applied automatically by the updateconfig plugin once merged into master. The bootstrap tool does not work with the kubectl OIDC auth plugin. Thus, for the initial bootstrapping run, you will need a kubeconfig with a token for the trusted cluster.
    ./hack/boostrap-config.sh
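Several of the secrets in the steps above are filled with values from openssl rand directly inside the kubectl command. As an optional variation, the values can be generated first and sanity-checked before a secret is created. The following is a minimal sketch of that idea; the helper is_valid_hmac and the variable names are hypothetical, and the kubectl call is only shown as a comment.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Succeeds if the argument consists of exactly 40 lowercase hex characters,
# which is the shape produced by `openssl rand -hex 20`.
is_valid_hmac() {
  [[ "$1" =~ ^[0-9a-f]{40}$ ]]
}

if command -v openssl >/dev/null 2>&1; then
  hmac_token="$(openssl rand -hex 20)"        # 20 random bytes, hex encoded
  cookie_secret="$(openssl rand -base64 32)"  # 32 random bytes, base64 encoded

  if is_valid_hmac "$hmac_token"; then
    echo "hmac token has the expected shape"
  else
    echo "unexpected hmac token format" >&2
  fi
  echo "cookie secret length: ${#cookie_secret}"

  # The values could then be passed to the commands from the steps above, e.g.:
  #   kubectl -n prow create secret generic hmac-token --from-literal=hmac="$hmac_token"
else
  echo "openssl not found, skipping secret generation" >&2
fi
```

Holding the values in variables also makes it possible to reuse the same value in more than one namespace, which the inline form cannot do.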
The getting started guide in kubernetes/test-infra is a good starting point for further investigations.
A monitoring stack based on kube-prometheus plus test-infra monitoring capabilities is installed in the prow clusters:
- prometheus-operator
- alertmanager (cluster with 3 replicas for HA)
- prometheus (2 replicas for HA)
- blackbox-exporter
- kube-state-metrics
- grafana
Alertmanager will send Slack alerts in #gardener-prow-alerts in the SAP Slack workspace.
Grafana is available publicly at https://monitoring.prow.gardener.cloud (trusted cluster) and https://monitoring-build.prow.gardener.cloud (build cluster).
Service account tokens of the GCP service accounts we are using can be rotated using the ./hack/rotate-secrets.sh script. It covers the following service accounts:
- GCP infrastructure service account
- GCP storage service account
- Service account for gcr.io