Skip to content

Commit

Permalink
feat(engine): add default workflow limits (#466)
Browse files Browse the repository at this point in the history
  • Loading branch information
masontikhonov authored May 27, 2024
1 parent 6c83ddf commit 76f56af
Show file tree
Hide file tree
Showing 7 changed files with 66 additions and 8 deletions.
6 changes: 2 additions & 4 deletions charts/cf-runtime/Chart.yaml
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
apiVersion: v2
description: A Helm chart for Codefresh Runner
name: cf-runtime
version: 6.3.28
version: 6.3.29
keywords:
- codefresh
- runner
Expand All @@ -17,10 +17,8 @@ annotations:
artifacthub.io/containsSecurityUpdates: "false"
# Supported kinds: `added`, `changed`, `deprecated`, `removed`, `fixed`, `security`:
artifacthub.io/changes: |
- kind: changed
description: Upgrade the engine to v1.170.0
- kind: added
description: Allows to force sequential pull of compose images by engine
description: Add default workflow limits.
dependencies:
- name: cf-common
repository: oci://quay.io/codefresh/charts
Expand Down
14 changes: 12 additions & 2 deletions charts/cf-runtime/README.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
## Codefresh Runner

![Version: 6.3.28](https://img.shields.io/badge/Version-6.3.28-informational?style=flat-square)
![Version: 6.3.29](https://img.shields.io/badge/Version-6.3.29-informational?style=flat-square)

Helm chart for deploying [Codefresh Runner](https://codefresh.io/docs/docs/installation/codefresh-runner/) to Kubernetes.

Expand Down Expand Up @@ -1034,7 +1034,7 @@ Go to [https://<YOUR_ONPREM_DOMAIN_HERE>/admin/runtime-environments/system](http
| runtime.dind.userVolumeMounts | object | `{}` | Add extra volume mounts |
| runtime.dind.userVolumes | object | `{}` | Add extra volumes |
| runtime.dindDaemon | object | See below | DinD pod daemon config |
| runtime.engine | object | `{"affinity":{},"command":["npm","run","start"],"env":{"CONTAINER_LOGGER_EXEC_CHECK_INTERVAL_MS":"1000","FORCE_COMPOSE_SERIAL_PULL":"false","LOGGER_LEVEL":"debug","LOG_OUTGOING_HTTP_REQUESTS":"false"},"image":{"registry":"quay.io","repository":"codefresh/engine","tag":"1.170.0"},"nodeSelector":{},"podAnnotations":{},"podLabels":{},"resources":{"limits":{"cpu":"1000m","memory":"2048Mi"},"requests":{"cpu":"100m","memory":"128Mi"}},"runtimeImages":{"COMPOSE_IMAGE":"quay.io/codefresh/compose:v2.20.3-1.4.0","CONTAINER_LOGGER_IMAGE":"quay.io/codefresh/cf-container-logger:1.10.3","CR_6177_FIXER":"quay.io/codefresh/alpine:edge","DOCKER_BUILDER_IMAGE":"quay.io/codefresh/cf-docker-builder:1.3.11","DOCKER_PULLER_IMAGE":"quay.io/codefresh/cf-docker-puller:8.0.17","DOCKER_PUSHER_IMAGE":"quay.io/codefresh/cf-docker-pusher:6.0.15","DOCKER_TAG_PUSHER_IMAGE":"quay.io/codefresh/cf-docker-tag-pusher:1.3.13","FS_OPS_IMAGE":"quay.io/codefresh/fs-ops:1.2.3","GC_BUILDER_IMAGE":"quay.io/codefresh/cf-gc-builder:0.5.3","GIT_CLONE_IMAGE":"quay.io/codefresh/cf-git-cloner:10.1.26","KUBE_DEPLOY":"quay.io/codefresh/cf-deploy-kubernetes:16.1.11","PIPELINE_DEBUGGER_IMAGE":"quay.io/codefresh/cf-debugger:1.3.0","TEMPLATE_ENGINE":"quay.io/codefresh/pikolo:0.14.0"},"schedulerName":"","serviceAccount":"codefresh-engine","tolerations":[],"userEnvVars":[]}` | Parameters for Engine pod (aka "pipeline" orchestrator). |
| runtime.engine | object | `{"affinity":{},"command":["npm","run","start"],"env":{"CONTAINER_LOGGER_EXEC_CHECK_INTERVAL_MS":"1000","FORCE_COMPOSE_SERIAL_PULL":"false","LOGGER_LEVEL":"debug","LOG_OUTGOING_HTTP_REQUESTS":"false"},"image":{"registry":"quay.io","repository":"codefresh/engine","tag":"1.170.0"},"nodeSelector":{},"podAnnotations":{},"podLabels":{},"resources":{"limits":{"cpu":"1000m","memory":"2048Mi"},"requests":{"cpu":"100m","memory":"128Mi"}},"runtimeImages":{"COMPOSE_IMAGE":"quay.io/codefresh/compose:v2.20.3-1.4.0","CONTAINER_LOGGER_IMAGE":"quay.io/codefresh/cf-container-logger:1.10.3","CR_6177_FIXER":"quay.io/codefresh/alpine:edge","DOCKER_BUILDER_IMAGE":"quay.io/codefresh/cf-docker-builder:1.3.11","DOCKER_PULLER_IMAGE":"quay.io/codefresh/cf-docker-puller:8.0.17","DOCKER_PUSHER_IMAGE":"quay.io/codefresh/cf-docker-pusher:6.0.15","DOCKER_TAG_PUSHER_IMAGE":"quay.io/codefresh/cf-docker-tag-pusher:1.3.13","FS_OPS_IMAGE":"quay.io/codefresh/fs-ops:1.2.3","GC_BUILDER_IMAGE":"quay.io/codefresh/cf-gc-builder:0.5.3","GIT_CLONE_IMAGE":"quay.io/codefresh/cf-git-cloner:10.1.26","KUBE_DEPLOY":"quay.io/codefresh/cf-deploy-kubernetes:16.1.11","PIPELINE_DEBUGGER_IMAGE":"quay.io/codefresh/cf-debugger:1.3.0","TEMPLATE_ENGINE":"quay.io/codefresh/pikolo:0.14.0"},"schedulerName":"","serviceAccount":"codefresh-engine","tolerations":[],"userEnvVars":[],"workflowLimits":{"MAXIMUM_ALLOWED_TIME_BEFORE_PRE_STEPS_SUCCESS":600,"MAXIMUM_ALLOWED_WORKFLOW_AGE_BEFORE_TERMINATION":86400,"MAXIMUM_ELECTED_STATE_AGE_ALLOWED":900,"MAXIMUM_RETRY_ATTEMPTS_ALLOWED":20,"MAXIMUM_TERMINATING_STATE_AGE_ALLOWED":900,"MAXIMUM_TERMINATING_STATE_AGE_ALLOWED_WITHOUT_UPDATE":300,"TIME_ENGINE_INACTIVE_UNTIL_TERMINATION":300,"TIME_ENGINE_INACTIVE_UNTIL_UNHEALTHY":60,"TIME_INACTIVE_UNTIL_TERMINATION":2700}}` | Parameters for Engine pod (aka "pipeline" orchestrator). |
| runtime.engine.affinity | object | `{}` | Set affinity |
| runtime.engine.command | list | `["npm","run","start"]` | Set container command. |
| runtime.engine.env | object | `{"CONTAINER_LOGGER_EXEC_CHECK_INTERVAL_MS":"1000","FORCE_COMPOSE_SERIAL_PULL":"false","LOGGER_LEVEL":"debug","LOG_OUTGOING_HTTP_REQUESTS":"false"}` | Set additional env vars. |
Expand All @@ -1052,6 +1052,16 @@ Go to [https://<YOUR_ONPREM_DOMAIN_HERE>/admin/runtime-environments/system](http
| runtime.engine.serviceAccount | string | `"codefresh-engine"` | Set service account for pod. |
| runtime.engine.tolerations | list | `[]` | Set tolerations. |
| runtime.engine.userEnvVars | list | `[]` | Set extra env vars |
| runtime.engine.workflowLimits | object | `{"MAXIMUM_ALLOWED_TIME_BEFORE_PRE_STEPS_SUCCESS":600,"MAXIMUM_ALLOWED_WORKFLOW_AGE_BEFORE_TERMINATION":86400,"MAXIMUM_ELECTED_STATE_AGE_ALLOWED":900,"MAXIMUM_RETRY_ATTEMPTS_ALLOWED":20,"MAXIMUM_TERMINATING_STATE_AGE_ALLOWED":900,"MAXIMUM_TERMINATING_STATE_AGE_ALLOWED_WITHOUT_UPDATE":300,"TIME_ENGINE_INACTIVE_UNTIL_TERMINATION":300,"TIME_ENGINE_INACTIVE_UNTIL_UNHEALTHY":60,"TIME_INACTIVE_UNTIL_TERMINATION":2700}` | Set workflow limits. |
| runtime.engine.workflowLimits.MAXIMUM_ALLOWED_TIME_BEFORE_PRE_STEPS_SUCCESS | int | `600` | Maximum time allowed to the engine to wait for the pre-steps (aka "Initializing Process") to succeed; seconds. |
| runtime.engine.workflowLimits.MAXIMUM_ALLOWED_WORKFLOW_AGE_BEFORE_TERMINATION | int | `86400` | Maximum time for workflow execution; seconds. |
| runtime.engine.workflowLimits.MAXIMUM_ELECTED_STATE_AGE_ALLOWED | int | `900` | Maximum time allowed to workflow to spend in "elected" state; seconds. |
| runtime.engine.workflowLimits.MAXIMUM_RETRY_ATTEMPTS_ALLOWED | int | `20` | Maximum retry attempts allowed for workflow. |
| runtime.engine.workflowLimits.MAXIMUM_TERMINATING_STATE_AGE_ALLOWED | int | `900` | Maximum time allowed to workflow to spend in "terminating" state until force terminated; seconds. |
| runtime.engine.workflowLimits.MAXIMUM_TERMINATING_STATE_AGE_ALLOWED_WITHOUT_UPDATE | int | `300` | Maximum time allowed to workflow to spend in "terminating" state without logs activity until force terminated; seconds. |
| runtime.engine.workflowLimits.TIME_ENGINE_INACTIVE_UNTIL_TERMINATION | int | `300` | Time since the last health check report after which workflow is terminated; seconds. |
| runtime.engine.workflowLimits.TIME_ENGINE_INACTIVE_UNTIL_UNHEALTHY | int | `60` | Time since the last health check report after which the engine is considered unhealthy; seconds. |
| runtime.engine.workflowLimits.TIME_INACTIVE_UNTIL_TERMINATION | int | `2700` | Time since the last workflow logs activity after which workflow is terminated; seconds. |
| runtime.gencerts | object | See below | Parameters for `gencerts-dind` post-upgrade/install hook |
| runtime.inCluster | bool | `true` | (for On-Premise only) Set inCluster runtime |
| runtime.patch | object | See below | Parameters for `runtime-patch` post-upgrade/install hook |
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ runtimeScheduler:
userEnvVars: {{- toYaml . | nindent 4 }}
{{- end }}
{{- with $engineContext.workflowLimits }}
workflowLimits: {{ toYaml . | nindent 4 }}
workflowLimits: {{- toYaml . | nindent 4 }}
{{- end }}
cluster:
namespace: {{ .Release.Namespace }}
Expand Down Expand Up @@ -181,4 +181,4 @@ appProxy:
{{- if not .Values.runtime.agent }}
systemHybrid: true
{{- end }}
{{- end }}
{{- end }}
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,16 @@ tests:
KUBE_DEPLOY: "somedomain.io/codefresh/cf-deploy-kubernetes:tagoverride"
PIPELINE_DEBUGGER_IMAGE: "somedomain.io/codefresh/cf-debugger:tagoverride"
TEMPLATE_ENGINE: "somedomain.io/codefresh/pikolo:tagoverride"
workflowLimits:
MAXIMUM_ALLOWED_TIME_BEFORE_PRE_STEPS_SUCCESS: 600
MAXIMUM_ALLOWED_WORKFLOW_AGE_BEFORE_TERMINATION: 86400
MAXIMUM_ELECTED_STATE_AGE_ALLOWED: 900
MAXIMUM_RETRY_ATTEMPTS_ALLOWED: 20
MAXIMUM_TERMINATING_STATE_AGE_ALLOWED: 900
MAXIMUM_TERMINATING_STATE_AGE_ALLOWED_WITHOUT_UPDATE: 300
TIME_ENGINE_INACTIVE_UNTIL_TERMINATION: 300
TIME_ENGINE_INACTIVE_UNTIL_UNHEALTHY: 60
TIME_INACTIVE_UNTIL_TERMINATION: 2700
cluster:
namespace: codefresh
serviceAccount: codefresh-engine
Expand Down
10 changes: 10 additions & 0 deletions charts/cf-runtime/tests/runtime/runtime_onprem_test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -60,6 +60,16 @@ tests:
KUBE_DEPLOY: "quay.io/codefresh/cf-deploy-kubernetes:tagoverride"
PIPELINE_DEBUGGER_IMAGE: "quay.io/codefresh/cf-debugger:tagoverride"
TEMPLATE_ENGINE: "quay.io/codefresh/pikolo:tagoverride"
workflowLimits:
MAXIMUM_ALLOWED_TIME_BEFORE_PRE_STEPS_SUCCESS: 600
MAXIMUM_ALLOWED_WORKFLOW_AGE_BEFORE_TERMINATION: 86400
MAXIMUM_ELECTED_STATE_AGE_ALLOWED: 900
MAXIMUM_RETRY_ATTEMPTS_ALLOWED: 20
MAXIMUM_TERMINATING_STATE_AGE_ALLOWED: 900
MAXIMUM_TERMINATING_STATE_AGE_ALLOWED_WITHOUT_UPDATE: 300
TIME_ENGINE_INACTIVE_UNTIL_TERMINATION: 300
TIME_ENGINE_INACTIVE_UNTIL_UNHEALTHY: 60
TIME_INACTIVE_UNTIL_TERMINATION: 2700
cluster:
namespace: codefresh
serviceAccount: service-account-override
Expand Down
10 changes: 10 additions & 0 deletions charts/cf-runtime/tests/runtime/runtime_test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,16 @@ tests:
secretKeyRef:
key: token
name: alice-secret
workflowLimits:
MAXIMUM_ALLOWED_TIME_BEFORE_PRE_STEPS_SUCCESS: 600
MAXIMUM_ALLOWED_WORKFLOW_AGE_BEFORE_TERMINATION: 86400
MAXIMUM_ELECTED_STATE_AGE_ALLOWED: 900
MAXIMUM_RETRY_ATTEMPTS_ALLOWED: 20
MAXIMUM_TERMINATING_STATE_AGE_ALLOWED: 900
MAXIMUM_TERMINATING_STATE_AGE_ALLOWED_WITHOUT_UPDATE: 300
TIME_ENGINE_INACTIVE_UNTIL_TERMINATION: 300
TIME_ENGINE_INACTIVE_UNTIL_UNHEALTHY: 60
TIME_INACTIVE_UNTIL_TERMINATION: 2700
cluster:
namespace: codefresh
serviceAccount: service-account-override
Expand Down
20 changes: 20 additions & 0 deletions charts/cf-runtime/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -540,6 +540,26 @@ runtime:
LOGGER_LEVEL: 'debug'
# -- Enable debug-level logging of outgoing HTTP/HTTPS requests
LOG_OUTGOING_HTTP_REQUESTS: 'false'
# -- Set workflow limits.
workflowLimits:
# -- Maximum time allowed to the engine to wait for the pre-steps (aka "Initializing Process") to succeed; seconds.
MAXIMUM_ALLOWED_TIME_BEFORE_PRE_STEPS_SUCCESS: 600
# -- Maximum time for workflow execution; seconds.
MAXIMUM_ALLOWED_WORKFLOW_AGE_BEFORE_TERMINATION: 86400
# -- Maximum time allowed to workflow to spend in "elected" state; seconds.
MAXIMUM_ELECTED_STATE_AGE_ALLOWED: 900
# -- Maximum retry attempts allowed for workflow.
MAXIMUM_RETRY_ATTEMPTS_ALLOWED: 20
# -- Maximum time allowed to workflow to spend in "terminating" state until force terminated; seconds.
MAXIMUM_TERMINATING_STATE_AGE_ALLOWED: 900
# -- Maximum time allowed to workflow to spend in "terminating" state without logs activity until force terminated; seconds.
MAXIMUM_TERMINATING_STATE_AGE_ALLOWED_WITHOUT_UPDATE: 300
# -- Time since the last health check report after which workflow is terminated; seconds.
TIME_ENGINE_INACTIVE_UNTIL_TERMINATION: 300
# -- Time since the last health check report after which the engine is considered unhealthy; seconds.
TIME_ENGINE_INACTIVE_UNTIL_UNHEALTHY: 60
# -- Time since the last workflow logs activity after which workflow is terminated; seconds.
TIME_INACTIVE_UNTIL_TERMINATION: 2700
# -- Set pod annotations.
podAnnotations: {}
# -- Set pod labels.
Expand Down

0 comments on commit 76f56af

Please sign in to comment.