diff --git a/README.md b/README.md index 968a618..5b1e990 100644 --- a/README.md +++ b/README.md @@ -5,22 +5,23 @@ Hosts container images and wrapper for running scenarios supported by [Krkn](htt ### Supported chaos scenarios -Scenario | Description | Working -------------------------------------------- | --------------------------------------------------------------------------------------------- | -------------------- | -[Pod failures](docs/pod-scenarios.md) | Injects pod failures | :heavy_check_mark: | -[Container failures](docs/container-scenarios.md) | Injects container failures based on the provided kill signal | :heavy_check_mark: | -[Node failures](docs/node-scenarios.md) | Injects node failure through OpenShift/Kubernetes, cloud API's | :heavy_check_mark: | +Scenario | Description | Working +------------------------------------------- |------------------------------------------------------------------| -------------------- | +[Pod failures](docs/pod-scenarios.md) | Injects pod failures | :heavy_check_mark: | +[Container failures](docs/container-scenarios.md) | Injects container failures based on the provided kill signal | :heavy_check_mark: | +[Node failures](docs/node-scenarios.md) | Injects node failure through OpenShift/Kubernetes, cloud API's | :heavy_check_mark: | [zone outages](docs/zone-outages.md) | Creates zone outage to observe the impact on the cluster, applications | :heavy_check_mark: | -[time skew](docs/time-scenarios.md) | Skews the time and date | :heavy_check_mark: | -[Node cpu hog](docs/node-cpu-hog.md) | Hogs CPU on the targeted nodes | :heavy_check_mark: | -[Node memory hog](docs/node-memory-hog.md) | Hogs memory on the targeted nodes | :heavy_check_mark: | -[Node IO hog](docs/node-io-hog.md) | Hogs io on the targeted nodes | :heavy_check_mark: | -[Service Disruption](docs/service-disruption-scenarios.md) | Deleting all objects within a namespace | :heavy_check_mark: | +[time skew](docs/time-scenarios.md) | Skews the time and date | 
:heavy_check_mark: | +[Node cpu hog](docs/node-cpu-hog.md) | Hogs CPU on the targeted nodes | :heavy_check_mark: | +[Node memory hog](docs/node-memory-hog.md) | Hogs memory on the targeted nodes | :heavy_check_mark: | +[Node IO hog](docs/node-io-hog.md) | Hogs io on the targeted nodes | :heavy_check_mark: | +[Service Disruption](docs/service-disruption-scenarios.md) | Deleting all objects within a namespace | :heavy_check_mark: | [Application outages](docs/application-outages.md) | Isolates application Ingress/Egress traffic to observe the impact on dependent applications and recovery/initialization timing | :heavy_check_mark: | [Power Outages](docs/power-outages.md) | Shuts down the cluster for the specified duration and turns it back on to check the cluster health | :heavy_check_mark: | [PVC disk fill](docs/pvc-scenarios.md) | Fills up a given PersistenVolumeClaim by creating a temp file on the PVC from a pod associated with it | :heavy_check_mark: | [Network Chaos](docs/network-chaos.md) | Introduces network latency, packet loss, bandwidth restriction in the egress traffic of a Node's interface using tc and Netem | :heavy_check_mark: | -[Pod Network Chaos](docs/pod-network-chaos.md) | Introducs network chaos at pod level | :heavy_check_mark: | +[Pod Network Chaos](docs/pod-network-chaos.md) | Introduces network chaos at pod level | :heavy_check_mark: | +[Service Hijacking](docs/service-hijacking.md) | Hijacks a service's HTTP traffic to simulate custom HTTP responses | :heavy_check_mark: | ### Utilities diff --git a/common_run.sh b/common_run.sh index f699c48..7a09072 100755 --- a/common_run.sh +++ b/common_run.sh @@ -37,11 +37,22 @@ check_cluster_version() { kubectl get clusterversion || log "Not an OpenShift environment" } +# Sets the distribution to kubernetes in the krkn config when the platform is not OpenShift. +# Included automatically in all builds to keep krkn compatible with both platforms. +# Called in the checks function below. + +set_kubernetes_platform() { + if ! 
kubectl get clusterversion; + then + yq -i '.kraken.distribution="kubernetes"' /root/kraken/config/config.yaml.template + fi +} checks() { check_oc check_kubectl check_cluster_version + set_kubernetes_platform } # Config substitutions diff --git a/docker-compose.yaml b/docker-compose.yaml index 0880b2f..8d56493 100644 --- a/docker-compose.yaml +++ b/docker-compose.yaml @@ -85,3 +85,8 @@ services: context: ./ dockerfile: ./chaos-recommender/Dockerfile image: quay.io/krkn-chaos/krkn-hub:chaos-recommender + service-hijacking: + build: + context: ./ + dockerfile: ./service-hijacking/Dockerfile + image: quay.io/krkn-chaos/krkn-hub:service-hijacking diff --git a/docs/service-hijacking.md b/docs/service-hijacking.md new file mode 100644 index 0000000..a920a9c --- /dev/null +++ b/docs/service-hijacking.md @@ -0,0 +1,66 @@ +### Service Hijacking scenario +This scenario reroutes traffic intended for a target service to a custom web service that is automatically deployed by Krkn. +This web service responds with user-defined HTTP statuses, MIME types, and bodies. +For more details, please refer to the following [documentation](https://github.com/krkn-chaos/krkn/blob/main/docs/service_hijacking_scenarios.md). + +#### Run +Unlike other krkn-hub scenarios, this one requires a specific configuration due to its unique structure. +You must set up the scenario in a local file following the [scenario syntax](https://github.com/krkn-chaos/krkn/blob/main/scenarios/kube/service_hijacking.yaml), +and then pass this file's base64-encoded content to the container via the SCENARIO_BASE64 variable. + +If enabling [Cerberus](https://github.com/krkn-chaos/krkn#kraken-scenario-passfail-criteria-and-report) to monitor the cluster and pass/fail the scenario post chaos, refer to the [docs](https://github.com/redhat-chaos/krkn-hub/tree/main/docs/cerberus.md). +Make sure to start it before injecting the chaos and set the `CERBERUS_ENABLED` +environment variable so that the chaos injection container autoconnects. 
+ +``` +$ podman run --name= \ + -e SCENARIO_BASE64="$(base64 -w0 )" \ + -v :/root/.kube/config:Z quay.io/krkn-chaos/krkn-hub:service-hijacking + +$ podman logs -f # Streams Kraken logs +$ podman inspect --format "{{.State.ExitCode}}" # Outputs exit code which can be considered as pass/fail for the scenario +``` + +``` +$ export SCENARIO_BASE64="$(base64 -w0 )" +$ docker run $(./get_docker_params.sh) --name= \ + --net=host \ + -v :/root/.kube/config:Z \ + -d quay.io/krkn-chaos/krkn-hub:service-hijacking +OR +$ docker run --name= -e SCENARIO_BASE64="$(base64 -w0 )" \ + --net=host \ + -v :/root/.kube/config:Z \ + -d quay.io/krkn-chaos/krkn-hub:service-hijacking + +$ docker logs -f # Streams Kraken logs +$ docker inspect --format "{{.State.ExitCode}}" # Outputs exit code which can be considered as pass/fail for the scenario +``` + + +#### Supported parameters + +The following environment variables can be set on the host running the container to tweak the scenario/faults being injected: + +ex.) +`export =` + +See the list of variables that apply to all scenarios [here](all_scenarios_env.md); they can be set in addition to these scenario-specific variables. + +| Parameter | Description | +|-----------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| SCENARIO_BASE64 | Base64 encoded service-hijacking scenario file. 
Note that the __-w0__ option in the command substitution `SCENARIO_BASE64="$(base64 -w0 )"` is __mandatory__ in order to remove line breaks from the base64 command output | + + +**NOTE** In case of using custom metrics profile or alerts profile when `CAPTURE_METRICS` or `ENABLE_ALERTS` is enabled, mount the metrics profile from the host on which the container is run using podman/docker under `/root/kraken/config/metrics-aggregated.yaml` and `/root/kraken/config/alerts`. For example: +``` +$ podman run -e SCENARIO_BASE64="$(base64 -w0 )" \ + --name= \ + --net=host \ + --env-host=true \ + -v :/root/kraken/config/metrics-aggregated.yaml \ + -v :/root/kraken/config/alerts \ + -v :/root/.kube/config:Z \ + -d quay.io/krkn-chaos/krkn-hub:service-hijacking +``` + diff --git a/service-hijacking/Dockerfile b/service-hijacking/Dockerfile new file mode 100644 index 0000000..2eb9081 --- /dev/null +++ b/service-hijacking/Dockerfile @@ -0,0 +1,23 @@ +# Dockerfile for kraken + +FROM quay.io/krkn-chaos/krkn:latest + +ENV KUBECONFIG /root/.kube/config + +# Install dependencies +RUN yum install -y which +RUN pip install jsonschema + +# Copy configurations +COPY config.yaml.template /root/kraken/config/config.yaml.template +COPY service-hijacking/env.sh /root/env.sh +COPY service-hijacking/run.sh /root/run.sh +COPY env.sh /root/main_env.sh + + +COPY service-hijacking/config-schema.json /root/kraken/scenarios/service-hijacking-schema.json +COPY service-hijacking/validate_config.py /root/validate_config.py + +COPY common_run.sh /root/common_run.sh + +ENTRYPOINT /root/run.sh diff --git a/service-hijacking/README.md b/service-hijacking/README.md new file mode 100644 index 0000000..b758792 --- /dev/null +++ b/service-hijacking/README.md @@ -0,0 +1 @@ +See [doc](https://github.com/redhat-chaos/krkn-hub/blob/main/docs/service-hijacking.md) for how to run and all the variables listed \ No newline at end of file diff --git a/service-hijacking/config-schema.json 
b/service-hijacking/config-schema.json new file mode 100644 index 0000000..423cee2 --- /dev/null +++ b/service-hijacking/config-schema.json @@ -0,0 +1,77 @@ +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "type": "object", + "properties": { + "service_target_port": { + "oneOf": [ + { + "type": "string" + }, + { + "type": "integer" + } + ] + }, + "service_name": { + "type": "string" + }, + "service_namespace": { + "type": "string" + }, + "image": { + "type": "string" + }, + "chaos_duration": { + "type": "integer" + }, + "plan": { + "type": "array", + "minItems": 1, + "items": { + "type": "object", + "properties": { + "resource": { + "type": "string" + }, + "steps": { + "type": "object", + "patternProperties": { + "^[A-Z]+$": { + "type": "array", + "minItems": 1, + "items": { + "type": "object", + "properties": { + "duration": { + "type": "integer" + }, + "status": { + "type": "integer" + }, + "mime_type": { + "type": "string" + }, + "payload": { + "type": "string" + } + }, + "required": ["duration", "status", "mime_type", "payload"] + } + } + }, + "additionalProperties": false + } + }, + "required": ["resource", "steps"] + } + } + }, + "required": [ + "service_target_port", + "service_name", + "service_namespace", + "image", + "chaos_duration", + "plan" + ] +} diff --git a/service-hijacking/env.sh b/service-hijacking/env.sh new file mode 100755 index 0000000..22771e1 --- /dev/null +++ b/service-hijacking/env.sh @@ -0,0 +1,7 @@ +#!/bin/bash + +# Vars and respective defaults +export SCENARIO_BASE64=${SCENARIO_BASE64:=1} +export SCENARIO_TYPE="service_hijacking" +export SCENARIO_FILE="scenarios/service_hijacking.yaml" +export SCENARIO_POST_ACTION=${SCENARIO_POST_ACTION:=""} diff --git a/service-hijacking/run.sh b/service-hijacking/run.sh new file mode 100755 index 0000000..686da1a --- /dev/null +++ b/service-hijacking/run.sh @@ -0,0 +1,41 @@ +#!/bin/bash + +set -ex + +# Source env.sh to read all the vars +source /root/main_env.sh +source /root/env.sh + 
+source /root/common_run.sh +checks + +# check if SCENARIO_BASE64 is set + +[ $SCENARIO_BASE64 == 1 ] && \ +( echo "[ERROR] please set SCENARIO_BASE64 variable with a valid base64 encoded hijacking scenario +e.g. podman run -e SCENARIO_BASE64=\$(base64 -w0 ~/krkn/scenarios/kube/service_hijacking.yaml) [...] " && \ +exit 1 ) + + +# Decode SCENARIO_BASE64 into the scenario file +echo $SCENARIO_BASE64 | base64 -d >> /root/kraken/scenarios/service_hijacking.yaml || \ +(echo -e "[ERROR] Unable to decode SCENARIO_BASE64, bad base64 format please refer to documentation" \ +&& exit 1) + +# Validate scenario against schema + +python3.9 /root/validate_config.py -y /root/kraken/scenarios/service_hijacking.yaml \ + -s /root/kraken/scenarios/service-hijacking-schema.json + + +# replace env variables + +envsubst < /root/kraken/config/config.yaml.template > /root/kraken/config/service_hijacking_config.yaml + +# Run Kraken +cd /root/kraken + +cat scenarios/service_hijacking.yaml +cat config/service_hijacking_config.yaml + +python3.9 run_kraken.py --config=config/service_hijacking_config.yaml diff --git a/service-hijacking/validate_config.py b/service-hijacking/validate_config.py new file mode 100644 index 0000000..5df7702 --- /dev/null +++ b/service-hijacking/validate_config.py @@ -0,0 +1,37 @@ +import os.path +import sys + +from jsonschema import validate, ValidationError +import yaml +import argparse + +parser = argparse.ArgumentParser(description="python validate_config.py -y input.yaml -s schema.json") +required_args = parser.add_argument_group('Required arguments') +required_args.add_argument("-y", "--yaml", help="YAML file to validate", required=True) +required_args.add_argument("-s", "--schema", help="JSON schema used to validate the YAML file", required=True) +args = parser.parse_args() + +if not os.path.exists(args.yaml): + print(f"[ERROR] file not found: {args.yaml}") + sys.exit(1) +if not os.path.exists(args.schema): + print(f"[ERROR] file not found: {args.schema}") + 
sys.exit(1) +try: + with open(args.yaml) as stream: + yaml_file = yaml.safe_load(stream) + with open(args.schema) as stream: + schema = yaml.safe_load(stream) + + validate(yaml_file, schema) + print("[SUCCESS] scenario configuration successfully validated") + sys.exit(0) +except ValidationError as e: + print("[ERROR] Bad configuration file, please refer to the Krkn Documentation https://github.com/krkn-chaos/krkn/blob/main/docs/service_hijacking_scenarios.md") + print(str(e)) + sys.exit(1) +except Exception as e: + print(f"[ERROR] Failed to validate file with exception: {str(e)}") + sys.exit(1) + +
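
Reviewer note on the `-w0` requirement documented in docs/service-hijacking.md: the sketch below is not part of the change set. It uses only Python's standard `base64` module to show the difference between a single-line encoding (comparable to `base64 -w0`) and a wrapped one (comparable to `base64` with its default 76-column wrapping), and checks that the single-line form decodes back to the original text, as `base64 -d` does in run.sh. The scenario keys follow config-schema.json; every value, including the image name, is a made-up placeholder, and the helper names are hypothetical.

```python
import base64

# Hypothetical scenario text: keys follow config-schema.json, values are placeholders.
SCENARIO = """\
service_target_port: http-web-svc
service_name: nginx-service
service_namespace: default
image: example.registry/hijacker:latest
chaos_duration: 30
plan:
  - resource: "/list/index.php"
    steps:
      GET:
        - duration: 15
          status: 500
          mime_type: "application/json"
          payload: "{}"
"""


def encode_single_line(text: str) -> str:
    """Base64 without line breaks, comparable to `base64 -w0`."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def encode_wrapped(text: str) -> str:
    """Base64 with a newline after every 76 characters, comparable to default wrapping."""
    return base64.encodebytes(text.encode("utf-8")).decode("ascii")


def decode(encoded: str) -> str:
    """Inverse of encode_single_line, comparable to `base64 -d`."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    single = encode_single_line(SCENARIO)
    wrapped = encode_wrapped(SCENARIO)
    print("line breaks in single-line form:", single.count("\n"))
    print("line breaks in wrapped form:", wrapped.count("\n"))
    print("round trip ok:", decode(single) == SCENARIO)
```

The single-line form is the one intended for `SCENARIO_BASE64`; the wrapped form carries embedded newlines, which is what the docs ask users to avoid by passing `-w0`.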