diff --git a/CHANGELOG.md b/CHANGELOG.md index d747104..3db9e90 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,302 +1,10 @@ # Changelog -## [Unreleased][] - -[Unreleased]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.20.0...HEAD - -- Try to read proxy from environment variable "HTTP_PROXY" and set it - appropriately if it exists -- Add the `deployment_is_fully_available` probe to wait for a deployment to be fully available [#38][38] -- Fix calls to `delete_namespaced_*` so that the `body` argument is passed - a named argument [#42][42]. A follow up to [#34][34] -- Fix calls to `delete_nodes` so that the `body` argument is passed - a named argument [#44][44]. - -[38]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/pull/38 -[42]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/issues/42 -[44]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/pull/44 - -## [0.20.0][] - 2018-03-25 - -[0.20.0]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.19.1...0.20.0 - -### Added - -- Add a probe to check pods conditions [PR#31][31] - -[31]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/pull/31 - -### Changed - -- Fix call to `delete_namespaced_pod` so that the `body` argument is passed - a named argument [#34][34] - -[34]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/issues/34 - - -## [0.19.1][] - 2018-10-08 - -[0.19.1]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.18.1...0.19.1 - -### Changed - -- As part of [#21][21], we realized that passing `None` to some parameters of the - Kubernetes client API was not the right move because, in that case, the client - turns that into a `"None"` string which is not what we want. So I had to - resort to many conditionals that make the code not as clean I'd want. Sigh! - -[21]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/issues/21 - -## [0.18.1][] - 2018-10-08 - -[0.18.1]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.18.0...0.18.1 - -### Changed - -- Fix: use `Succeeded` instead of `Completed` to filter successful pods created by a cronjob in the `all_microservices_healthy` probe. - -## [0.18.0][] - 2018-10-08 - -[0.18.0]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.17.0...0.18.0 - -### Added - -- [Codecov][codecov] integration -- Renamed `FailedActivity` to `ActivityFailed` as per [chaostoolkit 0.20.0][0.20.0]. See [PR#20][20] -- Add Ability to specify a maximum percentage of pods to be killed [PR#19][19] -- Consider `Completed` pods as healthy in the `all_microservices_healthy` probe. See [PR#23][23] -- Support a new `grace_period_seconds` parameter in the `terminate_pods` action. 
See [PR#24][24] - -[codecov]: https://codecov.io/gh/chaostoolkit/chaostoolkit-kubernetes -[0.20.0]: https://github.com/chaostoolkit/chaostoolkit-lib/blob/master/CHANGELOG.md#0200---2018-08-09 -[20]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/pull/20 -[19]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/pull/19 - -## [0.17.0][] - 2018-09-07 - -[0.17.0]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.16.2...0.17.0 - -### Added - -- List work nodes - -## [0.16.2][] - 2018-05-14 - -[0.16.2]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.16.1...0.16.2 - -### Added - -- Read version from source file without importing - -## [0.16.1][] - 2018-05-14 - -[0.16.1]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.16.0...0.16.1 - -### Added - -- Added requirements-dev.txt to MANIFEST.in so it gets packaged and distributed - -## [0.16.0][] - 2018-04-24 - -[0.16.0]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.15.0...0.16.0 - -### Added - -- Allow to pass the Kubernetes context to authenticate from by setting - the `"KUBERNETES_CONTEXT"` key in the environment or the secrets object - [#15][15] - -[15]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/issues/15 - -## [0.15.0][] - 2018-04-13 - -[0.15.0]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.14.0...0.15.0 - -### Added - -- a probe to count the number of pods -- actions to delete and create nodes -- actions to cordon, uncordon and drain nodes -- canot locate credentials automatically when ran from within a Pod if - you set the `CHAOSTOOLKIT_IN_POD: "true"` environment variable in the Pod - spec - -## [0.14.0][] - 2018-04-05 - -[0.14.0]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.13.0...0.14.0 - -### Added - -- allow to create a Kubernetes client from a Kubernetes cluster pod - -## [0.13.0][] - 2018-03-09 - -[0.13.0]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.12.0...0.13.0 - -### Added - -- `chaosk8s.pod.probes.pods_in_phase` to probe that all pods matching a label - are in a given pod Phase -- `chaosk8s.pod.probes.pods_not_in_phase` to probe that all pods matching a - label are not in a given pod Phase - -## [0.12.0][] - 2018-02-12 - -[0.12.0]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.11.0...0.12.0 - -### Changed - -- Moved the `chaosk8s.probes.read_microservice_logs` to - `chaosk8s.pod.probes.read_pod_logs` for clarity -- Make name optional for `chaosk8s.pod.probes.read_pod_logs` as it usually - more preferred to use a label for that probe -- Removed the system discovery as it wasn't used by chaostoolkit anyway - -## [0.11.0][] - 2018-01-28 - -[0.11.0]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.10.0...0.11.0 - -### Added - -- Added a pod specific set of actions - -### Changed - -- Refactor dev/test dependencies so they are not deployed on install - -## [0.10.0][] - 2018-01-22 - -[0.10.0]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.9.0...0.10.0 - -### Changed - -- activities now take a `label_selector` argument to let you adjust to your - conventions when selecting resources [#7][7] - -[7]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/issues/7 - -## [0.9.0][] - 2018-01-16 - -[0.9.0]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.8.0...0.9.0 - -### Added - -- discovery mechanism - -## [0.8.0][] - 2017-12-29 - -[0.8.0]: 
https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.7.0...0.8.0
-
-### Added
-
-- `read_microservices_logs` probe to fetch pod's logs
-
-## [0.7.0][] - 2017-12-17
-
-[0.7.0]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.6.0...0.7.0
-
-### Added
-
-- Deployment scaler action
-
-### Changed
-
-- Updated to chaostoolkit-lib 0.8.0
-
-## [0.6.0][] - 2017-12-12
-
-[0.6.0]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.5.0...0.6.0
-
-### Added
-
-- Logging at DEBUG level for investigation
-- Probe `deployment_is_not_fully_available` to wait until a deployment is not
-  fully available (its desired state is different from its current state)
-
-### Changed
-
-- Selecting on the name Label rather than Service as it's more commonly used
-- Updated chaostoolkit-lib to 0.7.0 for configuration support
-
-## [0.5.0][] - 2017-12-06
-
-[0.5.0]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.4.3...0.5.0
-
-### Changed
-
-- Updated to match chaostoolkit-lib 0.6.0 API changes
-- Probes now return `True` on success so they can be nicely used from the
-  steady state hypothesis checks
-
-## [0.4.3][] - 2017-11-23
-
-[0.4.3]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.4.2...0.4.3
-
-### Changed
-
-- Removing unwanted parameter
-
-## [0.4.2][] - 2017-11-20
-
-[0.4.2]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.4.1...0.4.2
-
-### Changed
-
-- Proper verify SSL reading of the environment key
-
-## [0.4.1][] - 2017-11-20
-
-[0.4.1]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.4.0...0.4.1
-
-### Added
-
-- Passing secrets down to client function
-
-
-## [0.4.0][] - 2017-11-20
-
-[0.4.0]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.3.0...0.4.0
-
-### Added
-
-- Can create a client from secrets
-
-
-## [0.3.0][] - 2017-11-20
-
-[0.3.0]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.2.0...0.3.0
-
-### Added
-
-- Can now authenticate to the Kubernetes API endpoint either via a token,
-  username/password or certificate/pkey. All of this via environment variable.
-  By default, still looks up for ~/kube/config if it exists
-
-
-## [0.2.0][] - 2017-10-23
-
-[0.2.0]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.1.1...0.2.0
-
-### Added
-
-- Remove check Kubernetes service by name
-
-### Changed
-
-- Do not build a universal wheel package (no Python 2 support in chaostoolkit)
-
-## [0.1.1][] - 2017-10-06
-
-[0.1.1]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/compare/0.1.0...0.1.1
-
-### Changed
-
-- Package up extra files when installed from source
-
-## [0.1.0][] - 2017-10-06
-
-[0.1.0]: https://github.com/chaostoolkit/chaostoolkit-kubernetes/tree/0.1.0
-
-### Added
-
-- Initial release
+## [Unreleased]
+
+## [0.0.1] - 2019-08-23
+### Added
+- Support for checking that a spark application is running on yarn.
+- Support for killing spark executors for a spark application.
+- Support for killing the spark driver for a spark application.
+- Support for killing both the spark driver and executors for a spark application.
\ No newline at end of file
diff --git a/README.md b/README.md
index c168c79..2fb7423 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,345 @@
+
+# chaosToolkit-nimble (Guavus Chaos Test automation framework)
+
+- [ChaosToolkit Overview](#chaostoolkit-overview)
+- [Jio Use Cases Implemented](#jio-use-cases-implemented)
+- [Installation](#installation-of-chaostoolkit-nimble-on-the-local-system-mac)
+- [Resolving dependency issues on local system (MAC)](#resolving-dependency-issues-on-local-system-mac)
+
+
+## ChaosToolkit Overview
+* The Chaos Toolkit is an open source, third-party tool that enables you to run chaos engineering experiments seamlessly against applications and infrastructure components to assess resiliency and fault tolerance.
+
+* By using the Chaos Toolkit you learn how to change the system accordingly and make it more resilient at multiple levels: application, network and platform.
+
+* `Chaostoolkit-nimble` is built on top of `nimble` and `chaostoolkit-kubernetes`, a Kubernetes-specific chaos extension.
+
+### The Various Sections of an Experiment
+##### `Controls`
+```
+* Here you declare the control module, which is simply a set of functions that are called by the Chaos Toolkit when executing the experiment.
+
+* The controls are applied per experiment.
+https://docs.chaostoolkit.org/reference/extending/create-control-extension/
+```
+##### `The steady state hypothesis`
+```
+* The steady state hypothesis declares the various probes that will be applied as part of the hypothesis check.
+
+* The hypothesis is played twice: the first time before we do anything else, to ensure the system is indeed in a normal state;
+the second time after the conditions of the system have been changed, to validate it is still in a normal state.
+
+* Hypothesis probes expect a tolerance property which tells the Chaos Toolkit how to validate a certain aspect of the state.
+```
+
+##### `Method`
+```
+* The method is the anomaly injection block which changes the conditions of our system/application.
+
+* This section is executed only if the hypothesis (above) is successfully met; otherwise it is skipped.
+```
+##### `Rollbacks`
+```
+* Finally, the rollback section (which is optional) tries to remediate the changes made to the system during the anomaly injection.
+
+* This block is always executed, irrespective of whether the hypothesis was met the first time.
+```
+
+#### `Different Types of Activities`
+###### `Probe`
+```
+* A probe is a way of detecting a particular set of conditions in the system that is undergoing experimentation.
+* The hypothesis uses probes only; the method may or may not use them.
+```
+
+###### `Action`
+```
+* An action is a particular activity that needs to be enacted on the system under experimentation.
+* Rollbacks are made of actions only; the method may also use actions.
+``` + +#### `Sample Experiment json file` +``` +{ + "version": "1.0.0", + "title": "What is the impact of an expired certificate on our application chain?", + "description": "If a certificate expires, we should gracefully deal with the issue.", + "tags": ["tls"], + "controls": [{ + "name": "spark-related-controls", + "provider": { + "type": "python", + "module": "chaostoolkit_nimble.controllers.spark.control" + } + }], + "steady-state-hypothesis": { + "title": "Application responds", + "probes": [{ + "type": "probe", + "name": "the-astre-service-must-be-running", + "tolerance": true, + "provider": { + "type": "python", + "module": "os.path", + "func": "exists", + "arguments": { + "path": "astre.pid" + } + } + }, + { + "type": "probe", + "name": "the-sunset-service-must-be-running", + "tolerance": true, + "provider": { + "type": "python", + "module": "os.path", + "func": "exists", + "arguments": { + "path": "sunset.pid" + } + } + }, + { + "type": "probe", + "name": "we-can-request-sunset", + "tolerance": 200, + "provider": { + "type": "http", + "timeout": 3, + "verify_tls": false, + "url": "https://localhost:8443/city/Paris" + } + } + ] + }, + "method": [{ + "type": "action", + "name": "swap-to-expired-cert", + "provider": { + "type": "process", + "path": "cp", + "arguments": "expired-cert.pem cert.pem" + } + }, + { + "type": "probe", + "name": "read-tls-cert-expiry-date", + "provider": { + "type": "process", + "path": "openssl", + "arguments": "x509 -enddate -noout -in cert.pem" + } + }, + { + "type": "action", + "name": "restart-astre-service-to-pick-up-certificate", + "provider": { + "type": "process", + "path": "pkill", + "arguments": "--echo -HUP -F astre.pid" + } + }, + { + "type": "action", + "name": "restart-sunset-service-to-pick-up-certificate", + "provider": { + "type": "process", + "path": "pkill", + "arguments": "--echo -HUP -F sunset.pid" + }, + "pauses": { + "after": 1 + } + } + ], + "rollbacks": [{ + "type": "action", + "name": "swap-to-vald-cert", + "provider": { + "type": "process", + "path": "cp", + "arguments": "valid-cert.pem cert.pem" + } + }, + { + "ref": "restart-astre-service-to-pick-up-certificate" + }, + { + "ref": "restart-sunset-service-to-pick-up-certificate" + } + ] +} +``` + + +## Jio Use Cases Implemented +Job Name : Media Plane + +Job frequency : 15min + +Number of job instances being run: 1 + +Assumption : Job is already running on the cluster. + +** NOTE: No custom code required here by the user. These three use cases (i.e experiments) have been templatized and these templates have been stored on fileserver at location: +`http://192.168.192.201/guavus/automation/chaos/exp_templates/spark/` +You need to provide this template path as an input to run your chaos experiments. + + +### Use case 1: Kill n number of spark executors for a spark job running on yarn and validate data for that job instance. +``` +Chaos Experiment Template path (exp_template_file) = "automation/chaos/exp_templates/spark/executor_kill_exp.json" + +------Before experiment control: +Read the user given testbed and initialize nimble `node_obj` object. + +------Hypothesis section: +Check job is running on yarn + +------Method section (Anomaly injection): +Kill spark job any active executors for the last spark driver attempt. + +------After experiment control: +Wait for the job to complete on yarn and then fetch the job total execution time from yarn. 
(Time fetched: 1.33 minutes) + + +User inputs required: +* Testbed config yaml +* Validation config yaml +* Chaos Experiment Template path: +* Num of executors to kill. Default is 1. + +Pytest command: +python -m pytest -k "test_chaos_on_executor_kill or test_data_validation" --testbed=chaostoolkit_nimble/resources/testbeds/open_nebula_135_35.yml --componentAttributesConfig=chaostoolkit_nimble/resources/components/component_attributes_kerberos.yml --validationConfig=chaostoolkit_nimble/resources/validation/sample_validation_config.yml chaostoolkit_nimble/tests/sample/test_jio_spark_job.py +``` + +### Use case 2: Kill the spark driver for a spark job running on yarn and validate data for that job instance. +``` +Chaos Experiment Template path (exp_template_file) = "automation/chaos/exp_templates/spark/driver_kill_exp.json" + +------Before experiment control: +Read the user given testbed and initialize nimble `node_obj` object. + +------Hypothesis section: +Check job is running on yarn + +------Method section (Anomaly injection): +Kill the spark driver for this spark job. + +------After experiment control: +Wait for the job to complete on yarn and then fetch the job total execution time from yarn. (Time fetched: 1.74 minutes) + + +User inputs required: +* Testbed config yaml +* Validation config yaml +* Chaos Experiment Template path: + +Pytest command: +python -m pytest -k "not(test_chaos_on_executor_kill or test_chaos_on_driver_and_executor_kill)" --testbed=chaostoolkit_nimble/resources/testbeds/open_nebula_135_35.yml --componentAttributesConfig=chaostoolkit_nimble/resources/components/component_attributes_kerberos.yml --validationConfig=chaostoolkit_nimble/resources/validation/sample_validation_config.yml chaostoolkit_nimble/tests/sample/test_jio_spark_job.py + +``` + +### Use case 3: Kill the driver and n number of executors for a spark job running on yarn and validate data for that job instance. +``` +Chaos Experiment Template path (exp_template_file) = "automation/chaos/exp_templates/spark/driver_and_executor_kill_exp.json" + +------Before experiment control: +Read the user given testbed and initialize nimble `node_obj` object. + +------Hypothesis section: +Check job is running on yarn + +------Method section (Anomaly injection): +Kill the spark driver for this spark job and then kill any active executors for the new spark attempt. + +------After experiment control: +Wait for the job to complete on yarn and then fetch the job total execution time from yarn. (Time fetched 1.76 minutes: ) + + +User inputs required: +* Testbed config yaml +* Validation config yaml +* Chaos Experiment Template path: +* Num of executors to kill. Default is 1. + +Pytest command: +python -m pytest -k "test_chaos_on_driver_and_executor_kill or test_data_validation" --testbed=chaostoolkit_nimble/resources/testbeds/open_nebula_135_35.yml --componentAttributesConfig=chaostoolkit_nimble/resources/components/component_attributes_kerberos.yml --validationConfig=chaostoolkit_nimble/resources/validation/sample_validation_config.yml chaostoolkit_nimble/tests/sample/test_jio_spark_job.py + +``` + +## Installation of chaostoolkit-nimble on the local system (MAC) + +`Assumptions` : + +* Python 3 is already installed on the system. + +* Automation code from your own solution repo is already checked out on the system. + +##### 1. 
Install chaostoolkit-nimble package in a virtualenv
+```
+1.1 cd ../ ; mkdir chaos_virenv ; cd chaos_virenv
+1.2 virtualenv --python=python3 venv
+1.3 source venv/bin/activate
+1.4 Remove nimble and add chaostoolkit-nimble to your requirements.txt
+1.5 Install chaostoolkit-nimble in the virtualenv using the command:
+1.6 pip install -r requirements.txt --extra-index-url http://192.168.192.201:5050/simple/ --trusted-host 192.168.192.201
+```
+
+##### 2. Add this virtualenv in PyCharm
+```
+PyCharm --> Preferences --> Project interpreter --> settings --> show all --> add the chaos_virenv
+```
+
+##### 3. Post installation changes
+
+3.1 Make sure the testbed file name follows the nomenclature `open_nebula_*`. If not, rename it accordingly.
+3.2 Add the chaos test case in the corresponding job's test file.
+3.3 Update conftest.py with the code below.
+
+```
+parser.addoption("--experimentsPath",
+                 help="Relative path (to the project root) of the file containing chaos experiment json files. E.g. python -m pytest --experimentsPath=<path_to_experiment_json>")
+```
+
+```
+# imports required by the fixture (module paths as used elsewhere in chaostoolkit-nimble)
+import pytest
+
+from nimble.core import global_constants
+from nimble.core.entity.node_manager import NodeManager
+from nimble.core.utils.shell_utils import ShellUtils
+
+
+@pytest.fixture(scope="session", autouse=True)
+def initialize_node_obj(request):
+    testbed_file = request.config.getoption("--testbed")
+    component_arttributes_file = request.config.getoption("--componentAttributesConfig")
+    if not component_arttributes_file:
+        component_arttributes_file = "nimble/resources/components/component_attributes.yml"
+    setup_files_base_path = "%s/setup" % global_constants.DEFAULT_LOCAL_TMP_PATH
+    if testbed_file:
+        NodeManager.initialize(testbed_file, component_arttributes_file)
+        ShellUtils.execute_shell_command(
+            ShellUtils.remove_and_create_directory(setup_files_base_path))
+        testbed_file_tmp_path = "%s/%s" % (setup_files_base_path, testbed_file.rsplit("/", 1)[1])
+        component_arttributes_file_tmp_path = "%s/%s" % (
+            setup_files_base_path, component_arttributes_file.rsplit("/", 1)[1])
+        ShellUtils.execute_shell_command(ShellUtils.copy(testbed_file, testbed_file_tmp_path))
+        ShellUtils.execute_shell_command(
+            ShellUtils.copy(component_arttributes_file, component_arttributes_file_tmp_path))
+    yield
+    ShellUtils.execute_shell_command(ShellUtils.remove(setup_files_base_path, recursive=True))
+```
+
+## Resolving dependency issues on local system (MAC)
+* `Install python 3 using the command below`
+```
+brew install python3
+```
+* `Chaos html report generation issue`
+```
+pip install cairocffi --- already satisfied
+brew uninstall py2cairo --- this will not complete properly, but one of its dependencies, "cairo", will get installed successfully
+export PKG_CONFIG_PATH="/usr/local/opt/libffi/lib/pkgconfig"
+pip install pycairo
+brew install pandoc
+```
+
 # Chaos Toolkit Kubernetes Support
 [![Build Status](https://travis-ci.org/chaostoolkit/chaostoolkit-kubernetes.svg?branch=master)](https://travis-ci.org/chaostoolkit/chaostoolkit-kubernetes)
@@ -153,3 +495,5 @@ into the master branch of the repository.
 Please, make sure you can abide by the rules of the DCO before submitting a PR.
[dco]: https://github.com/probot/dco#how-it-works + + diff --git a/chaosk8s/node/__init__.py b/chaostoolkit_nimble/__init__.py similarity index 100% rename from chaosk8s/node/__init__.py rename to chaostoolkit_nimble/__init__.py diff --git a/chaosk8s/pod/__init__.py b/chaostoolkit_nimble/actions/__init__.py similarity index 100% rename from chaosk8s/pod/__init__.py rename to chaostoolkit_nimble/actions/__init__.py diff --git a/chaostoolkit_nimble/actions/base/__init__.py b/chaostoolkit_nimble/actions/base/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaostoolkit_nimble/actions/base/flows/__init__.py b/chaostoolkit_nimble/actions/base/flows/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaostoolkit_nimble/actions/base/flows/chaos_user_actions.py b/chaostoolkit_nimble/actions/base/flows/chaos_user_actions.py new file mode 100644 index 0000000..d719b34 --- /dev/null +++ b/chaostoolkit_nimble/actions/base/flows/chaos_user_actions.py @@ -0,0 +1,53 @@ +import logging +import re + +import allure +import jinja2 + +from nimble.core import global_constants +from nimble.core.utils.file_server_utils import FileServerUtils +from nimble.core.utils.shell_utils import ShellUtils + +_LOGGER = logging.getLogger(__name__) + +EXPERIMENTS_BASE_PATH = "%s/tmp/experiments/" % global_constants.DEFAULT_LOCAL_ARTIFACTS_PATH +ShellUtils.execute_shell_command(ShellUtils.remove_and_create_directory(EXPERIMENTS_BASE_PATH)) + + +def run_experiment(exp_file=None, exp_template_file=None, context=None): + status = None + journal_path = "%s/journal.json" % global_constants.DEFAULT_LOCAL_ARTIFACTS_PATH + if exp_file: + ShellUtils.execute_shell_command(ShellUtils.copy(exp_file, EXPERIMENTS_BASE_PATH)) + else: + render_template(exp_template_file, context) + experiment_file_response = ShellUtils.execute_shell_command( + ShellUtils.find_files_in_directory(EXPERIMENTS_BASE_PATH)) + for experiment_file in experiment_file_response.stdout.strip().split("\n"): + response = ShellUtils.execute_shell_command("chaos run --journal-path %s %s" % (journal_path, experiment_file)) + status = re.search(r'.*Experiment\sended\swith\sstatus:\s(.*)', response.stderr).group(1) + html_report_path = generate_html(journal_path) + allure.attach.file(html_report_path, name='Chaos experiment html report', + attachment_type=allure.attachment_type.HTML) + assert status == "completed" + + +def render_template(exp_template_file, context): + file_server_utils = FileServerUtils() + exp_file_name = exp_template_file.rsplit("/", 1)[1] + template_base_dir = "%s/tmp/exp_templates/" % global_constants.DEFAULT_LOCAL_ARTIFACTS_PATH + ShellUtils.execute_shell_command(ShellUtils.remove_and_create_directory(template_base_dir)) + file_server_utils.download(exp_template_file, path_to_download=template_base_dir) + templateLoader = jinja2.FileSystemLoader(searchpath=template_base_dir) + templateEnv = jinja2.Environment(loader=templateLoader) + template = templateEnv.get_template(exp_file_name) + _LOGGER.info('Rendering from template: %s' % template.name) + template.stream(context).dump('%s/%s' % (EXPERIMENTS_BASE_PATH, exp_file_name)) + + +def generate_html(journal_path): + html_report_path = "%s/chaos_report.html" % global_constants.DEFAULT_LOCAL_ARTIFACTS_PATH + command = "export LC_ALL=en_US.UTF-8 && chaos report --export-format=html5 %s %s" % ( + journal_path, html_report_path) + ShellUtils.execute_shell_command(command) + return html_report_path diff --git a/chaostoolkit_nimble/actions/jio/__init__.py 
b/chaostoolkit_nimble/actions/jio/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaostoolkit_nimble/actions/jio/common_actions.py b/chaostoolkit_nimble/actions/jio/common_actions.py new file mode 100644 index 0000000..5a53181 --- /dev/null +++ b/chaostoolkit_nimble/actions/jio/common_actions.py @@ -0,0 +1,100 @@ +import logging + +from nimble.core.entity.components import Components +from nimble.core.entity.node_manager import NodeManager +from nimble.core.utils.components.hadoop_utils import HadoopUtils +from nimble.core.utils.shell_utils import ShellUtils +from nimble.core.utils.date_utils import DateUtils, Timezone + + +class CommonActions(object): + + def __init__(self): + self._logger = logging.getLogger(__name__) + self.hadoop_utils = HadoopUtils + self.date_utils = DateUtils(Timezone.UTC.value) + self.date_format = "%Y-%m-%d %H:%M:%S" + self.bin_interval = 900 + + def hdfs_keytab(self): + hadoop_utils = HadoopUtils(NodeManager.node_obj) + mgt_alias = hadoop_utils.master_namenode + hdfs_keytab = NodeManager.node_obj.execute_command_on_node(mgt_alias,"klist -tke /etc/security/keytabs/hdfs.headless.keytab |grep '@' |awk -F' ' '{print $4}'|head -1") + kinit_command = "kinit -kt /etc/security/keytabs/hdfs.headless.keytab %s" % hdfs_keytab.stdout + NodeManager.node_obj.execute_command_on_component(Components.MASTER_NAMENODE.name,ShellUtils.su("hdfs", kinit_command)) + + def computeBinTime(self, timestamp): + epoch_time = self.date_utils.convert_human_readable_to_epoch(timestamp, self.date_format) + bin_epoch = self.date_utils.round_down_epoch(epoch_time, self.bin_interval) + bin_timestamp = self.date_utils.convert_epoch_to_human_readable(bin_epoch, self.date_format) + return bin_timestamp + + def calc_end_time(self, start_time, duration): + epoch_start_time = self.date_utils.convert_human_readable_to_epoch(start_time[:-4], self.date_format) + epoch_end_time = str(format((float(epoch_start_time) + float(duration)), '.6f')) + end_time = self.date_utils.convert_epoch_to_human_readable(float(epoch_end_time), self.date_format) + return end_time + + def date_format_changer(self, input_date, input_date_format, output_date_format): + epoch_input = self.date_utils.convert_human_readable_to_epoch(input_date,input_date_format) + return self.date_utils.convert_epoch_to_human_readable(epoch_input, output_date_format) + + def get_time_range_list(self, min_time, max_time, frequency, date_time_format): + tmp_list = [] + time_range_list = [] + min_time_epoch = self.date_utils.convert_human_readable_to_epoch(min_time, date_time_format) + max_time_epoch = self.date_utils.convert_human_readable_to_epoch(max_time, date_time_format) + min_round_down = self.date_utils.round_down_epoch(min_time_epoch, frequency) + max_round_down = self.date_utils.round_up_epoch(max_time_epoch, frequency) + if min_round_down == max_round_down: + max_round_up = self.date_utils.round_up_epoch(max_time_epoch, frequency) + time_range_list = [(self.date_utils.convert_epoch_to_human_readable(min_round_down, date_time_format), + self.date_utils.convert_epoch_to_human_readable(max_round_up, date_time_format))] + else: + while min_round_down <= max_round_down: + tmp_list.append(min_round_down) + min_round_down += frequency + for index in range(0, (len(tmp_list) - 1)): + time_range_list.append(( + self.date_utils.convert_epoch_to_human_readable(tmp_list[index], date_time_format), + self.date_utils.convert_epoch_to_human_readable(tmp_list[index + 1], + date_time_format))) + return time_range_list + + def 
get_time_colums(self, frequency): + if frequency < 60: + return "year,month,day,hour,minute,seconds" + elif frequency >= 60 and frequency < 3600: + return "year,month,day,hour,minute" + elif frequency <= 3600 and frequency < 86400: + return "year,month,day,hour" + elif frequency <= 86400 and frequency < 2592000: + return "year,month,day" + elif frequency <= 2592000 and frequency < 31536000: + return "year,month" + elif frequency <= 31536000: + return "year" + else: + self._logger.error("Incorrect Frequency Provided : %s" % frequency) + + def time_column_needful(self, frequency, timestamp_in_ns): + year = int(timestamp_in_ns[:4]) + month = int(timestamp_in_ns[4:6]) + day = int(timestamp_in_ns[6:8]) + + time_columns = self.get_time_colums(frequency) + + if len(time_columns.split(',')) == 3: + return [year, month, day] + if len(time_columns.split(',')) == 4: + hour = int(timestamp_in_ns[8:10]) + return [year, month, day, hour] + if len(time_columns.split(',')) == 5: + hour = int(timestamp_in_ns[8:10]) + minute = int(timestamp_in_ns[10:12]) + return [year, month, day, hour, minute] + if len(time_columns.split(',')) == 6: + hour = int(timestamp_in_ns[8:10]) + minute = int(timestamp_in_ns[10:12]) + second = int(timestamp_in_ns[12:14]) + return [year, month, day, hour, minute, second] \ No newline at end of file diff --git a/chaostoolkit_nimble/actions/jio/media_plane_actions.py b/chaostoolkit_nimble/actions/jio/media_plane_actions.py new file mode 100644 index 0000000..c253f0e --- /dev/null +++ b/chaostoolkit_nimble/actions/jio/media_plane_actions.py @@ -0,0 +1,327 @@ +import collections +import logging +import mmh3 +import re + +from chaostoolkit_nimble.actions.jio.common_actions import CommonActions +from nimble.actions.base.regression.config_actions import ConfigActions +from nimble.core.entity.components import Components +from nimble.core.entity.node_manager import NodeManager +from nimble.core.utils.shell_utils import ShellUtils + + +class MediaPlaneActions(object): + + def __init__(self, job_alias, config_parser, job_user="ambari-qa", component=Components.MANAGEMENT.name): + + self._logger = logging.getLogger(__name__) + self.job_alias = job_alias + self.comman_action = CommonActions() + self.seed = 42 + self.date_format_all = '%Y%m%d%H%M%S' + self.date_format_usual = "%Y-%m-%d %H:%M:%S" + self.frequency = 900 + self.config_actions = ConfigActions() + self.job_user = job_user + actual_output_configs = config_parser.get_job_actual_output_source_configs(self.job_alias, "output1") + self.database_name = actual_output_configs["db_name"] + self.table_name = actual_output_configs["table_name"] + self.node_alias = NodeManager.node_obj.get_node_aliases_by_component(component)[0] + self.job_base_directory = "/data/jio_copy/microapp1/" + self.job_stdout_file = "a.out" + + def schedule_15_min_job(self): + job_config_file = "%s/conf/MediaPlaneJob.json" % self.job_base_directory + job_script_file = "%s/scripts/media_plane_microapp1.sh" % self.job_base_directory + + ######## Update job config file + kwargs = collections.OrderedDict() + kwargs["config_separator"] = "=" + kwargs["location"] = "local" + kwargs["base_path"] = job_config_file + kwargs["file_format"] = "json" + kwargs["components"] = [Components.MANAGEMENT.name] + kwargs["properties"] = {"mediaPlaneRawInput.type": "csv", "mediaPlaneRawInput.header": "true", + "mediaPlaneRawInput.pathPrefix": "/tmp/partition_date=", + "mediaPlaneProcessedOutput.tableName": "%s.%s" % (self.database_name, self.table_name)} + 
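+        # Write the property overrides built above into conf/MediaPlaneJob.json
+        # (json format, MANAGEMENT component) via nimble's ConfigActions.update_configs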
self.config_actions.update_configs(**kwargs) + + ######## Update job script file + NodeManager.node_obj.execute_command_on_node(self.node_alias, + ShellUtils.find_and_replace_whole_line_in_file("basedirectory=", + "basedirectory=/data/jio_copy/microapp1/", + job_script_file)) + NodeManager.node_obj.execute_command_on_node(self.node_alias, + ShellUtils.find_and_replace_whole_line_in_file("lastdayepoch=", + """lastdayepoch=`date -d "2019-07-20 05:30:00" +%s`""", + job_script_file)) + NodeManager.node_obj.execute_command_on_node(self.node_alias, ShellUtils.find_and_replace_in_file( + "--timeIncrementInFilesInMin=15", "--timeIncrementInFilesInMin=15", job_script_file)) + NodeManager.node_obj.execute_command_on_node(self.node_alias, ShellUtils.find_and_replace_in_file( + "--durationOfDataToProcessInMin=15", "--durationOfDataToProcessInMin=15", job_script_file)) + ############# Run job on management node + job_run_command = "export SPARK_HOME=/usr/hdp/2.6.5.0-292/spark2 && cd %s && nohup scripts/media_plane_microapp1.sh >> %s 2>>%s &" % ( + self.job_base_directory, self.job_stdout_file, self.job_stdout_file) + NodeManager.node_obj.execute_remote_command_in_bg(self.node_alias, + ShellUtils.su(self.job_user, job_run_command)) + return NodeManager.node_obj.execute_command_on_node(self.node_alias, "sleep 5s ; %s" % + ShellUtils.fetch_process_id(self.job_alias)).stdout != "" + + def none_or_value(self, value): + if str(value) == "": + return None + else: + return value + + def generate_cell_id(self, mcc, mnc, cell_id): + if mcc == '' or mnc == '' or cell_id == '': + return '-1' + else: + if str(mnc) != 3: + final_mnc = '0' * (3 - len(str(mnc))) + str(mnc) + else: + final_mnc = mnc + + return str(mcc) + str(final_mnc) + '0' * (9 - len(str(cell_id))) + str(cell_id) + + def generate_hash(self, source_ip, source_port, dest_ip, dest_port): + hash_input_string = "%s|%s|%s|%s" % (source_ip, source_port, dest_ip, dest_port) + hash_value = mmh3.hash(hash_input_string, self.seed) + return hash_value + + def generate_binary_error_code(self, list_of_flags): + binary_string = "".join(list_of_flags) + binary_code = int(binary_string, 2) + return binary_code + + def link_specific_list(self, list): + default_list = [None, None, None, None, None, None, None, None, None, None, None, None, None, None] + if len(list) == 0: + return default_list + else: + if self.none_or_value(list[34]).startswith("2405:0203") or self.none_or_value(list[34]).startswith( + "2405:0204") or self.none_or_value(list[34]).startswith("2405:0205") or self.none_or_value( + list[34]).startswith("2409:4000"): + return [self.none_or_value(list[34]), self.none_or_value(list[35]), self.none_or_value(list[19]), + self.none_or_value(list[20]), self.none_or_value(list[45]), self.none_or_value(list[43]), + self.none_or_value(list[11]), self.none_or_value(list[12]), self.none_or_value(list[22]), + self.none_or_value(list[26]), self.none_or_value(list[27]), self.none_or_value(list[32]), + self.none_or_value(list[33]), self.none_or_value(list[36])] + # m_u_source_ip , m_u_source_port , m_u_destination_ip , m_u_destination_port , subscriber_id , subscriber_msisdn , media_active_time_seconds , media_completed_indicator , media_long_call_indicator , media_middle_gap_indicator , media_short_call_indicator , media_single_direction_indicator , media_start_gap_indicator + # checked if its uplink record + + else: + return [self.none_or_value(list[19]), self.none_or_value(list[20]), self.none_or_value(list[34]), + self.none_or_value(list[35]), 
self.none_or_value(list[45]), self.none_or_value(list[43]), + self.none_or_value(list[11]), self.none_or_value(list[12]), self.none_or_value(list[22]), + self.none_or_value(list[26]), self.none_or_value(list[27]), self.none_or_value(list[32]), + self.none_or_value(list[33]), self.none_or_value(list[36])] + # m_u_destination_ip , m_u_destination_port , m_u_source_ip , m_u_source_port, subscriber_id , subscriber_msisdn , media_active_time_seconds , media_completed_indicator , media_long_call_indicator , media_middle_gap_indicator , media_short_call_indicator , media_single_direction_indicator , media_start_gap_indicator + # checked if its down_link record + + def validate_media_plane(self, validation_entities): + hashed_row_dict = {} + min_time = \ + validation_entities.sqlite_adapter.select("select min(cal_timestamp_time) from %s_input1" % self.job_alias)[ + -1][ + 0] + + max_time = \ + validation_entities.sqlite_adapter.select("select max(cal_timestamp_time) from %s_input1" % self.job_alias)[ + -1][ + 0] + + time_range_list = self.comman_action.get_time_range_list(min_time, max_time, self.frequency, + date_time_format=self.date_format_usual) + media_header = ['m_hash_tuple', 'm_timestamp', 'm_u_source_ip', 'm_u_source_port', 'm_u_destination_ip', + 'm_u_destination_port', 'm_u_imsi', 'm_u_msisdn', 'm_u_call_duration', 'm_u_call_completed', + 'm_u_end_gap_indicator', 'm_u_long_call_indicator', 'm_u_middle_gap_indicator', + 'm_u_short_call_indicator', 'm_u_one_way_audio', 'm_u_start_gap_indicator', + 'm_u_weighted_jitter_total', 'm_u_weighted_mos_total', 'm_u_weighted_packet_loss_total', + 'm_u_weighted_rtd_total', 'm_u_jitter_sum', 'm_u_rtd_sum', 'm_u_packet_loss_sum', + 'm_u_degradation_sum', 'm_u_cell_id', 'm_d_destination_ip', 'm_d_destination_port', + 'm_d_source_ip', 'm_d_source_port', 'm_d_imsi', 'm_d_msisdn', 'm_d_call_duration', + 'm_d_call_completed', 'm_d_end_gap_indicator', 'm_d_long_call_indicator', + 'm_d_middle_gap_indicator', 'm_d_short_call_indicator', 'm_d_one_way_audio', + 'm_d_start_gap_indicator', 'm_d_weighted_jitter_total', 'm_d_weighted_mos_total', + 'm_d_weighted_packet_loss_total', 'm_d_weighted_rtd_total', 'm_d_jitter_sum', 'm_d_rtd_sum', + 'm_d_packet_loss_sum', 'm_d_degradation_sum', 'm_d_cell_id', 'm_error_code', 'm_ue_ip', + 'm_msisdn', 'm_imsi', 'm_cell_id', 'm_call_id', 'm_weighted_mos_sum', 'm_weighted_jitter_sum', + 'm_weighted_packet_loss_sum', 'm_weighted_rtd_sum', 'm_degradation_sum', 'm_jitter_sum', + 'm_packet_loss_sum', 'm_rtd_sum', 'm_mos', 'm_jitter', 'm_packet_loss', 'm_rtd', + 'm_binned_timestamp', 'sql_timestamp'] + final_media_dump = [media_header] + time_columns = self.comman_action.get_time_colums(frequency=self.frequency) + final_media_dump[0].extend(time_columns.split(',')) + for time_range in time_range_list: + where_clause = "cal_timestamp_time >= '%s' and cal_timestamp_time < '%s'" % (time_range[0], time_range[1]) + total_dump = validation_entities.sqlite_adapter.select( + "select * from %s_input1 where %s" % (self.job_alias, where_clause)) + temp_list = [] + hashed_row_dict = {} + for row in total_dump[1:]: + temp_dict = {} + row = list(row) + if row[19].startswith("2405:0203") or row[19].startswith("2405:0204") or row[19].startswith( + "2405:0205") or row[19].startswith("2409:4000"): + hash_value = self.generate_hash(row[19], row[20], row[34], row[35]) + elif row[34].startswith("2405:0203") or row[34].startswith("2405:0204") or row[34].startswith( + "2405:0205") or row[34].startswith("2409:4000"): + hash_value = 
self.generate_hash(row[34], row[35], row[19], row[20]) + else: + continue + if hash_value in hashed_row_dict.keys(): + if row[19].startswith("2405:0203") or row[19].startswith("2405:0204") or row[19].startswith( + "2405:0205") or row[19].startswith("2409:4000"): + hashed_row_dict[hash_value]["downlink"] = row + elif row[34].startswith("2405:0203") or row[34].startswith("2405:0204") or row[34].startswith( + "2405:0205") or row[34].startswith("2409:4000"): + hashed_row_dict[hash_value]["uplink"] = row + else: + continue + else: + if row[19].startswith("2405:0203") or row[19].startswith("2405:0204") or row[19].startswith( + "2405:0205") or row[19].startswith("2409:4000"): + temp_dict = {"downlink": row, "uplink": []} + elif row[34].startswith("2405:0203") or row[34].startswith("2405:0204") or row[34].startswith( + "2405:0205") or row[34].startswith("2409:4000"): + temp_dict = {"downlink": [], "uplink": row} + else: + continue + hashed_row_dict[hash_value] = temp_dict + + for hash_value in hashed_row_dict.keys(): + error_code_list = ['1', '1'] + media_uplink_row = hashed_row_dict[hash_value]["uplink"] + media_downlink_row = hashed_row_dict[hash_value]["downlink"] + media_u_row = self.link_specific_list(media_uplink_row) + media_d_row = self.link_specific_list(media_downlink_row) + if len(media_uplink_row) == 0: + m_timestamp = (re.sub('[^A-Za-z0-9]+', "", media_downlink_row[1])) + m_binned_timestamp = ( + re.sub('[^A-Za-z0-9]+', "", self.comman_action.computeBinTime(media_downlink_row[1]))) + sql_timestamp = self.comman_action.date_format_changer(m_binned_timestamp, "%Y%m%d%H%M%S", + "%Y-%m-%d %H:%M:%S") + m_ueip = media_downlink_row[19] + m_msisdn = media_uplink_row[43] + m_imsi = media_uplink_row[45] + error_code_list[0] = 0 + m_u_wei_mos = 0 + m_u_wei_jitter = 0 + m_u_wei_pakt_los = 0 + m_u_wei_rtd = 0 + m_u_degradation_sum = 0 + m_u_jitter_sum = 0 + m_u_pakt_los_sum = 0 + m_u_rtd_sum = 0 + m_u_cell_id = -1 + m_u_calculated_item = [str(m_u_wei_jitter), str(m_u_wei_mos), str(m_u_wei_pakt_los), + str(m_u_wei_rtd), str(m_u_jitter_sum), str(m_u_rtd_sum), + str(m_u_pakt_los_sum), str(m_u_degradation_sum), str(m_u_cell_id)] + m_cell_id = m_u_cell_id + + else: + m_timestamp = (re.sub('[^A-Za-z0-9]+', "", media_uplink_row[1])) + m_binned_timestamp = ( + re.sub('[^A-Za-z0-9]+', "", self.comman_action.computeBinTime(media_uplink_row[1]))) + sql_timestamp = self.comman_action.date_format_changer(m_binned_timestamp, "%Y%m%d%H%M%S", + "%Y-%m-%d %H:%M:%S") + m_ueip = media_uplink_row[34] + m_msisdn = media_uplink_row[43] + m_imsi = media_uplink_row[45] + m_u_wei_mos = int(media_uplink_row[38]) + m_u_wei_jitter = int(media_uplink_row[37]) + m_u_wei_pakt_los = int(media_uplink_row[39]) + m_u_wei_rtd = int(media_uplink_row[40]) + m_u_degradation_sum = int(media_uplink_row[13]) + int(media_uplink_row[14]) + int( + media_uplink_row[15]) + m_u_jitter_sum = int(media_uplink_row[23]) + int(media_uplink_row[24]) + int(media_uplink_row[25]) + m_u_pakt_los_sum = str( + int(media_uplink_row[28]) + int(media_uplink_row[29]) + int(media_uplink_row[30])) + m_u_rtd_sum = int(media_uplink_row[16]) + int(media_uplink_row[17]) + int(media_uplink_row[18]) + m_u_cell_id = self.generate_cell_id(media_uplink_row[6], media_uplink_row[7], media_uplink_row[3]) + m_u_calculated_item = [str(m_u_wei_jitter), str(m_u_wei_mos), str(m_u_wei_pakt_los), + str(m_u_wei_rtd), str(m_u_jitter_sum), str(m_u_rtd_sum), + str(m_u_pakt_los_sum), str(m_u_degradation_sum), str(m_u_cell_id)] + m_cell_id = m_u_cell_id + + if len(media_downlink_row) == 
0: + m_timestamp = (re.sub('[^A-Za-z0-9]+', "", media_uplink_row[1])) + m_binned_timestamp = ( + re.sub('[^A-Za-z0-9]+', "", self.comman_action.computeBinTime(media_uplink_row[1]))) + sql_timestamp = self.comman_action.date_format_changer(m_binned_timestamp, "%Y%m%d%H%M%S", + "%Y-%m-%d %H:%M:%S") + m_ueip = media_uplink_row[34] + m_msisdn = media_downlink_row[43] + m_imsi = media_downlink_row[45] + error_code_list[1] = 0 + m_d_wei_mos = 0 + m_d_wei_jitter = 0 + m_d_wei_pakt_los = 0 + m_d_wei_rtd = 0 + m_d_degradation_sum = 0 + m_d_jitter_sum = 0 + m_d_pakt_los_sum = 0 + m_d_rtd_sum = 0 + m_d_cell_id = -1 + m_d_calculated_item = [str(m_d_wei_jitter), str(m_d_wei_mos), str(m_d_wei_pakt_los), + str(m_d_wei_rtd), str(m_d_jitter_sum), str(m_d_rtd_sum), + str(m_d_pakt_los_sum), str(m_d_degradation_sum), str(m_d_cell_id)] + m_cell_id = m_d_cell_id + + else: + m_timestamp = (re.sub('[^A-Za-z0-9]+', "", media_downlink_row[1])) + m_binned_timestamp = ( + re.sub('[^A-Za-z0-9]+', "", self.comman_action.computeBinTime(media_downlink_row[1]))) + sql_timestamp = self.comman_action.date_format_changer(m_binned_timestamp, "%Y%m%d%H%M%S", + "%Y-%m-%d %H:%M:%S") + m_ueip = media_downlink_row[19] + m_msisdn = media_downlink_row[43] + m_imsi = media_downlink_row[45] + m_d_wei_mos = int(media_downlink_row[38]) + m_d_wei_jitter = int(media_downlink_row[37]) + m_d_wei_pakt_los = int(media_downlink_row[39]) + m_d_wei_rtd = int(media_downlink_row[40]) + m_d_degradation_sum = int(media_downlink_row[13]) + int(media_downlink_row[14]) + int( + media_downlink_row[15]) + m_d_jitter_sum = int(media_downlink_row[23]) + int(media_downlink_row[24]) + int( + media_downlink_row[25]) + m_d_pakt_los_sum = int(media_downlink_row[28]) + int(media_downlink_row[29]) + int( + media_downlink_row[30]) + m_d_rtd_sum = int(media_downlink_row[16]) + int(media_downlink_row[17]) + int( + media_downlink_row[18]) + m_d_cell_id = self.generate_cell_id(media_downlink_row[6], media_downlink_row[7], + media_downlink_row[3]) + m_d_calculated_item = [str(m_d_wei_jitter), str(m_d_wei_mos), str(m_d_wei_pakt_los), + str(m_d_wei_rtd), str(m_d_jitter_sum), str(m_d_rtd_sum), + str(m_d_pakt_los_sum), str(m_d_degradation_sum), str(m_d_cell_id)] + m_cell_id = m_d_cell_id + + m_hash_tuple = hash_value + m_error_code = self.generate_binary_error_code(error_code_list) + m_call_id = "m_" + str(m_ueip) + "_" + str(m_hash_tuple) + m_wei_mos = int(m_u_wei_mos) + int(m_d_wei_mos) + m_wei_jitter = int(m_u_wei_jitter) + int(m_d_wei_jitter) + m_wei_pkt_los = int(m_u_wei_pakt_los) + int(m_d_wei_pakt_los) + m_wei_rtd = int(m_u_wei_rtd) + int(m_d_wei_rtd) + m_degradation_sum = int(m_u_degradation_sum) + int(m_d_degradation_sum) + m_jitter_sum = int(m_u_jitter_sum) + int(m_d_jitter_sum) + m_pkt_loss_sum = int(m_u_pakt_los_sum) + int(m_d_pakt_los_sum) + m_rtd_sum = int(m_u_rtd_sum) + int(m_d_rtd_sum) + m_mos = float(m_wei_mos) / m_degradation_sum / 100 + m_jitter = float(m_wei_jitter) / m_jitter_sum + m_packet_loss = float(m_wei_pkt_los) / m_pkt_loss_sum / 100 + m_rtd = float(m_wei_rtd) / m_rtd_sum + + time_array = self.comman_action.time_column_needful(self.frequency, m_binned_timestamp) + + media_output_list = [str(m_hash_tuple), str( + m_timestamp)] + media_u_row + m_u_calculated_item + media_d_row + m_d_calculated_item + [ + str(m_error_code), str(m_ueip), str(m_msisdn), str(m_imsi), str(m_cell_id), + str(m_call_id), str(m_wei_mos), str(m_wei_jitter), str(m_wei_pkt_los), + str(m_wei_rtd), str(m_degradation_sum), str(m_jitter_sum), str(m_pkt_loss_sum), + 
str(m_rtd_sum), str(m_mos), str(m_jitter), str(m_packet_loss), str(m_rtd), + str(m_binned_timestamp), str(sql_timestamp)] + time_array + temp_list.append(media_output_list) + final_media_dump.extend(temp_list) + validation_entities.output_obj[self.job_alias]["output1"] = final_media_dump diff --git a/chaostoolkit_nimble/actions/sample/__init__.py b/chaostoolkit_nimble/actions/sample/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaostoolkit_nimble/actions/sample/sample_application_actions.py b/chaostoolkit_nimble/actions/sample/sample_application_actions.py new file mode 100644 index 0000000..589c512 --- /dev/null +++ b/chaostoolkit_nimble/actions/sample/sample_application_actions.py @@ -0,0 +1,10 @@ +from nimble.core.entity.node_manager import NodeManager +from nimble.core.utils.components.hadoop_utils import HadoopUtils + + +def launch_application(): + hadoop_utils = HadoopUtils() + master_namenode = hadoop_utils.master_namenode + command = "sleep 5m" + NodeManager.node_obj.execute_command_on_node(master_namenode, command) + a = 1 diff --git a/chaostoolkit_nimble/controllers/__init__.py b/chaostoolkit_nimble/controllers/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaostoolkit_nimble/controllers/base/__init__.py b/chaostoolkit_nimble/controllers/base/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaostoolkit_nimble/controllers/base/control.py b/chaostoolkit_nimble/controllers/base/control.py new file mode 100644 index 0000000..03acbca --- /dev/null +++ b/chaostoolkit_nimble/controllers/base/control.py @@ -0,0 +1,31 @@ +from chaoslib.types import Configuration, \ + Experiment, Secrets, Settings +from logzero import logger + +from nimble.core import global_constants +from nimble.core.entity.node_manager import NodeManager +from nimble.core.utils.shell_utils import ShellUtils + + +def configure_control(configuration: Configuration = None, + secrets: Secrets = None, settings: Settings = None, + experiment: Experiment = None): + """ + Configure the control's global state + + This is called once only per Chaos Toolkit's run and should be used to + initialize any state your control may require. + + The `settings` are only passed when the control is declared in the + settings file of the Chaos Toolkit. 
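+
+    In this package it locates the testbed (`open_nebula_*`) and component
+    attributes YAML files copied into the local setup directory by the conftest
+    fixture described in the README, and initializes nimble's `NodeManager`
+    from them.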
+ """ + setup_files_base_path = "%s/setup" % global_constants.DEFAULT_LOCAL_TMP_PATH + testbed_file = ShellUtils.execute_shell_command( + ShellUtils.find_files_in_directory(setup_files_base_path, file_name_regex="open_nebula_*")).stdout + component_attributes_file = ShellUtils.execute_shell_command( + ShellUtils.find_files_in_directory(setup_files_base_path, file_name_regex="component_*")).stdout + if testbed_file and component_attributes_file: + NodeManager.initialize(testbed_file, component_attributes_file) + logger.debug("NODE_OBJ VIP FROM BASE CONTROLLER----------------: %s" % NodeManager.node_obj.vip) + else: + raise Exception("Either testbed or component attributes yaml file not found in chaos!") diff --git a/chaostoolkit_nimble/controllers/chaosk8s/__init__.py b/chaostoolkit_nimble/controllers/chaosk8s/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaostoolkit_nimble/controllers/chaosk8s/control.py b/chaostoolkit_nimble/controllers/chaosk8s/control.py new file mode 100644 index 0000000..e5177f0 --- /dev/null +++ b/chaostoolkit_nimble/controllers/chaosk8s/control.py @@ -0,0 +1,40 @@ +import time +from typing import List + +from chaoslib.types import Configuration, \ + Experiment, Run, Secrets, Activity + + +def after_activity_control(context: Activity, state: Run, + configuration: Configuration = None, + secrets: Secrets = None, **kwargs): + """ + after-control of the activity's execution + + Called by the Chaos Toolkit before the activity is applied. The result of + the execution is passed as `state`. See + https://docs.chaostoolkit.org/reference/api/journal/#run for more + information. + """ + print("----------------STATE AFTER ACTIVITY: %s" %state) + +def after_method_control(context: Experiment, state: List[Run], + configuration: Configuration = None, + secrets: Secrets = None, **kwargs): + """ + after-control of the method's execution + + Called by the Chaos Toolkit after the activities of the method have been + applied. The `state` is the list of activity results. See + https://docs.chaostoolkit.org/reference/api/journal/#run for more + information. + """ + print("----------------STATE AFTER METHOD: %s" % state) + for run in state: + activity_obj = run["activity"] + activity_name = activity_obj["name"] + run_status = run["status"] + if "terminate_gracefully_pod_" in activity_name and run_status == "succeeded": + time.sleep(60) + elif "read_new_spawned_logs_for_pod" in activity_name and run_status == "succeeded": + print(run["output"].keys()) diff --git a/chaostoolkit_nimble/controllers/process/__init__.py b/chaostoolkit_nimble/controllers/process/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaostoolkit_nimble/controllers/process/control.py b/chaostoolkit_nimble/controllers/process/control.py new file mode 100644 index 0000000..b5d4754 --- /dev/null +++ b/chaostoolkit_nimble/controllers/process/control.py @@ -0,0 +1,48 @@ +from time import sleep + +from chaoslib.types import Configuration, \ + Experiment, Run, Secrets, Activity +from chaostoolkit_nimble.controllers.base import control +from logzero import logger + +control.configure_control() + + +def after_activity_control(context: Activity, state: Run, + configuration: Configuration = None, + secrets: Secrets = None, **kwargs): + """ + after-control of the activity's execution + + Called by the Chaos Toolkit before the activity is applied. The result of + the execution is passed as `state`. See + https://docs.chaostoolkit.org/reference/api/journal/#run for more + information. 
+ """ + logger.debug("----------------STATE AFTER ACTIVITY: %s" % state) + + +def before_method_control(context: Experiment, + configuration: Configuration = None, + secrets: Secrets = None, **kwargs): + """ + before-control of the method's execution + + Called by the Chaos Toolkit before the activities of the method are + applied. + """ + logger.debug("----------------CONFIGURATION BEFORE METHOD: %s" % configuration) + sleep(3) + + +def after_method_control(context: Experiment, + configuration: Configuration = None, + secrets: Secrets = None, **kwargs): + """ + before-control of the method's execution + + Called by the Chaos Toolkit before the activities of the method are + applied. + """ + logger.debug("----------------CONFIGURATION AFTER METHOD: %s" % configuration) + sleep(120) diff --git a/chaostoolkit_nimble/controllers/spark/__init__.py b/chaostoolkit_nimble/controllers/spark/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaostoolkit_nimble/controllers/spark/control.py b/chaostoolkit_nimble/controllers/spark/control.py new file mode 100644 index 0000000..5e3b043 --- /dev/null +++ b/chaostoolkit_nimble/controllers/spark/control.py @@ -0,0 +1,80 @@ +from chaoslib.types import Experiment, Configuration, Secrets, Activity, Run, Journal +from logzero import logger +from retrying import RetryError + +from chaostoolkit_nimble.controllers.base import control +from nimble.core.utils.components.hadoop_utils import HadoopRestClientUtils +from nimble.core.utils.date_utils import DateUtils, Timezone + +control.configure_control() +APPLICATION_ID = None + + +def after_activity_control(context: Activity, state: Run, + configuration: Configuration = None, + secrets: Secrets = None, **kwargs): + """ + after-control of the activity's execution + + Called by the Chaos Toolkit before the activity is applied. The result of + the execution is passed as `state`. See + https://docs.chaostoolkit.org/reference/api/journal/#run for more + information. + """ + logger.debug("----------------STATE AFTER ACTIVITY: %s" % state) + + +def before_method_control(context: Experiment, + configuration: Configuration = None, + secrets: Secrets = None, **kwargs): + """ + before-control of the method's execution + + Called by the Chaos Toolkit before the activities of the method are + applied. + """ + logger.debug("----------------CONFIGURATION BEFORE METHOD: %s" % configuration) + + +def after_method_control(context: Experiment, + configuration: Configuration = None, + secrets: Secrets = None, **kwargs): + """ + before-control of the method's execution + + Called by the Chaos Toolkit before the activities of the method are + applied. + """ + logger.debug("----------------CONFIGURATION AFTER METHOD: %s" % configuration) + + +def after_experiment_control(context: Experiment, state: Journal, + configuration: Configuration = None, + secrets: Secrets = None, **kwargs): + """ + after-control of the experiment's execution + + Called by the Chaos Toolkit after the experiment's completed. It passes the + journal of the execution. At that stage, the after control has no influence + over the execution however. Please see + https://docs.chaostoolkit.org/reference/api/journal/#journal-elements + for more information about the journal. 
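+
+    In this control it waits for the yarn application recorded in APPLICATION_ID
+    to finish and logs its total execution time, retrying once more on a RetryError.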
+ """ + date_utils = DateUtils(Timezone.UTC.value) + logger.debug("AFTER EXPERIMENT CONTROL: %s" % state) + hadoop_rest_client_utils = HadoopRestClientUtils() + try: + hadoop_rest_client_utils.wait_for_yarn_job_to_finish(APPLICATION_ID) + job_stats = hadoop_rest_client_utils.get_yarn_job_details(APPLICATION_ID) + logger.info("Total execution time for yarn job with application id %s: %s ms (i.e %s minutes) " % ( + APPLICATION_ID, job_stats["app"]["elapsedTime"], + date_utils.get_minutes_from_milliseconds(job_stats["app"]["elapsedTime"]))) + except RetryError: + try: + hadoop_rest_client_utils.wait_for_yarn_job_to_finish(APPLICATION_ID) + job_stats = hadoop_rest_client_utils.get_yarn_job_details(APPLICATION_ID) + logger.info("Total execution time for yarn job with application id %s: %s ms (i.e %s minutes) " % ( + APPLICATION_ID, job_stats["app"]["elapsedTime"], + date_utils.get_minutes_from_milliseconds(job_stats["app"]["elapsedTime"]))) + except RetryError: + logger.info("Yarn job with application id %s is not in 'FINISHED' state. Please check." % APPLICATION_ID) diff --git a/chaostoolkit_nimble/core/__init__.py b/chaostoolkit_nimble/core/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaostoolkit_nimble/core/configs/__init__.py b/chaostoolkit_nimble/core/configs/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaostoolkit_nimble/core/configs/chaos_exp_config_parser.py b/chaostoolkit_nimble/core/configs/chaos_exp_config_parser.py new file mode 100644 index 0000000..d55eef5 --- /dev/null +++ b/chaostoolkit_nimble/core/configs/chaos_exp_config_parser.py @@ -0,0 +1,150 @@ +import logging +import os + +from nimble.core import global_constants +from nimble.core.adapters.sqlite.sqlite_adapter import SqliteAdapter +from nimble.core.configs.base_yaml_parser import BaseYamlParser +from nimble.core.configs.scheduler_config_factory import SchedulerConfigFactory +from nimble.core.configs.source_config_factory import SourceConfigFactory +from nimble.core.entity.node_manager import NodeManager +from nimble.core.utils.string_utils import StringUtils + + +class ChaosExpConfigParser(BaseYamlParser): + """Methods to fetch the validation attributes from the validation YAML config.""" + + def __init__(self, config_path, node_obj=NodeManager.node_obj): + """ + + :param config_path: Path of the config file to be used. 
+ :type config_path: str + :type node_obj: :class:`nimble.core.entity.nodes.Nodes` + """ + self._logger = logging.getLogger(__name__) + super(ChaosExpConfigParser, self).__init__() + self.config_path = config_path + # self.config_obj = self.load_configs(config_path) + # self.project = self.get_defaults_from_config()["project"] + # self.build = self.get_defaults_from_config()["build"] + # self.customer = self.get_defaults_from_config()["customer"] + # self.stop_jobs_flag = self.get_defaults_from_config().get("stop_jobs", True) + # self.golden_build = self.get_attribute_or_default_or_pass(self.get_defaults_from_config(), "golden_build") + # self.mail_to = self.get_defaults_from_config()["mail_to"] + # self.output_file_name = "output.txt" + # self.ib_tmp_file = "ib_tmp_file.txt" + # self.input_tmp_file = "input_tmp_file.txt" + # self.project = self.get_defaults_from_config()["project"] + # self.build = self.get_defaults_from_config()["build"] + # self.customer = self.get_defaults_from_config()["customer"] + # self.base_http_path = "modules/%s/%s/%s" % (self.project, self.golden_build, self.customer) + # self.base_latest_http_path = "modules/%s/%s_latest/%s" % (self.project, self.build, self.customer) + # self.separator = ',' + # self.sqlite_file_path = "%s/validation_entities.db" % global_constants.DEFAULT_LOCAL_ARTIFACTS_PATH + # self.urlcat_delimiter = "^^" + # self.sqlite_adapter = SqliteAdapter(db_file=self.sqlite_file_path) + # node_obj.fetch_timestamp_from_server() + + # def get_job_schedule_configs(self, job_alias): + # return SchedulerConfigFactory.get(self.get_job_schedule_source(job_alias), self.config_obj).get_configs( + # self.get_job_schedule(job_alias)) + # + # def get_ibs(self): + # return self.get_attribute_or_default_or_pass(self.config_obj, "ibs") + # + # def get_ib_attributes(self, ib_alias): + # return self.get_attribute_or_pass(self.get_ibs(), ib_alias) + # + # def get_ib_source(self, ib_alias): + # return self.get_attribute_or_default_or_pass(self.get_ib_attributes(ib_alias), "source") + # + # def get_ib_source_configs(self, ib_alias): + # return SourceConfigFactory.get(self.get_ib_source(ib_alias), self.config_obj).get_configs( + # self.get_ib_attributes(ib_alias)) + # + # def get_input_source_configs(self, job_alias, input_alias): + # """Get the attributes for input source for a given job and input alias from the YAML config file. + # + # :param job_alias: Job alias for which the input source attributes are to be fetched. + # :type job_alias: str + # :param input_alias: Input alias for which the source attributes are to be fetched. + # :type input_alias: str + # :return: Ordered dictionary of input source attributes. + # :rtype: :class:`collections.OrderedDict` + # """ + # return SourceConfigFactory.get(self.get_input_source(job_alias, input_alias), self.config_obj).get_configs( + # self.get_input_attributes(job_alias, input_alias)) + # + # def get_job_actual_output_source_configs(self, job_alias, output_alias): + # """Get the attributes for actual output source for a given job and input alias from the YAML config file. + # + # :param job_alias: Job alias for which the actual output source attributes are to be fetched. + # :type job_alias: str + # :param output_alias: Output alias for which the actual output source attributes are to be fetched. + # :type output_alias: str + # :return: Ordered dictionary of actual output source attributes. 
+ # :rtype: :class:`collections.OrderedDict` + # """ + # source = self.get_job_actual_output_source(job_alias, output_alias) + # return SourceConfigFactory.get(source, self.config_obj).get_configs( + # self.get_job_actual_output_attributes(job_alias, output_alias)) + # + # def get_urlcat_command_attributes(self, job_alias, input_alias): + # """Get the urlcat command attributes for the given job and input alias from the YAML config file. + # + # :param job_alias: Job alias for which the urlcat command attributes are to be fetched. + # :type job_alias: str + # :param input_alias: Input alias for which the urlcat command attributes are to be fetched. + # :type input_alias: str + # :return: Ordered dictionary of `urlcat_command` attributes. + # :rtype: :class:`collections.OrderedDict` + # """ + # return self.get_attribute_or_default_or_pass(self.get_input_attributes(job_alias, input_alias), + # "urlcat_command") + # + # def get_urlcat_command_input_select_query(self, job_alias, input_alias): + # """Get the input select query for the given job and input alias from the YAML config file. + # + # This is the query given to fetch data from sqlite which will form the input for URLCat input. The URLcat command + # will operate on the fields selected through this query and generate an output accordingly. + # + # :param job_alias: Job alias for which the input select query is to be fetched. + # :type job_alias: str + # :param input_alias: Input alias for which the input select query is to be fetched. + # :type input_alias: str + # :return: Input select query. + # :rtype: str + # """ + # return self.get_attribute_or_default_or_pass(self.get_urlcat_command_attributes(job_alias, input_alias), + # "input_select_query") + # + # def get_urlcat_regression_ibs(self): + # """Get the url paths for all four `iv` and `ibstore` ibs that are supplied as job input parameters on jenkins. + # + # :return: Return all ib's url paths or None for each ib in case ib parameters are not supplied with the job. + # :rtype: tuple + # """ + # try: + # iv_url_expected = os.environ["iv_url_expected"] + # ibstore_url_expected = os.environ["ibstore_url_expected"] + # iv_url_actual = os.environ["iv_url_actual"] + # ibstore_url_actual = os.environ["ibstore_url_actual"] + # return (iv_url_expected, ibstore_url_expected, iv_url_actual, + # ibstore_url_actual) + # except KeyError: + # return None, None, None, None + # + # def get_urlcat_ib_versions(self): + # """Get the version number for expected as well as actual ibs. + # + # :return: Return both the version numbers or None for each set of ibs in case ib parameters are not supplied + # with the job. 
+ # :rtype: tuple + # """ + # try: + # ib_version_expected = \ + # StringUtils.none_safe_string(self.get_urlcat_regression_ibs()[0]).replace("//", "/").split("/")[4] + # ib_version_actual = \ + # StringUtils.none_safe_string(self.get_urlcat_regression_ibs()[2]).replace("//", "/").split("/")[4] + # return (ib_version_expected, ib_version_actual) + # except IndexError: + # return None, None diff --git a/chaostoolkit_nimble/core/exceptions/__init__.py b/chaostoolkit_nimble/core/exceptions/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaostoolkit_nimble/core/exceptions/custom_exceptions.py b/chaostoolkit_nimble/core/exceptions/custom_exceptions.py new file mode 100644 index 0000000..2fca835 --- /dev/null +++ b/chaostoolkit_nimble/core/exceptions/custom_exceptions.py @@ -0,0 +1,2 @@ +class ChaosActionFailedError(Exception): + pass diff --git a/chaostoolkit_nimble/core/extensions/__init__.py b/chaostoolkit_nimble/core/extensions/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaosk8s/__init__.py b/chaostoolkit_nimble/core/extensions/chaosk8s/__init__.py similarity index 100% rename from chaosk8s/__init__.py rename to chaostoolkit_nimble/core/extensions/chaosk8s/__init__.py diff --git a/chaosk8s/actions.py b/chaostoolkit_nimble/core/extensions/chaosk8s/actions.py similarity index 98% rename from chaosk8s/actions.py rename to chaostoolkit_nimble/core/extensions/chaosk8s/actions.py index 686fd65..c402ca3 100644 --- a/chaosk8s/actions.py +++ b/chaostoolkit_nimble/core/extensions/chaosk8s/actions.py @@ -9,7 +9,7 @@ from kubernetes.client.rest import ApiException from logzero import logger -from chaosk8s import create_k8s_api_client +from chaostoolkit_nimble.core.extensions import create_k8s_api_client __all__ = ["start_microservice", "kill_microservice", "scale_microservice", "remove_service_endpoint"] diff --git a/chaostoolkit_nimble/core/extensions/chaosk8s/node/__init__.py b/chaostoolkit_nimble/core/extensions/chaosk8s/node/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaosk8s/node/actions.py b/chaostoolkit_nimble/core/extensions/chaosk8s/node/actions.py similarity index 99% rename from chaosk8s/node/actions.py rename to chaostoolkit_nimble/core/extensions/chaosk8s/node/actions.py index e92a66f..b8cd918 100644 --- a/chaosk8s/node/actions.py +++ b/chaostoolkit_nimble/core/extensions/chaosk8s/node/actions.py @@ -13,7 +13,7 @@ from kubernetes.client.rest import ApiException from logzero import logger -from chaosk8s import create_k8s_api_client +from chaostoolkit_nimble.core.extensions import create_k8s_api_client __all__ = ["create_node", "delete_nodes", "cordon_node", "drain_nodes", "uncordon_node"] diff --git a/chaosk8s/node/probes.py b/chaostoolkit_nimble/core/extensions/chaosk8s/node/probes.py similarity index 90% rename from chaosk8s/node/probes.py rename to chaostoolkit_nimble/core/extensions/chaosk8s/node/probes.py index ffc0984..7f2aeef 100644 --- a/chaosk8s/node/probes.py +++ b/chaostoolkit_nimble/core/extensions/chaosk8s/node/probes.py @@ -4,7 +4,7 @@ from chaoslib.types import Configuration, Secrets from kubernetes import client -from chaosk8s import create_k8s_api_client +from chaostoolkit_nimble.core.extensions import create_k8s_api_client __all__ = ["get_nodes"] diff --git a/chaostoolkit_nimble/core/extensions/chaosk8s/pod/__init__.py b/chaostoolkit_nimble/core/extensions/chaosk8s/pod/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaosk8s/pod/actions.py 
b/chaostoolkit_nimble/core/extensions/chaosk8s/pod/actions.py similarity index 98% rename from chaosk8s/pod/actions.py rename to chaostoolkit_nimble/core/extensions/chaosk8s/pod/actions.py index 3f78e2f..d01b633 100644 --- a/chaosk8s/pod/actions.py +++ b/chaostoolkit_nimble/core/extensions/chaosk8s/pod/actions.py @@ -8,7 +8,7 @@ from kubernetes import client from logzero import logger -from chaosk8s import create_k8s_api_client +from chaostoolkit_nimble.core.extensions import create_k8s_api_client __all__ = ["terminate_pods"] diff --git a/chaosk8s/pod/probes.py b/chaostoolkit_nimble/core/extensions/chaosk8s/pod/probes.py similarity index 99% rename from chaosk8s/pod/probes.py rename to chaostoolkit_nimble/core/extensions/chaosk8s/pod/probes.py index 7e3e245..6b420fa 100644 --- a/chaosk8s/pod/probes.py +++ b/chaostoolkit_nimble/core/extensions/chaosk8s/pod/probes.py @@ -8,7 +8,7 @@ from kubernetes import client from logzero import logger -from chaosk8s import create_k8s_api_client +from chaostoolkit_nimble.core.extensions import create_k8s_api_client __all__ = [ "pods_in_phase", diff --git a/chaosk8s/probes.py b/chaostoolkit_nimble/core/extensions/chaosk8s/probes.py similarity index 98% rename from chaosk8s/probes.py rename to chaostoolkit_nimble/core/extensions/chaosk8s/probes.py index dda82f2..1aff508 100644 --- a/chaosk8s/probes.py +++ b/chaostoolkit_nimble/core/extensions/chaosk8s/probes.py @@ -7,8 +7,8 @@ from kubernetes import client, watch from logzero import logger -from chaosk8s import create_k8s_api_client -from chaosk8s.pod.probes import read_pod_logs +from chaostoolkit_nimble.core.extensions import create_k8s_api_client +from chaostoolkit_nimble.core.extensions.chaosk8s.pod import read_pod_logs __all__ = ["all_microservices_healthy", "microservice_available_and_healthy", "microservice_is_not_available", "service_endpoint_is_initialized", diff --git a/chaostoolkit_nimble/core/utils/__init__.py b/chaostoolkit_nimble/core/utils/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaostoolkit_nimble/core/utils/fabric_utils.py b/chaostoolkit_nimble/core/utils/fabric_utils.py new file mode 100644 index 0000000..9da0520 --- /dev/null +++ b/chaostoolkit_nimble/core/utils/fabric_utils.py @@ -0,0 +1,62 @@ +import logging + +# from nimble.core import global_constants +# from nimble.core.entity.guavus_response import GuavusResponse +# from nimble.core.utils.fabric_utils import FabricUtils +from fabric import operations +from fabric.context_managers import hide + +_LOGGER = logging.getLogger(__name__) + +b = 56 + +def run_command_on_remote(command, ip, username, password, command_timeout=None, + connection_timeout=120): + """Run a shell command on a remote server. + + :param command: Command that is to be executed on the shell of the remote server. + :type command: str + :param ip: Remote server ip on which command is to be fired. + :type ip: str + :param username: Username to be used for login on the remote server. + :type username: str + :param password: Password to be used for login on the remote server. + :type password: str + :param command_timeout: Time(in seconds) to wait for the given `command` to get executed, after which + the `CommandTimeout` exception will be raised. + :type command_timeout: int + :param connection_timeout: Time(in seconds) to wait for the connection to get established with the remote server, + after which the `ConnectionTimeout` exception will be raised. 
+    :type connection_timeout: int
+    :return: Return the result of the command being executed on the remote server.
+    :rtype: :class:`operations._AttributeString`
+    """
+    set_fabric_environment(ip, username, password, connection_timeout=connection_timeout)
+    with hide("output"):  # pylint: disable=not-context-manager
+        ############ return operations.run(command, timeout=command_timeout)
+        return operations.run(command, timeout=command_timeout).stdout
+
+def set_fabric_environment(ip, username, password, sudo_password=None,
+                           connection_timeout=120):
+    """Set the basic Fabric environment variables on which the other Fabric utilities operate.
+
+    :param ip: Remote server ip on which the action is to be performed.
+    :type ip: str
+    :param username: Username to be used for login on the remote server.
+    :type username: str
+    :param password: Password to be used for login on the remote server.
+    :type password: str
+    :param sudo_password: Sudo password to be used with the command on the remote server.
+    :type sudo_password: str
+    :param connection_timeout: Time (in seconds) to wait for the connection to get established with the remote server,
+        after which the `ConnectionTimeout` exception will be raised.
+    :type connection_timeout: int
+    """
+    operations.env.host_string = ip
+    operations.env.user = username
+    operations.env.password = password
+    operations.env.sudo_password = sudo_password
+    operations.env.warn_only = True
+    operations.env.abort_on_prompts = True
+    operations.env.disable_known_hosts = True
+    operations.env.timeout = connection_timeout
diff --git a/chaostoolkit_nimble/core/utils/ha_utils.py b/chaostoolkit_nimble/core/utils/ha_utils.py
new file mode 100644
index 0000000..18a94b0
--- /dev/null
+++ b/chaostoolkit_nimble/core/utils/ha_utils.py
@@ -0,0 +1,46 @@
+import logging
+import random
+
+from chaostoolkit_nimble.core.exceptions.custom_exceptions import ChaosActionFailedError
+from logzero import logger
+from nimble.core.entity.components import Components
+from nimble.core.entity.node_manager import NodeManager
+from nimble.core.utils.shell_utils import ShellUtils
+
+_LOGGER = logging.getLogger(__name__)
+
+
+def check_process_running(component, process_name=None):
+    """Check whether the given component's process is running on all of its nodes."""
+    # TODO: move this check into ShellUtils and simply call that method from here.
+    if not process_name:
+        process_name = Components.get_process_name(component)
+    logger.info("Checking if process '%s' is running by fetching its process id."
% process_name) + response_list = NodeManager.node_obj.execute_command_on_component(component, + ShellUtils.fetch_process_id(process_name), + consolidated_success_flag=False) + return all([response.stdout != "" for response in response_list]) + + +def kill_process(process_name, component, num_of_nodes=None): + """Kill the process of any particular component + + :param process_name: Name of the process + :type process_name: str + :param component: Name of the component + :type component: str + """ + node_aliases = [] + for node in NodeManager.node_obj.nodes_by_type[component]: + node_aliases.append(node.name) + if num_of_nodes: + node_aliases = random.sample(node_aliases, int(num_of_nodes)) + command = ShellUtils.kill_process_by_name(process_name) + response_list = [] + for node_alias in node_aliases: + logger.debug("Killing process '%s' on node '%s'" % (process_name, node_alias)) + response = NodeManager.node_obj.execute_command_on_node(node_alias, command) + if "kill -9 " not in response.stdout: + raise ChaosActionFailedError("Could not kill process '%s' on node '%s'" % (process_name, node_alias)) + response_list.append(response) + return str(response_list) diff --git a/chaostoolkit_nimble/core/utils/shell_utils.py b/chaostoolkit_nimble/core/utils/shell_utils.py new file mode 100644 index 0000000..b2224b0 --- /dev/null +++ b/chaostoolkit_nimble/core/utils/shell_utils.py @@ -0,0 +1,67 @@ +import logging +import subprocess + +from nimble.core import global_constants + +_LOGGER = logging.getLogger(__name__) + + +class ShellUtils(object): + """Utilities related to linux shell.""" + + def __init__(self, username=global_constants.DEFAULT_SERVER_USERNAME, + password=global_constants.DEFAULT_SERVER_PASSWORD): + """ + + :param username: Username to be used for login on the remote server. Defaults to `root`. + :type username: str + :param password: Password to be used for login on the remote server. Defaults to `root@123`. + :type password: str + """ + self._username = username + self._password = password + + @staticmethod + def log_guavus_response(guavus_response, log_response=True): + """Log the `stdout`, `stderr` and/or `status_code` in `execution.log` from the given `guavus_response`. + + :type guavus_response: :class:`nimble.core.entity.GuavusResponse` + :param log_response: If True, the `stdout`, `stderr` and `status_code` will be logged from the `guavus_response` + else only the `status_code` will be logged. Defaults to `True`. + :type log_response: bool + """ + if log_response: + if guavus_response.status_code == 0: + _LOGGER.info("stdout: %s\nstderr: %s\nstatus code: %s" % ( + guavus_response.stdout, guavus_response.stderr, guavus_response.status_code)) + else: + _LOGGER.error("stdout: %s\nstderr: %s\nstatus code: %s" % ( + guavus_response.stdout, guavus_response.stderr, guavus_response.status_code)) + else: + if guavus_response.status_code == 0: + _LOGGER.info("status code: %s" % guavus_response.status_code) + else: + _LOGGER.error("status code: %s" % guavus_response.status_code) + + @staticmethod + def execute_shell_command(command, log_response=True): + """Execute the given `command` on shell. + + :param command: Command that is to be executed on shell. + :type command: str + :param log_response: If True, the `stdout`, `stderr` and `status_code` will be logged from the `guavus_response` + else only the `status_code` will be logged in `execution.log`. Defaults to `True`. + :type log_response: bool + :return: Return response of the given shell `command`. 
+ :rtype: :class:`nimble.core.entity.GuavusResponse` + """ + _LOGGER.info("Executing command: %s" % command) + ######### subprocess_obj = subprocess.Popen(command, stderr=subprocess.PIPE, stdout=subprocess.PIPE, shell=True) + subprocess_obj = subprocess.Popen(command, stderr=subprocess.PIPE, stdout=subprocess.PIPE, shell=True, + encoding="utf-8") + response_obj = subprocess_obj.communicate() + ######################## + # guavus_response = GuavusResponse(response_obj[0].strip(), response_obj[1], subprocess_obj.returncode) + # ShellUtils.log_guavus_response(guavus_response, log_response=log_response) + # return guavus_response + return response_obj diff --git a/chaostoolkit_nimble/core/utils/spark_apps_ha_utils.py b/chaostoolkit_nimble/core/utils/spark_apps_ha_utils.py new file mode 100644 index 0000000..8e08254 --- /dev/null +++ b/chaostoolkit_nimble/core/utils/spark_apps_ha_utils.py @@ -0,0 +1,80 @@ +import random + +from logzero import logger +from retrying import RetryError + +from chaostoolkit_nimble.controllers.spark import control +from chaostoolkit_nimble.core.exceptions.custom_exceptions import ChaosActionFailedError +from nimble.core.adapters.hadoop.base_hadoop_adapter import ApplicationState +from nimble.core.entity.node_manager import NodeManager +from nimble.core.utils.components.hadoop_utils import HadoopRestClientUtils +from nimble.core.utils.components.spark_utils import SparkRestClientUtils +from nimble.core.utils.shell_utils import ShellUtils + + +def kill_active_executors(job_name, num_of_exec=1): + hadoop_rest_client_utils = HadoopRestClientUtils() + spark_client_utils = SparkRestClientUtils() + try: + logger.info("Fetching yarn application id for the running job %s." % job_name) + control.APPLICATION_ID = hadoop_rest_client_utils.get_yarn_most_recent_application_id_by_job_name(job_name, + state=ApplicationState.RUNNING.value) + except RetryError: + raise ChaosActionFailedError( + "Could not fetch yarn application id for job %s. Job not found in '%s' state" % ( + job_name, ApplicationState.RUNNING.value)) + try: + logger.info("Fetching spark active executors for application id: %s" % control.APPLICATION_ID) + executors = spark_client_utils.get_application_active_executors(control.APPLICATION_ID) + except RetryError: + raise ChaosActionFailedError( + "Could not fetch spark executors for the application id: %s" % control.APPLICATION_ID) + for i in range(len(executors)): + if executors[i]["id"] == "driver": + executors.pop(i) + break + executors = random.sample(executors, int(num_of_exec)) + response_list = [] + for executor in executors: + executor_id = executor["id"] + node_hostname_domain = executor["hostPort"].split(":")[0] + logger.debug("Killing spark executor id %s on node %s" % (executor_id, node_hostname_domain)) + response = NodeManager.node_obj.execute_command_on_hostname_domain(node_hostname_domain, + ShellUtils.kill_process_by_name("spark", + pipe_command='grep -i "executor-id %s"' % executor_id)) + if "kill -9 " not in response.stdout: + raise ChaosActionFailedError( + "Could not kill process with spark executor id %s on node %s" % (executor_id, node_hostname_domain)) + response_list.append(response) + return str(response_list) + + +def kill_driver(job_name): + hadoop_rest_client_utils = HadoopRestClientUtils() + spark_client_utils = SparkRestClientUtils() + try: + logger.info("Fetching yarn application id for the running job %s." 
% job_name) + control.APPLICATION_ID = hadoop_rest_client_utils.get_yarn_most_recent_application_id_by_job_name(job_name, + state=ApplicationState.RUNNING.value) + except RetryError: + raise ChaosActionFailedError( + "Could not fetch yarn application id for job %s in state %s:" % (job_name, ApplicationState.RUNNING.value)) + try: + logger.info("Fetching spark driver for application id: %s" % control.APPLICATION_ID) + executors = spark_client_utils.get_application_active_executors(control.APPLICATION_ID) + except RetryError: + raise ChaosActionFailedError( + "Could not fetch spark executors for the application id: %s" % control.APPLICATION_ID) + response = None + for executor in executors: + if executor["id"] == "driver": + node_hostname_domain = executor["hostPort"].split(":")[0] + logger.debug("Killing spark driver on node %s" % node_hostname_domain) + response = NodeManager.node_obj.execute_command_on_hostname_domain(node_hostname_domain, + ShellUtils.kill_process_by_name("spark", + pipe_command='grep -i %s' % control.APPLICATION_ID)) + if "kill -9 " not in response.stdout: + raise ChaosActionFailedError( + "Could not kill spark driver process on node %s" % node_hostname_domain) + break + return str(response) diff --git a/chaostoolkit_nimble/core/utils/yarn_apps_ha_utils.py b/chaostoolkit_nimble/core/utils/yarn_apps_ha_utils.py new file mode 100644 index 0000000..aded380 --- /dev/null +++ b/chaostoolkit_nimble/core/utils/yarn_apps_ha_utils.py @@ -0,0 +1,14 @@ +from logzero import logger +from retrying import RetryError + +from nimble.core.utils.components.hadoop_utils import HadoopRestClientUtils + + +def is_job_running_on_yarn(job_name): + hadoop_rest_client_utils = HadoopRestClientUtils() + logger.debug("Checking if job '%s' running on yarn" % job_name) + try: + return hadoop_rest_client_utils.wait_for_yarn_job_to_start(job_name=job_name) + except RetryError: + logger.info("Not able to fetch yarn job '%s' status." 
% job_name) + return False diff --git a/chaostoolkit_nimble/resources/components/component_attributes.yml b/chaostoolkit_nimble/resources/components/component_attributes.yml new file mode 100644 index 0000000..057afed --- /dev/null +++ b/chaostoolkit_nimble/resources/components/component_attributes.yml @@ -0,0 +1,61 @@ +defaults: + ui_username: admin + ui_password: Sbe7UGkRBMF8 +components: + HDFS: + linux_user: hdfs + ui_port: 50070 + AMBARI: + ui_port: 8080 + ATLAS: + ui_port: 21000 + CDAP: + ui_port: 11011 + application_port: 11015 + ELASTICSEARCH: + mode: private + ui_port: 9201 + HAPROXY: + ui_port: 5070 + ui_username: haproxy + ui_password: "!H4pr0xy$" + HBASE: + linux_user: hbase + HIVE: + linux_user: hive + application_port: 10000 + SPARK: + linux_user: hdfs + ui_port: 18081 + ZOOKEEPER: + linux_user: zookeeper + application_port: 2181 + SCHEMA_REGISTRY: + ui_port: 7788 + REDIS: + ui_port: 6379,6380 + NIFI: + ui_port: 9090 + OOZIE: + linux_user: oozie + ui_port: 11000 + PSQL: + linux_user: postgres + db_username: postgres + db_password: postgres + application_port: 5432 + KAFKA: + linux_user: kafka + application_port: 6667 + RANGER: + ui_port: 6080 + JANUSGRAPH: + linux_user: janusgraph + application_port: 8182 + PROMETHEUS: + ui_port: 9095 + GRAFANA: + ui_port: 3000 + HADOOP_YARN: + ui_port: 8088 + diff --git a/chaostoolkit_nimble/resources/components/component_attributes_cloudera.yml b/chaostoolkit_nimble/resources/components/component_attributes_cloudera.yml new file mode 100644 index 0000000..ede7145 --- /dev/null +++ b/chaostoolkit_nimble/resources/components/component_attributes_cloudera.yml @@ -0,0 +1,38 @@ +defaults: + linux_user: root +components: + HDFS: + ui_port: 50070 + HIVE: + application_port: 10000 + HBASE: + PSQL: + application_port: 5432 + db_username: postgres + db_password: postgres + AZKABAN: + ui_port: 8507 + ui_username: azkaban + ui_password: "!4zk4b4n$" + ZOOKEEPER: + ui_port: 2181 + KAFKA_CONNECT: + ui_port: 8083 + ELASTICSEARCH: + ui_port: 6775 + HAPROXY: + ui_port: 5070 + ui_username: haproxy + ui_password: "!H4pr0xy$" + SPARK: + ui_port: 18080 + SCHEMA_REGISTRY: + ui_port: 8081 + REDIS: + ui_port: 6379,6380 + IMPALA: + linux_user: impala + db_username: impala + application_port: 21050 + KAFKA: + application_port: 9092 diff --git a/chaostoolkit_nimble/resources/components/component_attributes_kerberos.yml b/chaostoolkit_nimble/resources/components/component_attributes_kerberos.yml new file mode 100644 index 0000000..85bce81 --- /dev/null +++ b/chaostoolkit_nimble/resources/components/component_attributes_kerberos.yml @@ -0,0 +1,64 @@ +defaults: + ui_username: admin + ui_password: Sbe7UGkRBMF8 + keytab_file_path: /etc/security/keytabs/smokeuser.headless.keytab +components: + HDFS: + linux_user: hdfs + ui_port: 50070 + AMBARI: + ui_port: 8080 + ATLAS: + ui_port: 21000 + CDAP: + ui_port: 11011 + application_port: 11015 + ELASTICSEARCH: + mode: vip + ui_port: 9201 + ui_username: elasticsearch + ui_password: elasticsearch@123 + HAPROXY: + ui_port: 5070 + ui_username: haproxy + ui_password: "!H4pr0xy$" + HBASE: + linux_user: hbase + HIVE: + linux_user: hive + application_port: 10000 + SPARK: + linux_user: hdfs + ui_port: 18081 + ZOOKEEPER: + linux_user: zookeeper + application_port: 2181 + SCHEMA_REGISTRY: + ui_port: 7788 + REDIS: + ui_port: 6379,6380 + NIFI: + ui_port: 9090 + OOZIE: + linux_user: oozie + ui_port: 11000 + PSQL: + linux_user: postgres + db_username: postgres + db_password: postgres + application_port: 5432 + KAFKA: + linux_user: kafka + 
application_port: 6667 + RANGER: + ui_port: 6080 + JANUSGRAPH: + linux_user: janusgraph + application_port: 8182 + PROMETHEUS: + ui_port: 9095 + GRAFANA: + ui_port: 3000 + HADOOP_YARN: + ui_port: 8088 + diff --git a/chaostoolkit_nimble/resources/exp_templates/experiment_template.json b/chaostoolkit_nimble/resources/exp_templates/experiment_template.json new file mode 100644 index 0000000..5d73d30 --- /dev/null +++ b/chaostoolkit_nimble/resources/exp_templates/experiment_template.json @@ -0,0 +1,44 @@ +{ + "version": "1.0.0", + "title": "What is the impact on an ES service when the ES running pod goes down?", + "description": "If ES running pod goes down, the ES service should still be functioning properly.", + "tags": ["kubernetes"], + "controls": [{ + "name": "my-controls", + "provider": { + "type": "python", + "module": "custom.controllers.control" + } + }], + "steady-state-hypothesis": { + "title": "Application responds", + "probes": [{ + "type": "probe", + "name": "test_run_remote_python_cmd", + "tolerance": "hi\n", + "provider": { + "module": "custom.fabric_utils", + "type": "python", + "func": "run_command_on_remote", + "arguments": { + "command": "{{my_var}}", + "ip": "192.168.135.35", + "username":"root", + "password":"guavus@123", + "command_timeout":120, + "connection_timeout": 120 + } + } + }] + }, + "method": [{ + "type": "action", + "name": "test_method", + "provider": { + "module": "chaosk8s.pod.actions", + "type": "process", + "path": "ls", + "arguments": "/tmp" + } + }] +} diff --git a/chaostoolkit_nimble/resources/exp_templates/kubernetes_experiment_template.json b/chaostoolkit_nimble/resources/exp_templates/kubernetes_experiment_template.json new file mode 100644 index 0000000..bf1df42 --- /dev/null +++ b/chaostoolkit_nimble/resources/exp_templates/kubernetes_experiment_template.json @@ -0,0 +1,73 @@ +{ + "version": "1.0.0", + "title": "Does kubernetes handle pod {{pod_name}} failover?", + "description": "If the pod {{pod_name}} goes down, then kubenetes should handle its restart garcefully and the service it is running should get restored successfully.", + "tags": ["kubernetes"], + "controls": [{ + "name": "my-controls", + "provider": { + "type": "python", + "module": "mypackage.custom.controllers.control" + } + }], + "steady-state-hypothesis": { + "title": "Pod {{pod_name}} and its service {{service_name}} is up and running", + "probes": [{ + "type": "probe", + "name": "{{pod_name}}_pod_is_running", + "tolerance": true, + "provider": { + "module": "chaostoolkit_nimble.chaosk8s.pod.probes", + "type": "python", + "func": "pods_in_phase", + "arguments": { + "ns": "{{pod_ns_var}}", + "phase" : "Running", + "label_selector" : "{{pod_label_selector_var}}" + } + } + }, + { + "type": "probe", + "name": "{{service_name}}_service_exists", + "tolerance": true, + "provider": { + "module": "chaostoolkit_nimble.chaosk8s.probes", + "type": "python", + "func": "service_endpoint_is_initialized", + "arguments": { + "name": "{{service_name}}", + "ns": "{{service_ns_var}}", + "label_selector" : "{{service_label_selector_var}}" + } + } + }] + }, + "method": [{ + "type": "action", + "name": "terminate_gracefully_pod_{{pod_name}}", + "provider": { + "module": "chaostoolkit_nimble.chaosk8s.pod.actions", + "type": "python", + "func": "terminate_pods", + "arguments": { + "name_pattern": "{{pod_name}}*", + "ns": "{{pod_ns_var}}", + "grace_period": 5 + } + } + }, + { + "type": "probe", + "name": "read_new_spawned_logs_for_pod_{{pod_name}}", + "provider": { + "module": 
"chaostoolkit_nimble.chaosk8s.pod.probes", + "type": "python", + "func": "read_pod_logs", + "arguments": { + "ns": "{{pod_ns_var}}", + "label_selector" : "{{pod_label_selector_var}}" + } + } + }] +} diff --git a/chaostoolkit_nimble/resources/exp_templates/process/exp.json b/chaostoolkit_nimble/resources/exp_templates/process/exp.json new file mode 100644 index 0000000..b492aac --- /dev/null +++ b/chaostoolkit_nimble/resources/exp_templates/process/exp.json @@ -0,0 +1,54 @@ +{ + "version": "1.0.0", + "title": "Experiment with killing a process on system", + "description": "The process should auto-respawn after being killed.", + "tags": [ + "process" + ], + "controls": [ + { + "name": "process-related-controls", + "provider": { + "type": "python", + "module": "chaostoolkit_nimble.controllers.process.control" + } + } + ], + "steady-state-hypothesis": { + "title": "Process {{rand_dynamic_process_name}} is up and running", + "probes": [ + { + "type": "probe", + "name": "Fetch-process-id", + "tolerance": true, + "provider": { + "module": "chaostoolkit_nimble.core.utils.ha_utils", + "type": "python", + "func": "check_process_running", + "arguments": { + "component": "{{rand_dynamic_component}}", + "process_name": "{{rand_dynamic_process_name}}" + } + } + } + ] + }, + "method": [ + { + "type": "action", + "name": "Kill-process", + "provider": { + "module": "chaostoolkit_nimble.core.utils.ha_utils", + "type": "python", + "func": "kill_process", + "arguments": { + "component": "{{rand_dynamic_component}}", + "process_name": "{{rand_dynamic_process_name}}", + "num_of_nodes": "1" + } + } + } + ], + "rollbacks": [ + ] +} \ No newline at end of file diff --git a/chaostoolkit_nimble/resources/exp_templates/process/exp_dynamic.json b/chaostoolkit_nimble/resources/exp_templates/process/exp_dynamic.json new file mode 100644 index 0000000..e1ce8f8 --- /dev/null +++ b/chaostoolkit_nimble/resources/exp_templates/process/exp_dynamic.json @@ -0,0 +1,44 @@ +{ + "version": "1.0.0", + "title": "Experiment with killing a process on system", + "description": "The process should auto-respawn after being killed.", + "tags": ["process"], + "controls": [{ + "name": "process-related-controls", + "provider": { + "type": "python", + "module": "chaostoolkit_nimble.controllers.process.control" + } + }], + "steady-state-hypothesis": { + "title": "Process is up and running", + "probes": [{ + "type": "probe", + "name": "Fetch-process-id", + "tolerance": true, + "provider": { + "module": "chaostoolkit_nimble.core.utils.ha_utils", + "type": "python", + "func": "check_process_running", + "arguments": { + "component": "rand_dynamic_component", + "process_name": "rand_dynamic_process_name" + } + } + }] + }, + "method": [{ + "type": "action", + "name": "Kill-process", + "provider": { + "module": "chaostoolkit_nimble.core.utils.ha_utils", + "type": "python", + "func": "kill_process", + "arguments": { + "component": "rand_dynamic_component", + "process_name": "rand_dynamic_process_name", + "num_of_nodes": "1" + } + } + }] +} diff --git a/chaostoolkit_nimble/resources/exp_templates/shell_app/process_experiment_template.json b/chaostoolkit_nimble/resources/exp_templates/shell_app/process_experiment_template.json new file mode 100644 index 0000000..dd265cb --- /dev/null +++ b/chaostoolkit_nimble/resources/exp_templates/shell_app/process_experiment_template.json @@ -0,0 +1,43 @@ +{ + "version": "1.0.0", + "title": "Experiment running shell commands on local and remote", + "description": "Shell command should get successfully executed on 
remote using python provider through fabric and on local it should run through process provider", + "tags": ["kubernetes"], + "controls": [{ + "name": "my-controls", + "provider": { + "type": "python", + "module": "mypackage.custom.controllers.control" + } + }], + "steady-state-hypothesis": { + "title": "Shell command runs on remote server", + "probes": [{ + "type": "probe", + "name": "run_shell_cmd_on_remote", + "tolerance": "{{expected_remote_command_output}}", + "provider": { + "module": "chaostoolkit_nimble.custom.fabric_utils", + "type": "python", + "func": "run_command_on_remote", + "arguments": { + "command": "{{remote_command_var}}", + "ip": "{{remote_ip_var}}", + "username":"{{remote_username_var}}", + "password":"{{remote_password_var}}", + "command_timeout": 120, + "connection_timeout": 120 + } + } + }] + }, + "method": [{ + "type": "action", + "name": "run_shell_cmd_on_local", + "provider": { + "type": "process", + "path": "{{local_command_var}}", + "arguments": "{{local_command_arguments}}" + } + }] +} diff --git a/chaostoolkit_nimble/resources/exp_templates/shell_app/shell_app_exp.json b/chaostoolkit_nimble/resources/exp_templates/shell_app/shell_app_exp.json new file mode 100644 index 0000000..873157e --- /dev/null +++ b/chaostoolkit_nimble/resources/exp_templates/shell_app/shell_app_exp.json @@ -0,0 +1,44 @@ +{ + "version": "1.0.0", + "title": "Experiment running shell commands on local and remote", + "description": "Shell command should get successfully executed on remote using python provider through fabric and on local it should run through process provider", + "tags": ["kubernetes"], + "controls": [{ + "name": "shell-app-controls", + "provider": { + "type": "python", + "module": "chaostoolkit_nimble.controllers.extensions.shell_app.control" + } + }], + "steady-state-hypothesis": { + "title": "Shell application is up and running", + "probes": [{ + "type": "probe", + "name": "Fetch-shell-app-process-id", + "tolerance": true, + "provider": { + "module": "chaostoolkit_nimble.core.utils.ha_utils", + "type": "python", + "func": "check_process_running", + "arguments": { + "component": "rand_dynamic_component", + "process_name": "rand_dynamic_process_name" + } + } + }] + }, + "method": [{ + "type": "action", + "name": "Kill-shell-app", + "provider": { + "module": "chaostoolkit_nimble.core.utils.ha_utils", + "type": "python", + "func": "kill_process", + "arguments": { + "component": "rand_dynamic_component", + "process_name": "rand_dynamic_process_name", + "num_of_nodes": "1" + } + } + }] +} diff --git a/chaostoolkit_nimble/resources/exp_templates/shell_app/shell_app_exp_template.json b/chaostoolkit_nimble/resources/exp_templates/shell_app/shell_app_exp_template.json new file mode 100644 index 0000000..1eb410a --- /dev/null +++ b/chaostoolkit_nimble/resources/exp_templates/shell_app/shell_app_exp_template.json @@ -0,0 +1,43 @@ +{ + "version": "1.0.0", + "title": "Experiment running shell commands on local and remote", + "description": "Shell command should get successfully executed on remote using python provider through fabric and on local it should run through process provider", + "tags": ["kubernetes"], + "controls": [{ + "name": "shell-app-controls", + "provider": { + "type": "python", + "module": "chaostoolkit_nimble.controllers.extensions.shell_application.control" + } + }], + "steady-state-hypothesis": { + "title": "Shell application is up and running", + "probes": [{ + "type": "probe", + "name": "Fetch-application-process-id", + "tolerance": true, + "provider": { + 
"module": "chaostoolkit_nimble.core.utils.ha_utils", + "type": "python", + "func": "check_process_running", + "arguments": { + "component": "{{component}}", + "process_name": "{{process_name}}" + } + } + }] + }, + "method": [{ + "type": "action", + "name": "Kill-application-process", + "provider": { + "module": "chaostoolkit_nimble.core.utils.ha_utils", + "type": "python", + "func": "kill_process", + "arguments": { + "component": "{{component}}", + "process_name": "{{process_name}}" + } + } + }] +} diff --git a/chaostoolkit_nimble/resources/exp_templates/spark/__init__.py b/chaostoolkit_nimble/resources/exp_templates/spark/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaostoolkit_nimble/resources/exp_templates/spark/driver_and_executor_kill_exp.json b/chaostoolkit_nimble/resources/exp_templates/spark/driver_and_executor_kill_exp.json new file mode 100644 index 0000000..7d0c997 --- /dev/null +++ b/chaostoolkit_nimble/resources/exp_templates/spark/driver_and_executor_kill_exp.json @@ -0,0 +1,64 @@ +{ + "version": "1.0.0", + "title": "Experiment with killing the spark driver and n number of executors for the spark job {{job_name}}.", + "description": "The spark driver and executor process should get auto-respawned after being killed.", + "tags": [ + "spark" + ], + "controls": [ + { + "name": "spark-related-controls", + "provider": { + "type": "python", + "module": "chaostoolkit_nimble.controllers.spark.control" + } + } + ], + "steady-state-hypothesis": { + "title": "Job {{job_name}} is up and running on yarn.", + "probes": [ + { + "type": "probe", + "name": "Check job {{job_name}} is running on yarn ", + "tolerance": true, + "provider": { + "module": "chaostoolkit_nimble.core.utils.yarn_apps_ha_utils", + "type": "python", + "func": "is_job_running_on_yarn", + "arguments": { + "job_name": "{{job_name}}" + } + } + } + ] + }, + "method": [ + { + "type": "action", + "name": "Kill spark driver for the job {{job_name}} ", + "provider": { + "module": "chaostoolkit_nimble.core.utils.spark_apps_ha_utils", + "type": "python", + "func": "kill_driver", + "arguments": { + "job_name": "{{job_name}}" + } + } + }, + { + "type": "action", + "name": "Kill spark executors for job {{job_name}} ", + "provider": { + "module": "chaostoolkit_nimble.core.utils.spark_apps_ha_utils", + "type": "python", + "func": "kill_active_executors", + "arguments": { + "job_name": "{{job_name}}", + "num_of_exec": "{{ num_of_exec_to_kill|default('1') }}" + } + } + } + ], + "rollbacks": [ + ] +} \ No newline at end of file diff --git a/chaostoolkit_nimble/resources/exp_templates/spark/driver_kill_exp.json b/chaostoolkit_nimble/resources/exp_templates/spark/driver_kill_exp.json new file mode 100644 index 0000000..ed91767 --- /dev/null +++ b/chaostoolkit_nimble/resources/exp_templates/spark/driver_kill_exp.json @@ -0,0 +1,51 @@ +{ + "version": "1.0.0", + "title": "Experiment with killing the spark driver for the spark job {{job_name}}.", + "description": "The spark driver should get auto-respawned after being killed.", + "tags": [ + "spark" + ], + "controls": [ + { + "name": "spark-related-controls", + "provider": { + "type": "python", + "module": "chaostoolkit_nimble.controllers.spark.control" + } + } + ], + "steady-state-hypothesis": { + "title": "Job {{job_name}} is up and running on yarn.", + "probes": [ + { + "type": "probe", + "name": "Check job {{job_name}} running on yarn. 
", + "tolerance": true, + "provider": { + "module": "chaostoolkit_nimble.core.utils.yarn_apps_ha_utils", + "type": "python", + "func": "is_job_running_on_yarn", + "arguments": { + "job_name": "{{job_name}}" + } + } + } + ] + }, + "method": [ + { + "type": "action", + "name": "Kill spark driver for job {{job_name}}. ", + "provider": { + "module": "chaostoolkit_nimble.core.utils.spark_apps_ha_utils", + "type": "python", + "func": "kill_driver", + "arguments": { + "job_name": "{{job_name}}" + } + } + } + ], + "rollbacks": [ + ] +} \ No newline at end of file diff --git a/chaostoolkit_nimble/resources/exp_templates/spark/executor_kill_exp.json b/chaostoolkit_nimble/resources/exp_templates/spark/executor_kill_exp.json new file mode 100644 index 0000000..6540c0c --- /dev/null +++ b/chaostoolkit_nimble/resources/exp_templates/spark/executor_kill_exp.json @@ -0,0 +1,52 @@ +{ + "version": "1.0.0", + "title": "Experiment with killing n number of spark executors for a spark job {{job_name}}.", + "description": "The spark executor process/es should get auto-respawned after being killed.", + "tags": [ + "spark" + ], + "controls": [ + { + "name": "spark-related-controls", + "provider": { + "type": "python", + "module": "chaostoolkit_nimble.controllers.spark.control" + } + } + ], + "steady-state-hypothesis": { + "title": "Job {{job_name}} is up and running on yarn.", + "probes": [ + { + "type": "probe", + "name": "Check job {{job_name}} running on yarn. ", + "tolerance": true, + "provider": { + "module": "chaostoolkit_nimble.core.utils.yarn_apps_ha_utils", + "type": "python", + "func": "is_job_running_on_yarn", + "arguments": { + "job_name": "{{job_name}}" + } + } + } + ] + }, + "method": [ + { + "type": "action", + "name": "Kill active spark executors for job {{job_name}}. ", + "provider": { + "module": "chaostoolkit_nimble.core.utils.spark_apps_ha_utils", + "type": "python", + "func": "kill_active_executors", + "arguments": { + "job_name": "{{job_name}}", + "num_of_exec": "{{ num_of_exec_to_kill|default('1') }}" + } + } + } + ], + "rollbacks": [ + ] +} \ No newline at end of file diff --git a/chaostoolkit_nimble/resources/security/krb5.conf b/chaostoolkit_nimble/resources/security/krb5.conf new file mode 100644 index 0000000..92a3395 --- /dev/null +++ b/chaostoolkit_nimble/resources/security/krb5.conf @@ -0,0 +1,26 @@ + +[libdefaults] + renew_lifetime = 7d + forwardable = true + default_realm = GVS.GGN + ticket_lifetime = 24h + dns_lookup_realm = false + dns_lookup_kdc = false + default_ccache_name = /tmp/krb5cc_%{uid} + #default_tgs_enctypes = aes des3-cbc-sha1 rc4 des-cbc-md5 + #default_tkt_enctypes = aes des3-cbc-sha1 rc4 des-cbc-md5 + +[domain_realm] + GVS.GGN = GVS.GGN + +[logging] + default = FILE:/var/log/krb5kdc.log + admin_server = FILE:/var/log/kadmind.log + kdc = FILE:/var/log/krb5kdc.log + +[realms] + GVS.GGN = { + admin_server = kdc-devopslabs.gvs.ggn + kdc = kdc-devopslabs.gvs.ggn + } + diff --git a/chaostoolkit_nimble/resources/testbeds/open_nebula_135_35.yml b/chaostoolkit_nimble/resources/testbeds/open_nebula_135_35.yml new file mode 100644 index 0000000..bde2ac1 --- /dev/null +++ b/chaostoolkit_nimble/resources/testbeds/open_nebula_135_35.yml @@ -0,0 +1,51 @@ +# You can choose to skip some attributes that are given in default tag if the value for that VM is same as that in the +# default tag. 
+ +defaults: + username: root + password: guavus@123 + dynamic_vm_ips: True + vip_hostname_domain: testautomation-lb-vip.cloud.in.guavus.com + vip: 192.168.135.34 + nameservice: testautomation-reflex-platform + stage: dev + inventory_name: testautomation + customer_name: guavus + platform: ambari + kerberized: False + vm_netmask: 255.255.252.0 + vm_gateway: 192.168.132.1 + type: DATANODE,ETCD,HREGION,HBASE,ZOOKEEPER,REDIS,REDIS_MASTER,REDIS_SLAVE,HADOOP_HDFS,HADOOP_YARN,KAFKA,KAFKA_BROKER,FLANNEL,DOCKER,HYPERKUBE_PROXY,HYPERKUBE_KUBELET,NIFI,NODE_MANAGER,HIVE,HIVE_SERVER,CORE_DNS,KAFKA_MM,SNMP +bare_metals: + BM1: + ip: dummy + vm: + testautomation-mgt-01: + vm_ip: 192.168.135.35 + type: MANAGEMENT,PSQL,DOCKER,HYPERKUBE_PROXY,HYPERKUBE_KUBELET,JOURNALNODE,ATLAS,RANGER,RANGER_ADMIN,RANGER_USERSYNC,RANGER_TAGSYNC,HISTORY_SERVER,SPARK2_HISTORY_SERVER,ELASTICSEARCH,JANUSGRAPH,AMBARI,GRAFANA,SNMP + hostname_domain: testautomation-mgt-01.cloud.in.guavus.com + testautomation-lb-01: + vm_ip: 192.168.135.42 + type: LB,HAPROXY,FLANNEL,DOCKER,HYPERKUBE_PROXY,HYPERKUBE_KUBELET,COROSYNC,PACEMAKER,SNMP + hostname_domain: testautomation-lb-01.cloud.in.guavus.com + testautomation-lb-02: + vm_ip: 192.168.135.43 + type: LB,HAPROXY,FLANNEL,DOCKER,HYPERKUBE_PROXY,HYPERKUBE_KUBELET,COROSYNC,PACEMAKER,SNMP + hostname_domain: testautomation-lb-02.cloud.in.guavus.com + testautomation-slv-01: + vm_ip: 192.168.135.38 + hostname_domain: testautomation-slv-01.cloud.in.guavus.com + testautomation-mst-01: + vm_ip: 192.168.135.36 + type: NAMENODE,KUBERNETES,HYPERKUBE_SCHEDULER,HYPERKUBE_APISERVER,HYPERKUBE_CONTROLLER_MANAGER,FLANNEL,DOCKER,HYPERKUBE_PROXY,HYPERKUBE_KUBELET,HMASTER,HIVE,HADOOP_HDFS,HADOOP_YARN,ZKFC,HBASE,SPARK,SCHEMA_REGISTRY,PGHA_WHATCHDOG,RESOURCE_MANAGER,JOURNALNODE,KUBERNETES_DASHBOARD,HIVE_METASTORE,CDAP,CDAP_AUTH_SERVER,CDAP_MASTER,CDAP_ROUTER,CDAP_UI + hostname_domain: testautomation-mst-01.cloud.in.guavus.com + testautomation-slv-02: + vm_ip: 192.168.135.39 + hostname_domain: testautomation-slv-02.cloud.in.guavus.com + testautomation-mst-02: + vm_ip: 192.168.135.37 + type: NAMENODE,KUBERNETES,HYPERKUBE_SCHEDULER,HYPERKUBE_APISERVER,HYPERKUBE_CONTROLLER_MANAGER,FLANNEL,DOCKER,HYPERKUBE_PROXY,HYPERKUBE_KUBELET,HMASTER,HIVE,HADOOP_HDFS,HADOOP_YARN,ZKFC,HBASE,SPARK,PGHA_WHATCHDOG,RESOURCE_MANAGER,JOURNALNODE,KUBERNETES_DASHBOARD,HIVE_METASTORE,CDAP,CDAP_AUTH_SERVER,CDAP_MASTER,CDAP_ROUTER,CDAP_UI + hostname_domain: testautomation-mst-02.cloud.in.guavus.com + testautomation-slv-03: + vm_ip: 192.168.135.40 + hostname_domain: testautomation-slv-03.cloud.in.guavus.com \ No newline at end of file diff --git a/chaostoolkit_nimble/resources/testbeds/open_nebula_135_52.yml b/chaostoolkit_nimble/resources/testbeds/open_nebula_135_52.yml new file mode 100644 index 0000000..0884650 --- /dev/null +++ b/chaostoolkit_nimble/resources/testbeds/open_nebula_135_52.yml @@ -0,0 +1,52 @@ +# You can choose to skip some attributes that are given in default tag if the value for that VM is same as that in the +# default tag. 
+ +defaults: + username: root + password: guavus@123 + dynamic_vm_ips: True + vip_hostname_domain: testautomation002-lb-vip.cloud.in.guavus.com + vip: 192.168.134.191 + nameservice: testautomation-reflex-platform + stage: dev + inventory_name: testautomation + customer_name: guavus + platform: ambari + kerberized: True + kerberos_client_user: ambari-qa + vm_netmask: 255.255.252.0 + vm_gateway: 192.168.132.1 + type: DATANODE,ETCD,HREGION,HBASE,ZOOKEEPER,REDIS,REDIS_MASTER,REDIS_SLAVE,HADOOP_HDFS,HADOOP_YARN,KAFKA,KAFKA_BROKER,FLANNEL,DOCKER,HYPERKUBE_PROXY,HYPERKUBE_KUBELET,NIFI,NODE_MANAGER,HIVE,HIVE_SERVER,CORE_DNS,KAFKA_MM,SNMP +bare_metals: + BM1: + ip: dummy + vm: + testautomation002-mgt-01: + vm_ip: 192.168.135.59 + type: MANAGEMENT,PSQL,DOCKER,HYPERKUBE_PROXY,HYPERKUBE_KUBELET,JOURNALNODE,ATLAS,RANGER,RANGER_ADMIN,RANGER_USERSYNC,RANGER_TAGSYNC,HISTORY_SERVER,SPARK2_HISTORY_SERVER,ELASTICSEARCH,JANUSGRAPH,AMBARI,GRAFANA,SNMP + hostname_domain: testautomation002-mgt-01.cloud.in.guavus.com + testautomation002-lb-01: + vm_ip: 192.168.135.186 + type: LB,HAPROXY,FLANNEL,DOCKER,HYPERKUBE_PROXY,HYPERKUBE_KUBELET,COROSYNC,PACEMAKER,SNMP + hostname_domain: testautomation002-lb-01.cloud.in.guavus.com + testautomation002-lb-02: + vm_ip: 192.168.135.187 + type: LB,HAPROXY,FLANNEL,DOCKER,HYPERKUBE_PROXY,HYPERKUBE_KUBELET,COROSYNC,PACEMAKER,SNMP + hostname_domain: testautomation002-lb-02.cloud.in.guavus.com + testautomation002-slv-01: + vm_ip: 192.168.135.183 + hostname_domain: testautomation002-slv-01.cloud.in.guavus.com + testautomation002-mst-01: + vm_ip: 192.168.135.60 + type: NAMENODE,KUBERNETES,HYPERKUBE_SCHEDULER,HYPERKUBE_APISERVER,HYPERKUBE_CONTROLLER_MANAGER,FLANNEL,DOCKER,HYPERKUBE_PROXY,HYPERKUBE_KUBELET,HMASTER,HIVE,HADOOP_HDFS,HADOOP_YARN,ZKFC,HBASE,SPARK,SCHEMA_REGISTRY,PGHA_WHATCHDOG,RESOURCE_MANAGER,JOURNALNODE,KUBERNETES_DASHBOARD,HIVE_METASTORE,CDAP,CDAP_AUTH_SERVER,CDAP_MASTER,CDAP_ROUTER,CDAP_UI + hostname_domain: testautomation002-mst-01.cloud.in.guavus.com + testautomation002-slv-02: + vm_ip: 192.168.135.184 + hostname_domain: testautomation002-slv-02.cloud.in.guavus.com + testautomation002-mst-02: + vm_ip: 192.168.135.64 + type: NAMENODE,KUBERNETES,HYPERKUBE_SCHEDULER,HYPERKUBE_APISERVER,HYPERKUBE_CONTROLLER_MANAGER,FLANNEL,DOCKER,HYPERKUBE_PROXY,HYPERKUBE_KUBELET,HMASTER,HIVE,HADOOP_HDFS,HADOOP_YARN,ZKFC,HBASE,SPARK,PGHA_WHATCHDOG,RESOURCE_MANAGER,JOURNALNODE,KUBERNETES_DASHBOARD,HIVE_METASTORE,CDAP,CDAP_AUTH_SERVER,CDAP_MASTER,CDAP_ROUTER,CDAP_UI + hostname_domain: testautomation002-mst-02.cloud.in.guavus.com + testautomation002-slv-03: + vm_ip: 192.168.135.185 + hostname_domain: testautomation002-slv-03.cloud.in.guavus.com \ No newline at end of file diff --git a/chaostoolkit_nimble/resources/validation/sample_validation_config.yml b/chaostoolkit_nimble/resources/validation/sample_validation_config.yml new file mode 100644 index 0000000..8636604 --- /dev/null +++ b/chaostoolkit_nimble/resources/validation/sample_validation_config.yml @@ -0,0 +1,169 @@ +defaults: + project: platform + build: nightly + golden_build: nightly + customer: thales + mail_to: kritika.saxena@guavus.com,Jyoti.Arora@guavus.com,samarth.goel@guavus.com + headers: True + delimiter: "," + aggregate: False + sort_columns: False + config_separator: "=" +jobs: + hive_to_hive: + input: + input1: + source: hive + db_name: randynamic_db_name + table_name: randynamic_table_name + where_clause: + select_statement: '*' + group_by_clause: + output: + output1: + actual: + source: hive + db_name: aut_squad_hive_db 
+ table_name: aut_squad_hive_table + where_clause: + ignore_columns: appnameid,deviceid,searchkeyword,httpreferer,deviceosversion,gds,sessionidtype4,sessionidtype3,sessionidtype2,sessionidtype1,duration,mcc,ci,att_content_type,accountid,deviceos,imei,spt2category,contenttype,tier1_id,contentupbytes,location8,devicemanufacturer,location7,location6,location5,tier2_id,location4,location3,location2,location1,contentdownbytes,att_os,mobileapp,wcsubid,recordendtime,recordendtimedateformat,apn,mnc,timezone,att_application,contenttypeid,mobileappcategory,url,appprotocol,formatedrat,att_device,ipprotocol,destinationip,ntc_id,devicename,lac,httpmethod,adidtimestamp,sourceip,att_domain,uaidagent,adid,ods3,ods2,ods1,devicegroup,responsecode,lineformat,att_sp,spt1category,serviceprovider,destinationport,downpackets,host,roamingcountry,topologylevel2,sourceport,downbytes,topologylevel1,roamingoperator,recordstarttime,recordstarttimedateformat,time + measures: output_type + psql_to_psql: + input: + input1: + source: psql + db_name: &psql_to_psql_input_db aut_squad_pgsql_db + table_name: &psql_to_psql_input_table aut_squad_pgsql_table + select_statement: '*' + where_clause: + group_by_clause: + output: + output1: + actual: + source: psql + db_name: aut_squad_pgsql_db + table_name: aut_squad_pgsql_table + where_clause: + ignore_columns: id + measures: description,name,url,displayname + hdfs_to_hdfs_parquet: + input: + input1: + source: hdfs + base_path: &hdfs_to_hdfs_parquet_input_base_path /tmp/aut_squad_dir/parquet + file_format: parquet + delimiter: "\t" + compression: gz + headers: False + output: + output1: + actual: + source: hdfs + base_path: /tmp/aut_squad_dir/parquet + aggregation_level: 0 + file_format: parquet + delimiter: "\t" + compression: gz + headers: False + &hdfs_csv hdfs_to_hdfs_csv: +# configs: +# config1: +# location: local +# base_path: "/tmp/aut_squad_dir/automation_squad.yml" +# file_format: yaml +# components: +# - MANAGEMENT +# - NAMENODE +# properties: +# "[2].tasks[0].docker_image.name": "image3" +# schedule: +# source: api +# module: nimble.actions.sample.sample_actions +# class: +# class_name: SampleActions +# class_parameters: +# - *hdfs_csv +# method: +# method_name: schedule_job_via_api + input: + input1: + source: hdfs + base_path: /tmp/aut_squad_dir/csv + file_format: csv + file_name_flag: False + output: + output1: + actual: + source: hdfs + base_path: /tmp/aut_squad_dir/csv + file_format: csv + ignore_diff_column_indexes: 1,3 + output2: + actual: + source: hdfs + base_path: /tmp/aut_squad_dir/csv + file_format: csv + ignore_diff_column_indexes: 1,3 + local_to_hdfs: + input: + input1: + source: local + base_path: /tmp/aut_squad_dir/csv + file_format: csv + file_name_flag: True + components: + - MASTER_NAMENODE + output: + output1: + actual: + source: hdfs + base_path: &local_to_hdfs_output_base_path /tmp/aut_squad_dir/csv + file_format: csv + fileserver_to_hdfs: + input: + input1: + source: file_server + base_path: modules/platform/validation/aut_squad_test_data/input_output_sources/fileserver/csv/ + file_format: csv + output: + output1: + actual: + source: hdfs + base_path: &fileserver_to_hdfs_output_base_path /tmp/aut_squad_dir/csv + file_format: csv + elasticsearch_to_elasticsearch: + input: + input1: + source: elasticsearch + es_index: randynamic_es_index + output: + output1: + actual: + source: elasticsearch + es_index: randynamic_es_index + hbase_to_hbase: + input: + input1: + source: hbase + table_name: randynamic_table_name + scan_filter: FirstKeyOnlyFilter() + 
output: + output1: + actual: + source: hbase + table_name: aut_squad_hbase_table + scan_filter: FirstKeyOnlyFilter() + Media_Plane: + input: + input1: + source: hdfs + base_path: /tmp/partition_date=2019-07-20 + file_format: csv + file_name_flag: False + delimiter: "," + output: + output1: + actual: + source: hive + db_name: network360_volte + table_name: media_plane_table_chaos \ No newline at end of file diff --git a/chaostoolkit_nimble/tests/__init__.py b/chaostoolkit_nimble/tests/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaostoolkit_nimble/tests/conftest.py b/chaostoolkit_nimble/tests/conftest.py new file mode 100644 index 0000000..1414916 --- /dev/null +++ b/chaostoolkit_nimble/tests/conftest.py @@ -0,0 +1,136 @@ +# Do not change the name of the file, since pytest detects it with this name only. + +import logging.config +import os + +import pytest +from nimble.core import global_constants +from nimble.core.utils.shell_utils import ShellUtils + +try: + os.makedirs(global_constants.DEFAULT_LOCAL_ARTIFACTS_PATH) +except Exception: + pass + +logging.config.fileConfig(global_constants.DEFAULT_LOGGING_FILE_PATH) + +# pylint: disable=wrong-import-position +from nimble.core.configs.validation_config_parser import ValidationConfigParser +from nimble.core.entity.node_manager import NodeManager +from nimble.core.utils.report_utils import ReportUtils +from nimble.actions.base.flows.user_actions import UserActions + +OPTIONS_DICT = {} +PREVIOUS_FAILED = None +ITEM_LIST = [] + + +def pytest_addoption(parser): + parser.addoption("--testbed", + help="Relative path (to the project root) of the testbed file. E.g. python -m pytest --testbed=resources/testbeds/open_nebula_134_192.yml") + parser.addoption("--componentAttributesConfig", + help="Relative path (to the project root) of the file containing component attributes configs. E.g. python -m pytest --componentAttributesConfig=resources/components/component_attributes_ambari.yml") + parser.addoption("--validationConfig", + help="Relative path (to the project root) of the file containing validation configs. E.g. python -m pytest --validationConfig=resources/validation/sample_validation_config.yml") + parser.addoption("--experimentsPath", + help="Relative path (to the project root) of the file containing chaos experiment json files. E.g. 
python -m pytest --validationConfig=resources/validation/chaos_exp_config.yml") + + +@pytest.fixture(scope="session", autouse=True) +def initialize_node_obj(request): + testbed_file = request.config.getoption("--testbed") + component_arttributes_file = request.config.getoption("--componentAttributesConfig") + if not component_arttributes_file: + component_arttributes_file = "nimble/resources/components/component_attributes.yml" + setup_files_base_path = "%s/setup" % global_constants.DEFAULT_LOCAL_TMP_PATH + if testbed_file: + NodeManager.initialize(testbed_file, component_arttributes_file) + ShellUtils.execute_shell_command( + ShellUtils.remove_and_create_directory(setup_files_base_path)) + testbed_file_tmp_path = "%s/%s" % (setup_files_base_path, testbed_file.rsplit("/", 1)[1]) + component_arttributes_file_tmp_path = "%s/%s" % ( + setup_files_base_path, component_arttributes_file.rsplit("/", 1)[1]) + ShellUtils.execute_shell_command(ShellUtils.copy(testbed_file, testbed_file_tmp_path)) + ShellUtils.execute_shell_command( + ShellUtils.copy(component_arttributes_file, component_arttributes_file_tmp_path)) + yield + ShellUtils.execute_shell_command(ShellUtils.remove(setup_files_base_path, recursive=True)) + + +@pytest.fixture(scope="session", autouse=True) +def initialize_arguments(request): + global OPTIONS_DICT + + for option, value in list(request.config.option.__dict__.items()): + OPTIONS_DICT[option] = value + + +@pytest.fixture(scope="session") +def config_parser(initialize_arguments): # pylint: disable=redefined-outer-name,unused-argument + """Initialize the validation config parser. + + :param initialize_arguments: Fixture defined above. + :return: Return the object of the Validation config parser. + :rtype: :class:`nimble.core.configs.validation_config_parser.ValidationConfigParser` + """ + return ValidationConfigParser(OPTIONS_DICT["validationConfig"]) + + +@pytest.fixture(scope="session") +def dump_allure_env_file(config_parser, initialize_node_obj): # pylint: disable=redefined-outer-name,unused-argument + """Dump the basic environment variables for Allure. + + :param config_parser: Fixture defined above. + :param initialize_node_obj: Fixture defined above. + """ + report_dict = ReportUtils.get_generic_attributes(config_parser) + ReportUtils.dump_allure_env_file(report_dict) + + +@pytest.fixture(scope="session") +def user_actions(config_parser, dump_allure_env_file): # pylint: disable=redefined-outer-name,unused-argument + """Initialize the object for user actions. + + :param config_parser: Fixture defined above. + :param dump_allure_env_file: Fixture defined above. + :rtype: :class:`nimble.actions.base.flows.user_actions.UserActions` + """ + return UserActions(config_parser) + + +def pytest_runtest_makereport(item, call): + """ + Sometimes you may have a testing situation which consists of a series of test steps. If one step fails it makes no + sense to execute further steps as they are all expected to fail anyway and their tracebacks add no insight. + This and the next hook implementations work together to abort incremental-marked tests in a class. + + + :param item: Pytest's internal fixture. + :param call: Pytest's internal fixture. + """ + if "incremental" in item.keywords: + if call.excinfo is not None: + parent = item.parent + parent._previousfailed = item # pylint: disable=protected-access + + +def pytest_runtest_setup(item): + """ + Sometimes you may have a testing situation which consists of a series of test steps. 
If one step fails it makes no + sense to execute further steps as they are all expected to fail anyway and their tracebacks add no insight. + This and the next hook implementations work together to abort incremental-marked tests in a class. + + :param item: Pytest's internal fixture. + """ + global PREVIOUS_FAILED, ITEM_LIST + if "incremental" in item.keywords: + for previous_item in ITEM_LIST: + if PREVIOUS_FAILED is None: + PREVIOUS_FAILED = getattr(previous_item.parent, "_previousfailed", None) + if PREVIOUS_FAILED is not None: + pytest.fail("previous test failed (%s)" % PREVIOUS_FAILED.name) + try: + ITEM_LIST.pop(0) + except IndexError: + pass + ITEM_LIST.append(item) diff --git a/tests/fixtures/invalid-k8s.txt b/chaostoolkit_nimble/tests/fixtures/invalid-k8s.txt similarity index 100% rename from tests/fixtures/invalid-k8s.txt rename to chaostoolkit_nimble/tests/fixtures/invalid-k8s.txt diff --git a/chaostoolkit_nimble/tests/sample/__init__.py b/chaostoolkit_nimble/tests/sample/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/chaostoolkit_nimble/tests/sample/test_jio.py b/chaostoolkit_nimble/tests/sample/test_jio.py new file mode 100644 index 0000000..dcddcff --- /dev/null +++ b/chaostoolkit_nimble/tests/sample/test_jio.py @@ -0,0 +1,39 @@ +import pytest +from chaostoolkit_nimble.actions.base.flows import chaos_user_actions +from chaostoolkit_nimble.actions.jio.media_plane_actions import MediaPlaneActions +from nimble.core.entity.components import Components +from nimble.core.entity.node_manager import NodeManager +from nimble.core.utils.shell_utils import ShellUtils + + +@pytest.mark.incremental +class TestJio(): + job_alias = "Media_Plane" + job_user = "ambari-qa" + job_running_component = Components.MANAGEMENT.name + + @pytest.fixture(scope="session") + def media_plane_actions(self, config_parser): + return MediaPlaneActions(self.job_alias, config_parser, job_user=self.job_user) + + @pytest.fixture(scope="session", autouse=True) + def clean_table(self, media_plane_actions): + command = "hive -e 'drop table if exists %s.%s'" % ( + media_plane_actions.database_name, media_plane_actions.table_name) + assert NodeManager.node_obj.execute_command_on_node(media_plane_actions.node_alias, + ShellUtils.su(self.job_user, command)).status + + def test_schedule_15min_job(self, media_plane_actions): + assert media_plane_actions.schedule_15_min_job() + + def test_perform_15min_job_ha(self): + exp_template_file = "process/exp.json" + context = {"rand_dynamic_component": self.job_running_component, + "rand_dynamic_process_name": self.job_alias, + # "rand_dynamic_process_name": "sleep 1h", + } + chaos_user_actions.run_experiment(exp_template_file=exp_template_file, context=context) + # chaos_user_actions.run_experiment(exp_file=OPTIONS_DICT["experimentsPath"]) + + def test_validation_on_15min_job_ha(self, user_actions, media_plane_actions): + user_actions.validate(media_plane_actions.validate_media_plane, self.job_alias) diff --git a/chaostoolkit_nimble/tests/sample/test_jio_spark_job.py b/chaostoolkit_nimble/tests/sample/test_jio_spark_job.py new file mode 100644 index 0000000..a2574cd --- /dev/null +++ b/chaostoolkit_nimble/tests/sample/test_jio_spark_job.py @@ -0,0 +1,57 @@ +import logging + +import pytest + +from chaostoolkit_nimble.actions.base.flows import chaos_user_actions +from chaostoolkit_nimble.actions.jio.media_plane_actions import MediaPlaneActions +from nimble.core.entity.components import Components +from nimble.core.entity.node_manager import NodeManager +from 
nimble.core.utils.shell_utils import ShellUtils + +_LOGGER = logging.getLogger(__name__) + + +@pytest.mark.incremental +class TestJioSparkJob(): + job_alias = "Media_Plane" + job_user = "ambari-qa" + job_running_component = Components.MANAGEMENT.name + + @pytest.fixture(scope="session") + def media_plane_actions(self, config_parser): + return MediaPlaneActions(self.job_alias, config_parser, job_user=self.job_user) + + @pytest.fixture(scope="session") + def clean_table(self, media_plane_actions): + command = "hive -e 'drop table if exists %s.%s'" % ( + media_plane_actions.database_name, media_plane_actions.table_name) + assert NodeManager.node_obj.execute_command_on_node(media_plane_actions.node_alias, + ShellUtils.su(self.job_user, command)).status + + @pytest.fixture(scope="session") + def clean_job_stdout_files(self, media_plane_actions): + command = "cd %s && rm -rf %s" % (media_plane_actions.job_base_directory, media_plane_actions.job_stdout_file) + NodeManager.node_obj.execute_command_on_node(media_plane_actions.node_alias, + ShellUtils.su(self.job_user, command)) + + @pytest.fixture + def schedule_job(self, media_plane_actions, clean_table, clean_job_stdout_files): + assert media_plane_actions.schedule_15_min_job() + + def test_chaos_on_executor_kill(self, schedule_job): + exp_template_file = "automation/chaos/exp_templates/spark/executor_kill_exp.json" + context = {"job_name": self.job_alias} + chaos_user_actions.run_experiment(exp_template_file=exp_template_file, context=context) + + def test_chaos_on_driver(self, schedule_job): + exp_template_file = "automation/chaos/exp_templates/spark/driver_kill_exp.json" + context = {"job_name": self.job_alias} + chaos_user_actions.run_experiment(exp_template_file=exp_template_file, context=context) + + def test_chaos_on_driver_and_executor_kill(self, schedule_job): + exp_template_file = "automation/chaos/exp_templates/spark/driver_and_executor_kill_exp.json" + context = {"job_name": self.job_alias} + chaos_user_actions.run_experiment(exp_template_file=exp_template_file, context=context) + + def test_data_validation_post_chaos(self, user_actions, media_plane_actions): + user_actions.validate(media_plane_actions.validate_media_plane, self.job_alias) diff --git a/chaostoolkit_nimble/tests/sample/test_shell_app_exp.py b/chaostoolkit_nimble/tests/sample/test_shell_app_exp.py new file mode 100644 index 0000000..1f8cbed --- /dev/null +++ b/chaostoolkit_nimble/tests/sample/test_shell_app_exp.py @@ -0,0 +1,83 @@ +import logging + +import pytest +from nimble.core.entity.components import Components +from nimble.core.utils.dynamic_substitution_utils import DynamicSubstitutionUtils +from nimble.core.utils.multiprocessing_utils import MultiprocessingUtils + +from chaostoolkit_nimble.actions.base.flows import chaos_user_actions +from chaostoolkit_nimble.actions.sample import sample_application_actions + +_LOGGER = logging.getLogger(__name__) + + +class TestShellAppExp(): + @pytest.fixture(scope="session") + def multiprocessing_utils(self): + return MultiprocessingUtils(1) + + @pytest.fixture(scope="session") + def launch_application_on_remote(self, multiprocessing_utils): + # NodeManager.node_obj.execute_command_on_node("testautomation-mst-01", "nohup sleep 5m &") + process_list = multiprocessing_utils.run_method_in_parallel_async( + sample_application_actions.launch_application()) + yield + for process in process_list: + process.terminate() + + # def test_application_ha(self, launch_application_on_remote): + def test_application_ha(self): + # 
ha_utils.process_ha(Components.MANAGEMENT.name, "sleep 5m") + ################## Templating + # env = Environment(loader=PackageLoader("chaostoolkit_nimble", "resources/templates/shell_application/")) + # template = env.get_template('shell_app_exp_template.json') + # process_name = "sleep 5m" + # variables = {"component": Components.MANAGEMENT.name, + # "process_name": process_name, + # "expected_process_id": "", + # } + # # print(template.render(variables)) + # json_string = template.render(variables) + # experiment_file_path = "%s/process_experiment.json" % global_constants.DEFAULT_LOCAL_ARTIFACTS_PATH + # response_obj = ShellUtils.execute_shell_command("chaos run %s" % (experiment_file_path)) + + ###############Dynamic Substitution################################################################################# + # experiment_template_file_path = "chaostoolkit_nimble/resources/exp_templates/shell_app/shell_app_exp.json" + # experiment_file_path = "%s/shell_app_exp.json" % global_constants.DEFAULT_LOCAL_ARTIFACTS_PATH + # ShellUtils.execute_shell_command("cp %s %s" % (experiment_template_file_path, experiment_file_path)) + # process_name = "sleep 5m" + # variables = {"rand_dynamic_component": Components.NAMENODE.name, + # "rand_dynamic_process_name": process_name, + # } + # # print(template.render(variables)) + # DynamicSubstitutionUtils.add(variables) + # DynamicSubstitutionUtils.update_file(experiment_file_path) + # response_obj = ShellUtils.execute_shell_command("chaos run %s" % experiment_file_path) + + ######################### User actions & Dynamic Substitution ########################################################### + experiments_template_path = "chaostoolkit_nimble/resources/exp_templates/shell_app/shell_app_exp.json" + variables = {"rand_dynamic_component": Components.NAMENODE.name, + "rand_dynamic_process_name": "sleep 5m", + } + DynamicSubstitutionUtils.add(variables) + chaos_user_actions.run_experiment(experiments_template_path) + +########################################################################################################## +# env = Environment(loader=PackageLoader("chaostoolkit_nimble", "resources/templates/shell_application")) +# template = env.get_template('process_experiment_template.json') +# variables = {"remote_command_var": "echo Hi", +# "remote_ip_var": "192.168.135.59", +# "remote_username_var": "root", +# "remote_password_var": "guavus@123", +# "remote_command_timeout_var": 120, +# "remote_connection_timeout_var": 120, +# "expected_remote_command_output": "Hi", +# "local_command_var": "ls", +# "local_command_arguments": "/tmp" +# } +# +# print(template.render(variables)) +# + + +# TestChaosOnApplocation().test_application_ha() diff --git a/tests/test_actions.py b/chaostoolkit_nimble/tests/test_actions.py similarity index 98% rename from tests/test_actions.py rename to chaostoolkit_nimble/tests/test_actions.py index 75e94e0..2b2574b 100644 --- a/tests/test_actions.py +++ b/chaostoolkit_nimble/tests/test_actions.py @@ -5,8 +5,8 @@ from chaoslib.exceptions import ActivityFailed from kubernetes.client.rest import ApiException -from chaosk8s.actions import start_microservice, kill_microservice -from chaosk8s.node.actions import cordon_node, create_node, delete_nodes, \ +from chaostoolkit_nimble.core.extensions import start_microservice, kill_microservice +from chaostoolkit_nimble.core.extensions.chaosk8s.node import cordon_node, create_node, delete_nodes, \ uncordon_node, drain_nodes diff --git a/tests/test_client.py 
b/chaostoolkit_nimble/tests/test_client.py similarity index 96% rename from tests/test_client.py rename to chaostoolkit_nimble/tests/test_client.py index 0ad8d4a..60b2f2b 100644 --- a/tests/test_client.py +++ b/chaostoolkit_nimble/tests/test_client.py @@ -2,7 +2,7 @@ import os from unittest.mock import MagicMock, patch -from chaosk8s import create_k8s_api_client +from chaostoolkit_nimble.core.extensions import create_k8s_api_client @patch('chaosk8s.has_local_config_file', autospec=True) diff --git a/tests/test_discovery.py b/chaostoolkit_nimble/tests/test_discovery.py similarity index 80% rename from tests/test_discovery.py rename to chaostoolkit_nimble/tests/test_discovery.py index 7d8006b..f3a9d8b 100644 --- a/tests/test_discovery.py +++ b/chaostoolkit_nimble/tests/test_discovery.py @@ -1,6 +1,6 @@ # -*- coding: utf-8 -*- -from chaosk8s import __version__, discover +from chaostoolkit_nimble.core.extensions import __version__, discover def test_discover_extension_capabilities(): diff --git a/tests/test_pod.py b/chaostoolkit_nimble/tests/test_pod.py similarity index 98% rename from tests/test_pod.py rename to chaostoolkit_nimble/tests/test_pod.py index a4507b3..646ff48 100644 --- a/tests/test_pod.py +++ b/chaostoolkit_nimble/tests/test_pod.py @@ -1,13 +1,11 @@ # -*- coding: utf-8 -*- from unittest.mock import MagicMock, patch, ANY, call -from kubernetes import client - import pytest from chaoslib.exceptions import ActivityFailed -from chaosk8s.pod.actions import terminate_pods -from chaosk8s.pod.probes import pods_in_phase, pods_not_in_phase, pods_in_conditions +from chaostoolkit_nimble.core.extensions.chaosk8s.pod import terminate_pods +from chaostoolkit_nimble.core.extensions.chaosk8s.pod import pods_in_phase, pods_not_in_phase, pods_in_conditions @patch('chaosk8s.has_local_config_file', autospec=True) diff --git a/tests/test_probes.py b/chaostoolkit_nimble/tests/test_probes.py similarity index 98% rename from tests/test_probes.py rename to chaostoolkit_nimble/tests/test_probes.py index 74887cc..9d2ce78 100644 --- a/tests/test_probes.py +++ b/chaostoolkit_nimble/tests/test_probes.py @@ -7,8 +7,8 @@ import urllib3 from chaoslib.exceptions import ActivityFailed -from chaosk8s.node.probes import get_nodes -from chaosk8s.probes import all_microservices_healthy, \ +from chaostoolkit_nimble.core.extensions.chaosk8s.node.probes import get_nodes +from chaostoolkit_nimble.core.extensions.chaosk8s.probes import all_microservices_healthy, \ microservice_available_and_healthy, microservice_is_not_available, \ service_endpoint_is_initialized, deployment_is_not_fully_available, \ deployment_is_fully_available, read_microservices_logs diff --git a/journal.json b/journal.json new file mode 100644 index 0000000..ef9c1fc --- /dev/null +++ b/journal.json @@ -0,0 +1,143 @@ +{ + "chaoslib-version": "1.5.0", + "platform": "Darwin-18.0.0-x86_64-i386-64bit", + "node": "kritika-Saxena.local", + "experiment": { + "version": "1.0.0", + "title": "Experiment with killing a process on system", + "description": "The process should auto-respawn after being killed.", + "tags": [ + "process" + ], + "controls": [ + { + "name": "process-related-controls", + "provider": { + "type": "python", + "module": "chaostoolkit_nimble.controllers.process.control" + } + } + ], + "steady-state-hypothesis": { + "title": "Process is up and running", + "probes": [ + { + "type": "probe", + "name": "Fetch-process-id", + "tolerance": true, + "provider": { + "module": "chaostoolkit_nimble.core.utils.ha_utils", + "type": "python", + "func": 
"check_process_running", + "arguments": { + "component": "MANAGEMENT", + "process_name": "media_plane_microapp1" + } + } + } + ] + }, + "method": [ + { + "type": "action", + "name": "Kill-process", + "provider": { + "module": "chaostoolkit_nimble.core.utils.ha_utils", + "type": "python", + "func": "kill_process", + "arguments": { + "component": "MANAGEMENT", + "process_name": "media_plane_microapp1", + "num_of_nodes": "1" + } + } + } + ], + "dry": false + }, + "start": "2019-08-05T13:42:10.450632", + "status": "completed", + "deviated": false, + "steady_states": { + "before": { + "steady_state_met": true, + "probes": [ + { + "activity": { + "type": "probe", + "name": "Fetch-process-id", + "tolerance": true, + "provider": { + "module": "chaostoolkit_nimble.core.utils.ha_utils", + "type": "python", + "func": "check_process_running", + "arguments": { + "component": "MANAGEMENT", + "process_name": "media_plane_microapp1" + } + } + }, + "output": true, + "status": "succeeded", + "start": "2019-08-05T13:42:10.452512", + "end": "2019-08-05T13:42:10.742304", + "duration": 0.289792, + "tolerance_met": true + } + ] + }, + "after": { + "steady_state_met": true, + "probes": [ + { + "activity": { + "type": "probe", + "name": "Fetch-process-id", + "tolerance": true, + "provider": { + "module": "chaostoolkit_nimble.core.utils.ha_utils", + "type": "python", + "func": "check_process_running", + "arguments": { + "component": "MANAGEMENT", + "process_name": "media_plane_microapp1" + } + } + }, + "output": true, + "status": "succeeded", + "start": "2019-08-05T13:44:13.856913", + "end": "2019-08-05T13:44:13.991240", + "duration": 0.134327, + "tolerance_met": true + } + ] + } + }, + "run": [ + { + "activity": { + "type": "action", + "name": "Kill-process", + "provider": { + "module": "chaostoolkit_nimble.core.utils.ha_utils", + "type": "python", + "func": "kill_process", + "arguments": { + "component": "MANAGEMENT", + "process_name": "media_plane_microapp1", + "num_of_nodes": "1" + } + } + }, + "output": "[{'stdout': 'kill -9 6112 6113', 'stderr': '', 'status_code': 0, 'node': '192.168.134.170', 'linux_user': None, 'kerberos_client_user': None, 'comments': None, 'object_diff_response': {'intersection_count': None, 'intersection': None, 'union_count': None, 'union': None, 'lhs_diff_count': None, 'lhs_diff': None, 'rhs_diff_count': None, 'rhs_diff': None, 'lhs_extra_count': None, 'lhs_extra': None, 'rhs_extra_count': None, 'rhs_extra': None, 'mismatches_count': None, 'mismatches': None, 'column_diff_count': None, 'column_diff': None, 'actual_file': None, 'expected_file': None}, 'healthcheck_response': {'test_case_id': None, 'test_description': None, 'failure_reason': [], 'resolution': []}, 'container_id': None}]", + "status": "succeeded", + "start": "2019-08-05T13:42:13.751279", + "end": "2019-08-05T13:42:13.853999", + "duration": 0.10272 + } + ], + "rollbacks": [], + "end": "2019-08-05T13:44:13.992637", + "duration": 123.55752611160278 +} \ No newline at end of file diff --git a/logging.cfg b/logging.cfg new file mode 100644 index 0000000..cbcb624 --- /dev/null +++ b/logging.cfg @@ -0,0 +1,28 @@ +[loggers] +keys=root + +[handlers] +keys=consoleHandler,fileHandler + +[formatters] +keys=simpleFormatter + +[logger_root] +level=INFO +handlers=fileHandler + +[handler_consoleHandler] +class=StreamHandler +level=DEBUG +formatter=simpleFormatter +args=(sys.stdout,) + +[handler_fileHandler] +class=FileHandler +level=DEBUG +formatter=simpleFormatter +args=('target/artifacts/execution.log', 'a+') + 
+[formatter_simpleFormatter] +format=%(asctime)s - [%(name)s::%(funcName)s()::%(lineno)s] - %(processName)s - %(levelname)s - %(message)s +datefmt= diff --git a/notes b/notes new file mode 100644 index 0000000..3c5401d --- /dev/null +++ b/notes @@ -0,0 +1,13 @@ + +-----------------------------------------------------------------------Nimble fixes/enhancements for chaos: +/Users/kritika.saxena/KR_VIRENV_CHAOS_NIMBLE_PY3_DEP/venv/lib/python3.7/site-packages/nimble/core/utils/shell_utils.py /us/bin/cp ---> cp + +return "%s | xargs --verbose -r %s" % (ShellUtils.fetch_process_id(process_name), ShellUtils.kill_process_by_id("")) + +execute_command_on node/ execute_remote_command / run_remote_command --> pty property add + +@staticmethod +def find_and_replace_whole_line_in_file(find, replace, input_file): + find = find.replace("/", "\\/") + replace = replace.replace("/", "\\/") + return "sed -i '/%s/c\%s' %s" % (find, replace, input_file) \ No newline at end of file diff --git a/pytest.ini b/pytest.ini index c34a995..a5e2dbb 100644 --- a/pytest.ini +++ b/pytest.ini @@ -1,3 +1,5 @@ [pytest] -norecursedirs=dist build htmlcov docs .eggs -addopts=-v -rxs --junitxml=junit-test-results.xml --cov=chaosk8s --cov-report term-missing:skip-covered --cov-report xml +norecursedirs=dist build htmlcov docs .eggs core target resources actions locators +#addopts=-v -rxs --junitxml=junit-test-results.xml --cov=chaosk8s --cov-report term-missing:skip-covered --cov-report xml +addopts = -s -v --html=report.html --self-contained-html --html=target/artifacts/report.html --alluredir=target/artifacts/allure/ +python_files = *.py diff --git a/requirements.txt b/requirements.txt index 1e6dc39..9cfbe33 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,5 +1,10 @@ +nimble>=4.0.0 dateparser kubernetes logzero -chaostoolkit-lib>=0.20.0 +chaostoolkit-lib pyyaml +pytest-html +chaostoolkit +chaostoolkit-reporting +twine \ No newline at end of file diff --git a/setup.py b/setup.py index 031f31c..2f71647 100644 --- a/setup.py +++ b/setup.py @@ -1,88 +1,17 @@ -#!/usr/bin/env python -"""chaostoolkit builder and installer""" -import os -import sys -import io - -import setuptools - -def get_version_from_package() -> str: - """ - Read the package version from the source without importing it. 
- """ - path = os.path.join(os.path.dirname(__file__), "chaosk8s/__init__.py") - path = os.path.normpath(os.path.abspath(path)) - with open(path) as f: - for line in f: - if line.startswith("__version__"): - token, version = line.split(" = ", 1) - version = version.replace("'", "").strip() - return version - -name = 'chaostoolkit-kubernetes' -desc = 'Chaos Toolkit Kubernetes support' - -with io.open('README.md', encoding='utf-8') as strm: - long_desc = strm.read() - -classifiers = [ - 'Development Status :: 5 - Production/Stable', - 'Intended Audience :: Developers', - 'License :: Freely Distributable', - 'Operating System :: OS Independent', - 'License :: OSI Approved :: Apache Software License', - 'Programming Language :: Python', - 'Programming Language :: Python :: 3', - 'Programming Language :: Python :: 3.5', - 'Programming Language :: Python :: 3.6', - 'Programming Language :: Python :: 3.7', - 'Programming Language :: Python :: Implementation', - 'Programming Language :: Python :: Implementation :: CPython' -] -author = 'chaostoolkit Team' -author_email = 'contact@chaostoolkit.org' -url = 'http://chaostoolkit.org' -license = 'Apache License Version 2.0' -packages = [ - 'chaosk8s', - 'chaosk8s.node', - 'chaosk8s.pod' -] - -needs_pytest = set(['pytest', 'test']).intersection(sys.argv) -pytest_runner = ['pytest_runner'] if needs_pytest else [] -test_require = [] -with io.open('requirements-dev.txt') as f: - test_require = [l.strip() for l in f if not l.startswith('#')] - -install_require = [] -with io.open('requirements.txt') as f: - install_require = [l.strip() for l in f if not l.startswith('#')] - - -setup_params = dict( - name=name, - version=get_version_from_package(), - description=desc, - long_description=long_desc, - classifiers=classifiers, - author=author, - author_email=author_email, - url=url, - license=license, - packages=packages, - include_package_data=True, - install_requires=install_require, - tests_require=test_require, - setup_requires=pytest_runner, - python_requires='>=3.5.*' -) - - -def main(): - """Package installation entry point.""" - setuptools.setup(**setup_params) - - -if __name__ == '__main__': - main() +from setuptools import setup, find_packages + +if __name__ == "__main__": + with open("requirements.txt") as f: + requirements = list(filter(lambda x: not x.startswith("pytest"), f.read().splitlines())) + + setup( + name="chaostoolkit-nimble", + packages=find_packages(), + description="Guavus Chaos Test automation framework", + version="0.0.2", + install_requires=requirements, + url="https://github.com/kritika-saxena-guavus/chaos_eng_automation", + author="Core Automation Squad", + author_email="automation-squad@guavus.com", + include_package_data=True + )