Regression benchmarks setup (hasura#3310)

* Benchmark GraphQL queries using wrk
* fix console assets dir
* Store wrk parameters as well
* Add details about storing results in Readme
* Remove files in bench-wrk while computing server shasum
* Instead of just getting maximum throughput per query per version, create plots using wrk2 for a given set of requests per second. The maximum throughput is used to see what values of requests per second are feasible.
* Add id for version dropdown
* Allow specifying env and args for GraphQL Engine

  1) Arguments defined after -- will be applied as arguments to Hasura GraphQL Engine
  2) Script will also pass the environmental variables to Hasura GraphQL Engine instances

  Hasura GraphQL engine can be run with the given environmental variables and arguments as follows

      $ export HASURA_GRAPHQL_...=....
      $ python3 hge_wrk_bench.py -- --hge_arg1 val1 --hge_arg2 val2 ...

* Use matplotlib instead of plotly for figures
* Show throughput graph also. It may be useful in checking performance regression across versions
* Support storing results in s3

  Use --upload-root-uri 's3://bucket/path' to upload results inside the given path. When specified, the results will be uploaded to the bucket, including latencies, latency histogram, and the test setup info. The s3 credentials should be provided as given in AWS boto3 documentation.

* Allow specifying a name for the test scenario
* Fix open latency uri bug
* Update wrk docker image
* Keep ylim a little higher than maximum so that the throughput plot is clearly visible
* Show throughput plots for multiple queries at the same time
* 1) Adjust size of dropdowns 2) Make label for requests/sec invisible when plot type is throughput
* 1) Adding boto3 to requirements.txt 2) Removing CPU Key print line 3) Adding info about the tests that will be run with wrk2
* Docker builder for wrk-websocket-server
* Make it optional to setup remote graphql-engine
* Listen on all interfaces and enable ping thread
* Add bench_scripts to wrk-websocket-server docker
* Use 127.0.0.1 instead of 'localhost' to address local hge

  For some reason it seems wrk was hanging trying to resolve 'localhost'. ping was able to resolve it fine from the same container, so I'm not sure what the deal was. Probably some local misconfiguration on my machine, but maybe this change will also help others.

* Store latency samples in subdirectory, server_shasum just once at start, additional docs
* Add a note on running the benchmarks in the simplest way
* Add a new section on how to run benchmarks on a new linux hosted instance

Co-authored-by: Nizar Malangadan <[email protected]>
Co-authored-by: Brandon Simmons <[email protected]>
Co-authored-by: Karthikeyan Chinnakonda <[email protected]>
Co-authored-by: Brandon Simmons <[email protected]>
Co-authored-by: Vamshi Surabhi <[email protected]>
tirumaraiselvan · Jun 19, 2020 · f8a7312 · f8a7312
1 parent 39688af
commit f8a7312
Showing 36 changed files with 3,656 additions and 0 deletions.
diff --git a/server/bench-wrk/.gitignore b/server/bench-wrk/.gitignore
@@ -0,0 +1,5 @@
+__pycache__
+test_output
+.previous_work_dir
+.#*
+venv
diff --git a/server/bench-wrk/.python-version b/server/bench-wrk/.python-version
@@ -0,0 +1 @@
+3.7.6
diff --git a/server/bench-wrk/Readme.md b/server/bench-wrk/Readme.md
@@ -0,0 +1,139 @@
+## Benchmarking Hasura GraphQL Engine ##
+
+The script `hge_wrk_bench.py` helps in benchmarking the given version of Hasura
+GraphQL Engine using a set of GraphQL queries. The results are stored (into the
+*results GraphQL engine*) along with details like the version of GraphQL engine
+against which the benchmark is run, the version of Postgres database etc. The
+stored results can help in comparing benchmarks of different versions of
+GraphQL engine.
+
+### Setup ###
+
+The setup includes two Postgres databases with
+[sportsdb](https://www.thesportsdb.com/) schema and data, and two GraphQL
+engines running on the Postgres databases. Then one of the GraphQL engines is
+added as a remote schema to another GraphQL engine.
+
+The data is the same in both databases. The tables reside in different
+database schemas in order to avoid GraphQL schema conflicts.
+
+The methods in the script `sportsdb_setup.py` help in setting up the databases,
+starting the Hasura GraphQL engines, and setting up relationships. This script
+can either take urls of already running Postgres databases as input, or it can
+start the databases as Docker instances. The GraphQL engines can be run either
+with `cabal run` or as Docker containers.
+
+### Run benchmark ###
+- Install Python 3.7.6 using pyenv
+```sh
+$ pyenv install 3.7.6
+```
+- Install dependencies for the Python script in a virtual environment.
+```sh
+$ python3 -m venv venv
+$ source venv/bin/activate
+$ pip3 install -r requirements.txt
+```
+- To run benchmarks, do
+```sh
+$ python3 hge_wrk_bench.py
+```
+This script uses [wrk](https://github.com/wg/wrk) to benchmark Hasura GraphQL
+Engine against a list of queries defined in `queries.graphql`. The results are
+then stored through a results Hasura GraphQL Engine.
+
+You can configure the build and runtime parameters for the graphql-engines
+under test by modifying your local `cabal.project.local` file.
+
+### Interpreting the plots
+
+For each query under test we first run `wrk` to try to determine the maximum
+throughput we can sustain for that query. This result is plotted under the `max
+throughput` graph. This can be considered the point after which graphql-engine
+will start to fall over.
+
+Then for each query we measure latency under several different loads (but
+making sure not to approach max throughput) using `wrk2` which measures latency
+in a principled way. Latency can be viewed as a continuous histogram or as a
+violin plot that also plots each latency sample. The latter provides the most
+visual information and can be useful for observing clustering or other
+patterns, or validating the benchmark run.
+
+### Cleaning up test runs
+
+Data will be stored locally in the work directory (`test_output` by default).
+This entire directory can be deleted safely.
+
+If you are using the default results graphql-engine and want to just remove old
+benchmark runs but avoid rebuilding the sportsdb data, you can do:
+
+```
+$ sudo rm -r test_output/{benchmark_runs,sportsdb_data}
+```
+
+### Arguments ###
+- For the list of arguments supported, do
+```sh
+$ python3 hge_wrk_bench.py --help
+```
+
+#### Postgres ####
+  - In order to use already running Postgres databases, use argument `--pg-urls PG_URL,REMOTE_PG_URL`, or environmental variable `export HASURA_BENCH_PG_URLS=PG_URL,REMOTE_PG_URL`
+  - Set the docker image using argument `--pg-docker-image DOCKER_IMAGE`, or environmental variable `HASURA_BENCH_PG_DOCKER_IMAGE`
+
+#### GraphQL Engine ####
+  - In order to run as a docker container, use argument `--hge-docker-image DOCKER_IMAGE`, or environmental variable `HASURA_BENCH_HGE_DOCKER_IMAGE`
+  - To skip stack build, use argument `--skip-stack-build`
+
+#### wrk ####
+  - Number of open connections can be set using argument `--connections CONNECTIONS`, or environmental variable `HASURA_BENCH_CONNECTIONS`
+  - Duration of tests can be controlled using argument `--duration DURATION`, or environmental variable `HASURA_BENCH_DURATION`
+  - To skip showing the plots at the end of benchmarks, use argument `--skip-plots`
+  - The Hasura GraphQL Engine to which results should be pushed can be specified using argument
+    `--results-hge-url HGE_URL`, or environmental variable `HASURA_BENCH_RESULTS_HGE_URL`. By
+    default the launched (non-"remote") graphql-engine will be used, and its data stored in
+    `test_output/sportsdb_data`. The admin secret for this GraphQL engine can be specified
+    using environmental variable `HASURA_BENCH_RESULTS_HGE_ADMIN_SECRET`.
+
+### Work directory ###
+- The files used by Postgres docker containers, logs of Hasura GraphQL engines run with `cabal run`, and other benchmark artifacts are stored in the work directory.
+- Storing data volumes of Postgres docker containers in the work directory (`test_output` by default) helps in avoiding database setup time for benchmarks after the first time setup.
+- The logs of Hasura GraphQL engines (when they are run using `cabal run`) are stored in files *hge.log* and *remote\_hge.log*
+
+### Default settings ###
+- Postgres databases will be run as docker containers
+- Hasura GraphQL Engines by default will be run using `cabal run`
+- With wrk
+  - Number of threads used by *wrk* will be number of CPUs
+  - Number of connections = 50
+  - Test duration = 5 minutes (300 sec)
+- By default the results are stored in the Hasura GraphQL Engine used for benchmarking.
+
+### Storing results ###
+- The results are stored in schema `hge_bench`.
+- For schema, see file `results_schema.yaml`
+- The main table is `hge_bench.results`. This table stores the following details
+  - *cpu_key*: This is a foreign key reference to *cpu_info(key)*. The table *cpu_info* captures the various parameters of the CPU on which the benchmark was run, including the model and number of vCPUs
+  - *query_name*: This is a foreign key reference to *gql_query(name)*. The table *gql_query* stores the name of the query and the query itself used for tests.
+  - *docker_image*: Stores the docker images of Hasura GraphQL Engine when the HGE is run as docker
+  - *server_shasum*, *version*: These are stored when HGE is run with `cabal run`. Version stores the version generated by script *gen-version.sh*. The *server_shasum* stores the shasum of the files in the server folder (excluding tests folder). This shasum shows whether the server code has actually varied between the commits.
+  - *postgres_version* : Stores the version of Postgres
+  - *latency*, *requests_per_sec*: Stores the benchmark latency and requests\_per\_sec results
+  - *wrk_parameters*: Stores the parameters used by wrk during benchmarking, including number of threads, total number of open connections, and duration of tests
+
+### The simplest way to setup the benchmark  ###
+- Note: This method currently only works on Linux instances
+- Run the benchmarks on a docker image using
+```
+python3 hge_wrk_bench.py --hge-docker-image DOCKER_IMAGE
+```
+- The command will prompt for a ``WORK_DIR`` which will store all the results, volumes and databases.
+- To compare the results with another docker build, run the same command again with the modified ``DOCKER_IMAGE`` and the same ``WORK_DIR``
+- If the catalog versions of the two docker builds are not the same, run the benchmarks first on the docker image with a lower
+  catalog version and then run the benchmarks on the docker image with the higher catalog version.
+
+### Steps to run benchmarks on a new linux hosted instance ###
+- Install docker and python3
+- Optional: install ghcup (cabal and ghc will be installed with it). You need cabal set up only when
+  you want to run the benchmarks on a branch directly (i.e. there's no docker image for it).
+- Run the benchmarks following the steps in ``The simplest way to setup the benchmark``
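Note on the Readme above: its Arguments section pairs most flags with an environment variable, and the commit message says that anything after `--` is handed to graphql-engine. The following is a minimal sketch of how such a scheme can be resolved with `argparse`. It is not the code in `hge_wrk_bench.py`; the helper name `parse_bench_args` and the `env` parameter are hypothetical, and only flag and variable names that appear in the Readme are used.

```python
import argparse
import os


def parse_bench_args(argv, env=None):
    """Hypothetical helper: CLI flags win, environment variables are the fallback."""
    env = os.environ if env is None else env

    # Everything after a literal "--" is treated as graphql-engine arguments.
    if "--" in argv:
        split = argv.index("--")
        bench_argv, hge_args = argv[:split], argv[split + 1:]
    else:
        bench_argv, hge_args = argv, []

    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--connections", type=int,
        # 50 is the default number of connections stated in the Readme.
        default=int(env.get("HASURA_BENCH_CONNECTIONS", 50)))
    parser.add_argument(
        "--hge-docker-image",
        default=env.get("HASURA_BENCH_HGE_DOCKER_IMAGE"))
    parser.add_argument(
        "--pg-urls",
        default=env.get("HASURA_BENCH_PG_URLS"))
    opts = parser.parse_args(bench_argv)

    # --pg-urls takes the form PG_URL,REMOTE_PG_URL.
    pg_urls = opts.pg_urls.split(",") if opts.pg_urls else []
    return opts, pg_urls, hge_args
```

Passing `env` explicitly keeps the sketch easy to exercise without touching the real process environment.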
diff --git a/server/bench-wrk/gen-version.sh b/server/bench-wrk/gen-version.sh
@@ -0,0 +1,46 @@
+#!/usr/bin/env sh
+set -e
+
+get_changes_hash()
+{
+    # We use this to determine if there are any new additions
+    local GIT_STATUS="$(git status --porcelain)"
+    # To determine if there are any changes
+    local GIT_DIFF_INDEX="$(git diff-index -p HEAD --)"
+    # Whether anything changed in the repo
+    export GIT_DIRTY="$GIT_STATUS$GIT_DIFF_INDEX"
+    if [ -n "$GIT_DIRTY" ]; then
+        DIRTY_HASH_SHORT="$(echo $GIT_DIRTY | sha256sum | awk '{print $1}' | tail -c 9)"
+        echo -dirty-$DIRTY_HASH_SHORT
+    else
+        echo ''
+    fi
+}
+
+get_main_version()
+{
+    # Get the branch name
+    local GIT_BRANCH="$(git rev-parse --abbrev-ref HEAD)"
+    # Get the current commit id
+    local COMMIT_HASH_SHORT="$(git rev-parse --short HEAD)"
+
+    case "$GIT_BRANCH" in
+
+        # The master branch
+        'master')
+            echo $COMMIT_HASH_SHORT;;
+
+        # The release branches
+        release-*)
+            local RELEASE_VER="$(git describe --match "v[0-9]*" HEAD 2>/dev/null)"
+            test -n "$RELEASE_VER" ||
+                RELEASE_VER="$(expr "$GIT_BRANCH" : release-*'\(.*\)')"-$COMMIT_HASH_SHORT
+            echo $RELEASE_VER;;
+
+        # Everything else
+        *)
+            echo $GIT_BRANCH-$COMMIT_HASH_SHORT;;
+    esac
+}
+
+echo "$(get_main_version)$(get_changes_hash)"
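Note on `gen-version.sh` above: the naming rules it applies can be restated as pure functions, which makes them easier to reason about. The sketch below is an illustration in Python, not part of the commit; the function names are hypothetical, and the git output is passed in as plain strings instead of being read from a repository. The dirty-hash normalisation only approximates what the unquoted `echo $GIT_DIRTY` does (it collapses whitespace but ignores shell glob expansion).

```python
import hashlib
import re


def changes_suffix(dirty_text):
    """Mirror get_changes_hash: empty for a clean tree, else -dirty-<8 hex chars>."""
    if not dirty_text:
        return ""
    # Unquoted `echo $VAR` joins the words with single spaces and adds a newline.
    normalized = " ".join(dirty_text.split()) + "\n"
    digest = hashlib.sha256(normalized.encode()).hexdigest()
    # `tail -c 9` keeps the last 8 hex characters plus the trailing newline.
    return "-dirty-" + digest[-8:]


def main_version(branch, commit_short, describe=""):
    """Mirror get_main_version for the three branch cases."""
    if branch == "master":
        return commit_short
    if branch.startswith("release-"):
        if describe:
            # Output of `git describe --match "v[0-9]*" HEAD`, when a tag exists.
            return describe
        # Same pattern as the `expr` call: strip "release" and any dashes after it.
        suffix = re.match(r"release-*(.*)", branch).group(1)
        return f"{suffix}-{commit_short}"
    return f"{branch}-{commit_short}"
```

The full version string is then `main_version(...) + changes_suffix(...)`, matching the final `echo` in the script.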
diff --git a/server/bench-wrk/get-server-sha.sh b/server/bench-wrk/get-server-sha.sh
@@ -0,0 +1,6 @@
+#!/bin/bash
+# We are generating shasum of files in the server directory (excluding packaging and test files).
+# This is to track whether server directory has changed or not
+dir=$(dirname "$0")
+cd "$dir/../.."
+git ls-files -- ':!tests-py' . ':!packaging' . ':!.*' . ':!bench-wrk' | sort | xargs cat | shasum | awk '{print $1}' | tail -c 9
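Note on `get-server-sha.sh` above: the pipeline concatenates the tracked files in sorted order, hashes the result with `shasum` (SHA-1 by default), and keeps the last 8 hex characters. A rough Python equivalent of the hashing step is sketched below under some assumptions: the function name `files_shasum` is hypothetical, the file list is supplied by the caller instead of coming from `git ls-files`, and Python's `sorted` may order names differently from `sort` under some locales.

```python
import hashlib
from pathlib import Path


def files_shasum(paths):
    """Hash the concatenated contents of the files, returning the last 8 hex chars."""
    digest = hashlib.sha1()
    for path in sorted(paths):
        # Feeding each file in turn is equivalent to hashing `cat file1 file2 ...`.
        digest.update(Path(path).read_bytes())
    return digest.hexdigest()[-8:]
```

Two checkouts yield the same value only if every listed file has identical contents, which is how the Readme describes using *server_shasum* to tell whether the server code actually changed between commits.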