The tsbs_load
executable can benchmark data ingestion
for all the implemented databases.
tsbs_load
uses YAML files to specify the configuration for
running the load benchmark.
The config file is separated in two top-level sections:
data-source:****
...
loader:
...
data-source
contains the configuration for where to read the data from (type: SIMULATOR
ortype: FILE
)- For
SIMULATOR
the configuration specifies the time range to be simulated, the use-case, scale and other properties that regard the data - For
FILE
the configuration only specifies the location of the pre-generated file withtsbs_generate_data
- For
loader
contains the configuration for the loading the data. Two sub-sections are important heredb-specific
andrunner
- The
db-specific
configuration varies depending of the target database and for TimescaleDB contains information about user, password, ssl mode, while for influx it contains information about backoff interval, replication factor etc. - The
runner
configuration specifies the number of concurrent workers to use, batch size, hashing and so on
- The
To generate an example configuration file for a specific database run
$ load config --target=<db-name> --data-source=[FILE|SIMULATOR]
specifying db-name to one of the implemented databases and data-source to FILE or SIMULATOR
The generated config file is saved in ./config.yaml
You can find sample YAML configuration files for TimescaleDB in the
sample-configs directory. Both single and multi-node examples are provided
for FILE
and SIMULATOR
modes.
When you run tsbs_generate_data
a simulator is created for
the selected use case and the simulated data points are serialized
to a file. tsbs_load
utilizes the same simulators but the
simulated points are directly piped to the worker clients that send batches
of data to the databases.
You can notice that the same properties you configure in the YAML file
are the same flags that you need to specify when running tsbs_generate_data
.
You can run tsbs_load
with
$ load load <db_name> --config=./path-to-config.yaml
Where <db_name>
is one of the implemented databases or you can run
$ load load --help
for a list of the available databases.
The generated yaml file with tsbs_load config
does not contain
information about what each of the properties represents. You can easily discover
more details about each property by running:
$ load load --help
This will list all the available flags configurable for all databases. These flags
include the flags for data-source
and loader.runner
. The --loader.runner.db-name
flag
corresponds to the property:
loader:
runner:
db-name: some-db
in the YAML config file. With the type, description, and default value next to the flag name as :
string, Name of database (default "benchmark")
Some of the properties are only valid for specific databases. These
properties go under the loader.db-specific
section. To view information
about them you can run:
$ load load <db_name> --help
For example for timescaledb, you can see the following:
$ load load timescaledb --help
...
--loader.db-specific.chunk-time
duration
Duration that each chunk should represent, e.g., 12h (default 12h0m0s)
...
- Each property has a default value, used if not otherwise overridden
- An entry in the config YAML file overrides the default value
- A flag passed at runtime overrides an entry in the YAML file