
# Noria micro-benchmark: vote

This benchmark measures the performance of a number of database-like backends on various forms of the following SQL query:

```sql
SELECT a.id, a.title, vc.votes
FROM a JOIN (
  SELECT id, COUNT(user) AS votes FROM Vote GROUP BY id
) AS vc ON (a.id = vc.id)
```

It is the same benchmark that was used in §8.2 of the OSDI'18 Noria paper.

The benchmark implements the above query in MySQL with pre-computed vote counts (`vote mysql`), the same with a memcached cache (`vote hybrid`), Microsoft SQL Server with materialized views (`vote mssql`), a memcached-only deployment (`vote memcached`), Noria with an embedded server (`vote localsoup`), and Noria with a remote server (`vote netsoup`). If you are just benchmarking Noria, you will likely want the `localsoup` backend.
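Conceptually, Noria serves this query from an incrementally maintained materialized view: each write bumps a per-article counter, and each read simply looks the counter up instead of re-running the `GROUP BY` and join. A minimal Python sketch of that idea (the class and method names are illustrative, not Noria's API):

```python
from collections import defaultdict

# Toy stand-in for the materialized view Noria keeps for this query:
# per-article vote counts, updated on every write, read directly.
class VoteView:
    def __init__(self):
        self.articles = {}              # id -> title
        self.votes = defaultdict(int)   # id -> COUNT(user)

    def add_article(self, id, title):
        self.articles[id] = title

    def vote(self, id, user):
        # Incremental update: bump the count rather than re-running the aggregate.
        self.votes[id] += 1

    def read(self, id):
        # Equivalent to one row of the SELECT ... JOIN above.
        return (id, self.articles[id], self.votes[id])

view = VoteView()
view.add_article(1, "Hello Noria")
view.vote(1, "alice")
view.vote(1, "bob")
print(view.read(1))  # (1, 'Hello Noria', 2)
```

This is why the read path can be so cheap: the expensive aggregation work is amortized across writes instead of being paid on every read.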

The vote binary takes two sets of arguments, one to configure the workload and one to configure the chosen backend. For example, to run localsoup:

```console
$ cargo b --release --bin vote
$ target/release/vote \
    --target 100000 --write-every 20 --threads 16 \
    localsoup \
    --shards 2
```

Here, we're running vote with a target load of 100k requests per second, where one in every 20 requests is a write (so a 95/5 read/write mix). We run 16 load generation threads, which should be plenty (NOTE: this argument will go away soon; it's annoying). We choose the `localsoup` backend, and configure it to run with 2 shards (to increase write throughput). See `vote --help` and `vote <backend> --help` for the other arguments and backends that are available.
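The read/write split follows directly from `--write-every`: one write in every N requests. As plain arithmetic (this is not vote's actual option parsing, just the calculation):

```python
def mix(write_every):
    """Fraction of writes and reads when 1 in every `write_every` requests is a write."""
    writes = 1 / write_every
    return writes, 1 - writes

w, r = mix(20)
print(f"{w:.0%} writes / {r:.0%} reads")  # 5% writes / 95% reads
```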

When vote finishes, it prints the latencies it observed during execution:

```text
# generated ops/s: 16673.66
# actual ops/s: 16683.94
# op   pct  sojourn  remote
write  50   1903     1271    (all µs)
read   50   847      311     (all µs)
write  95   4623     4039    (all µs)
read   95   1303     383     (all µs)
write  99   6535     5967    (all µs)
read   99   1879     1391    (all µs)
write  100  25343    24327   (all µs)
read   100  13567    12551   (all µs)
```

The "generated ops/s" value indicates how many operations were enqueued for processing per second, on average, throughout the experiment; it should be close to the given `--target` value. The "actual ops/s" value is how many requests the client threads retired/processed per second (again, on average). Each row is a particular point in the latency histogram for reads or writes; by default, the 50th (median), 95th, 99th, and 100th (max) percentiles are shown. You can get the full histogram with `--histogram`.

The time shown in the *remote* column is the time from when a request was issued by a client until the backend response came back. The time in the *sojourn* column is the time from when a request was enqueued by the load generator until the request was satisfied. If the backend is not keeping up, sojourn time will increase as the queue grows. See *Open Versus Closed: A Cautionary Tale* for further details about this approach.
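The distinction between the two columns matters because the load generator is open-loop: requests are scheduled at a fixed rate regardless of whether the backend keeps up. A toy single-server simulation (illustrative only, not vote's implementation) shows sojourn time diverging from service time the moment the backend is overloaded:

```python
def simulate(arrival_interval, service_time, n=1000):
    """Open-loop queue: request i is *scheduled* at i * arrival_interval,
    but is only *issued* once the single-server backend is free."""
    free_at = 0.0
    sojourns = []
    for i in range(n):
        scheduled = i * arrival_interval
        issued = max(scheduled, free_at)   # waits in queue if the backend is busy
        done = issued + service_time       # the 'remote' time is just service_time
        free_at = done
        sojourns.append(done - scheduled)  # sojourn = queueing delay + service
    return max(sojourns)

# Backend keeps up (service < interval): sojourn equals service time.
print(simulate(arrival_interval=1.0, service_time=0.5))  # 0.5
# Backend overloaded (service > interval): the queue grows without bound,
# so max sojourn explodes even though each request is still served in 1.5.
print(simulate(arrival_interval=1.0, service_time=1.5))  # 501.0
```

A closed-loop generator would hide this backlog, because it only issues a new request after the previous one completes; the open-loop sojourn numbers are what expose an overloaded backend.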

In general, to benchmark any backend, start with a low target throughput, and increase it until the sojourn time skyrockets. Just below that is the maximal supported throughput on your machine. You can find the throughput and latency we observed on an Amazon EC2 VM (c5.4xlarge in particular) in §8.2 of the OSDI'18 paper.
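That sweep can be mechanized: run vote at increasing `--target` values, record a high-percentile sojourn time for each, and stop at the first large jump. A sketch of the stopping rule, using invented measurements for illustration (the numbers below are not real results):

```python
def max_supported(targets, p95_sojourn_us, blowup=5.0):
    """Return the last target before p95 sojourn time jumps by more than `blowup`x."""
    for prev, cur, t_prev in zip(p95_sojourn_us, p95_sojourn_us[1:], targets):
        if cur > blowup * prev:
            return t_prev
    return targets[-1]

# Hypothetical sweep: sojourn time skyrockets between 400k and 500k ops/s,
# so the maximal supported throughput on this (made-up) machine is ~400k.
targets = [100_000, 200_000, 300_000, 400_000, 500_000]
p95_us  = [1_300,   1_400,   1_700,   2_600,   250_000]
print(max_supported(targets, p95_us))  # 400000
```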