
memtier_benchmark does not exit as expected #94

Open
ymjzxw opened this issue Oct 15, 2019 · 27 comments · May be fixed by #264

Comments


ymjzxw commented Oct 15, 2019

In my test environment, I got the following output:
$ ./memtier_benchmark -s 192.168.111.126 -p 2001 -n 1 -c 1 -t 2 -d 20000 --cluster-mode
[RUN #1] Preparing benchmark client...
[RUN #1] Launching threads now...
[RUN #1 100%, 0 secs] 2 threads: 2 ops, 6557 (avg: 6557) ops/sec, 125.38MB/sec (avg: 125.38MB/sec), 0.39 (avg: 0.39) msec latency

It hangs here seemingly forever. From reading the code, it seems it should exit with output like this:
[RUN #1] Preparing benchmark client...
[RUN #1] Launching threads now...
[RUN #1 100%, 0 secs] 0 threads: 2 ops, 0 (avg: 6756) ops/sec, 0.00KB/sec (avg: 129.20MB/sec), 0.00 (avg: 0.37) msec latency

2 Threads
1 Connections per thread
1 Requests per client

ALL STATS

Type Ops/sec Hits/sec Misses/sec MOVED/sec ASK/sec Latency KB/sec

Sets 6779.66 --- --- 0.00 0.00 0.37200 132746.29
Gets 0.00 0.00 0.00 0.00 0.00 0.00000 0.00
Waits 0.00 --- --- --- --- 0.00000 ---
Totals 6779.66 0.00 0.00 0.00 0.00 0.37200 132746.29

Request Latency Distribution
Type <= msec Percent

SET 0.350 50.00
SET 0.390 100.00

Why does this happen? My test environment is cluster mode.

@vasiloeth

Following up on this issue... any updates?
I'm facing the same problem

YaacovHazan added a commit to YaacovHazan/memtier_benchmark that referenced this issue Dec 11, 2019
add missing bufferevent_disable for connections that
complete to send new requests
yaacovhazan-Redislabs pushed a commit that referenced this issue Dec 11, 2019
add missing bufferevent_disable for connections that
complete to send new requests
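The commits referenced above add a missing bufferevent_disable call for connections that have finished sending. A toy sketch of why that matters, with hypothetical names (the real code uses libevent's bufferevent_enable/bufferevent_disable on the connection's bufferevent; this model only illustrates the idea that a finished connection must stop arming write events, or the event loop never goes idle):

```cpp
#include <cassert>
#include <vector>

// Toy model: a connection must disable its "write ready" event once it
// has no more requests to send; otherwise the loop keeps spinning on a
// permanently armed write event and the benchmark never exits.
struct ToyConnection {
    int requests_left;
    bool write_enabled = true;

    // Called by the loop whenever the socket is writable.
    void on_writable() {
        if (requests_left > 0) {
            --requests_left;           // "send" one request
        }
        if (requests_left == 0) {
            write_enabled = false;     // the missing "disable" step
        }
    }
};

// Runs until no connection still wants write events; the cap makes a
// hang visible instead of looping forever.
int run_loop(std::vector<ToyConnection>& conns, int cap = 1000) {
    int iters = 0;
    bool busy = true;
    while (busy && iters < cap) {
        busy = false;
        for (auto& c : conns) {
            if (c.write_enabled) {
                c.on_writable();
                busy = true;
            }
        }
        ++iters;
    }
    return iters;
}
```

Without the `write_enabled = false` line, `run_loop` hits the cap: the same symptom as the benchmark hanging at 100%.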

lc0305 commented Feb 13, 2021

Seems like the fix was removed from the mainline again. Any reason for this or was this by accident? #106

@YaacovHazan
Collaborator

Hi, @lc0305 the line was removed when we handled "fix issue #128 zeros in ouptut".
Do you still encounter the above "memtier_benchmark does not exit as expected" (#94)?


lc0305 commented Feb 14, 2021

Yes, it did not exit for me in cluster mode when benchmarking against a remote host, but it did exit against localhost.
I added the line from the previous fix, recompiled memtier, and it fixed the issue for me.

@YaacovHazan
Collaborator

With the same command-line arguments? If not, can you please share the command line you used?


lc0305 commented Feb 14, 2021

It did not exit with the default arguments, except for --cluster-mode and -s/-p set to the remote IP address and port.
The benchmark ran on macOS 11; the cluster ran on Debian 10 (kernel 4.19).

@YaacovHazan
Collaborator

@lc0305 can you please run it with --debug and share the output?

@Teddyhoang

Getting the same problem. Here is the result with --debug:
(screenshot: Capture)

@YaacovHazan
Collaborator

@Teddyhoang not sure I got it... it seems that in your example the test exits as expected, no?


molian6 commented Jun 29, 2023

Hi there, I'm running into the same problem: it doesn't exit when running in cluster mode. It gets stuck at 100%. Any idea what causes it, or potential fixes?

Thanks!

@YaacovHazan
Collaborator

@molian6, which version do you use? The new 2.0.0 version includes a fix around cluster mode: #205


molian6 commented Jun 29, 2023

I'm pulling the redislabs/memtier_benchmark:edge image, and it looks like it's 2.0.0:

docker run -it redislabs/memtier_benchmark:edge --version
memtier_benchmark 2.0.0
Copyright (C) 2011-2022 Redis Ltd.
This is free software.  You may redistribute copies of it under the terms of
the GNU General Public License <http://www.gnu.org/licenses/gpl.html>.
There is NO WARRANTY, to the extent permitted by law.

@YaacovHazan
Collaborator

@molian6 can you run it with --debug and paste the output?


molian6 commented Jul 5, 2023

It's not stuck every time, only in about 30% of the runs. I managed to repro once with --debug. The command I ran is below, against a 3-shard Redis cluster:

memtier_benchmark --server=<IP> --port=6379 --threads=10 --clients=100 --pipeline=1 --key-prefix=memtier- --key-minimum=1 --key-maximum=1000000 --cluster-mode --hide-histogram --debug --print-percentiles=50,90,95,99,99.9 --test-time=60 --ratio=1:3 --data-size=1024 --key-pattern=R:R

The behavior I observed is that it hung for 10+ minutes but finally returned with the correct stats report. I obfuscated the endpoints; they are just the three nodes' IP:port pairs. Below are the last 100 log lines I got with --debug:

docker logs -n 100 ac002ff3ce32
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): +OK, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint B>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:442: server <endpoint B>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint B> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint B>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint B> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint B>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint B> Done, no requests to send no response to wait for
shard_connection.cpp:508: <endpoint B> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint C>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint C> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:442: server <endpoint A>: handled response (first line): +OK, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint B>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:442: server <endpoint B>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint B> Done, no requests to send no response to wait for
shard_connection.cpp:508: <endpoint B> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint C>: handled response (first line): +OK, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint C> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint B>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint B> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint C>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint C> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): +OK, 0 hits, 1 misses
shard_connection.cpp:442: server <endpoint A>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): +OK, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint B>: handled response (first line): +OK, 0 hits, 1 misses
shard_connection.cpp:442: server <endpoint B>: handled response (first line): +OK, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint B> Done, no requests to send no response to wait for
shard_connection.cpp:508: <endpoint B> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint C>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint C> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:442: server <endpoint A>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint C>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint C> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): +OK, 0 hits, 1 misses
shard_connection.cpp:442: server <endpoint A>: handled response (first line): +OK, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): +OK, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): +OK, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): +OK, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint B>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:442: server <endpoint B>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint B> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint B>: handled response (first line): +OK, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint B> Done, no requests to send no response to wait for
shard_connection.cpp:508: <endpoint B> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): +OK, 0 hits, 1 misses
shard_connection.cpp:442: server <endpoint A>: handled response (first line): +OK, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): +OK, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint C>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint C> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint B>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint B> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint B>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint B> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint C>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint C> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): +OK, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint C>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint C> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint C>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint C> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint A>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint A> Done, no requests to send no response to wait for
shard_connection.cpp:442: server <endpoint B>: handled response (first line): $-1, 0 hits, 1 misses
shard_connection.cpp:508: <endpoint B> Done, no requests to send no response to wait for


molian6 commented Jul 5, 2023

Maybe it's also worth mentioning that this happened more frequently when the server dropped connections. It looks like there may be some error-handling cases that are missed.

connection dropped.
connection dropped.
connection dropped.
connection dropped.
connection dropped.
connection dropped.
connection dropped.
connection dropped.
connection dropped.
connection dropped.
connection dropped.
connection dropped.
connection dropped.
connection dropped.
connection dropped.
connection dropped.
connection dropped.
connection dropped.
connection dropped.
connection dropped.
connection dropped.

@YaacovHazan
Collaborator

Thanks, @molian6. I tried to reproduce it with no success. I'll try to see if playing with the server connection helps.

YaacovHazan added a commit to YaacovHazan/memtier_benchmark that referenced this issue Jul 9, 2023
When a connection is dropped, it needs to empty its pipeline of pending
responses, so in case of reconnection, it won't hang waiting for
old requests.
YaacovHazan added a commit that referenced this issue Jul 9, 2023
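The commit message above describes the fix that became PR #230: a dropped connection must discard the responses it was still waiting for. A minimal sketch of the idea, with hypothetical names (memtier's real pipeline lives in shard_connection.cpp and is more involved; this only models the drain-on-drop behavior):

```cpp
#include <cassert>
#include <queue>
#include <string>

// Each connection keeps a pipeline of requests whose responses are still
// outstanding. If the connection drops, those responses will never
// arrive, so the pipeline must be emptied before reconnecting; otherwise
// the run hangs waiting for responses to old requests.
class ToyShardConnection {
public:
    void send(const std::string& req) { pipeline_.push(req); }

    // A response arrived: retire the oldest outstanding request.
    void handle_response() {
        if (!pipeline_.empty()) pipeline_.pop();
    }

    // On a dropped connection, discard every pending response.
    void disconnect() {
        std::queue<std::string> empty;
        pipeline_.swap(empty);
    }

    bool done() const { return pipeline_.empty(); }

private:
    std::queue<std::string> pipeline_;
};
```

Without the drain in `disconnect()`, a reconnected client would still count the lost responses as outstanding and never report itself as done.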
@YaacovHazan
Collaborator

Hi @molian6, see PR #230. As you suggested, I did find an issue when a connection drops that can cause memtier_benchmark to hang in cluster mode.


molian6 commented Jul 10, 2023

Thanks for the quick fix! I'll patch the PR and verify whether it resolves the issue.

@shailymittal

I patched your changes but am still getting the same issue:

Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection reset by peer
Connection error: Connection reset by peer
Connection error: Connection reset by peer
Connection error: Connection reset by peer
Connection error: Connection reset by peer
Connection error: Connection reset by peer
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out
Connection error: Connection timed out

@YaacovHazan
Collaborator

Hi @shailymittal, can you provide more details? Does it fail consistently? The command line and output at debug level would be useful.

Contributor

DrEsteban commented Oct 5, 2024

I hit this today as well with a --cluster-mode test run. It runs for the configured 600 seconds, but then the progress counter keeps climbing past 100%... but with 0 ops/sec 🤔


[RUN #1 98%, 589 secs] 32 threads:     3129265 ops,    4243 (avg:    5305) ops/sec, 4.32MB/sec (avg: 5.40MB/sec), 36083.80 (avg: 28052.76) msec lat
[RUN #1 98%, 590 secs] 32 threads:     3133541 ops,    4269 (avg:    5303) ops/sec, 4.35MB/sec (avg: 5.40MB/sec), 36128.91 (avg: 28063.78) msec lat
[RUN #1 99%, 591 secs] 32 threads:     3137789 ops,    4241 (avg:    5301) ops/sec, 4.32MB/sec (avg: 5.40MB/sec), 36080.57 (avg: 28074.64) msec lat
[RUN #1 99%, 592 secs] 32 threads:     3141958 ops,    4162 (avg:    5299) ops/sec, 4.24MB/sec (avg: 5.40MB/sec), 35982.31 (avg: 28085.13) msec lat
[RUN #1 99%, 593 secs] 32 threads:     3146309 ops,    4345 (avg:    5298) ops/sec, 4.43MB/sec (avg: 5.40MB/sec), 35908.45 (avg: 28095.95) msec lat
[RUN #1 99%, 594 secs] 32 threads:     3150566 ops,    4251 (avg:    5296) ops/sec, 4.33MB/sec (avg: 5.39MB/sec), 35868.74 (avg: 28106.45) msec lat
[RUN #1 99%, 595 secs] 32 threads:     3154675 ops,    4102 (avg:    5294) ops/sec, 4.18MB/sec (avg: 5.39MB/sec), 35880.93 (avg: 28116.58) msec lat
[RUN #1 99%, 596 secs] 32 threads:     3158981 ops,    4299 (avg:    5292) ops/sec, 4.38MB/sec (avg: 5.39MB/sec), 35941.38 (avg: 28127.24) msec lat
[RUN #1 100%, 597 secs] 32 threads:     3163353 ops,    4386 (avg:    5291) ops/sec, 4.47MB/sec (avg: 5.39MB/sec), 35933.73 (avg: 28138.03) msec la
[RUN #1 100%, 598 secs] 32 threads:     3167684 ops,    4443 (avg:    5289) ops/sec, 4.52MB/sec (avg: 5.39MB/sec), 35965.75 (avg: 28148.73) msec la
[RUN #1 100%, 599 secs] 32 threads:     3172102 ops,    4705 (avg:    5289) ops/sec, 4.79MB/sec (avg: 5.39MB/sec), 35853.18 (avg: 28159.46) msec la
[RUN #1 100%, 600 secs] 32 threads:     3176764 ops,    5177 (avg:    5288) ops/sec, 5.27MB/sec (avg: 5.39MB/sec), 35820.45 (avg: 28170.71) msec la
[RUN #1 100%, 601 secs] 32 threads:     3181659 ops,    5744 (avg:    5289) ops/sec, 5.85MB/sec (avg: 5.39MB/sec), 35728.27 (avg: 28182.33) msec la
[RUN #1 100%, 602 secs] 32 threads:     3186575 ops,    6142 (avg:    5290) ops/sec, 6.26MB/sec (avg: 5.39MB/sec), 35565.19 (avg: 28193.72) msec la
[RUN #1 101%, 603 secs] 32 threads:     3192127 ops,    7473 (avg:    5293) ops/sec, 7.61MB/sec (avg: 5.39MB/sec), 35537.61 (avg: 28206.50) msec la
[RUN #1 101%, 603 secs] 32 threads:     3198108 ops,    8743 (avg:    5297) ops/sec, 8.90MB/sec (avg: 5.39MB/sec), 35175.75 (avg: 28219.53) msec la
[RUN #1 101%, 604 secs] 32 threads:     3204242 ops,    9745 (avg:    5301) ops/sec, 9.92MB/sec (avg: 5.40MB/sec), 34670.27 (avg: 28231.88) msec la
[RUN #1 101%, 604 secs] 32 threads:     3211657 ops,   13026 (avg:    5309) ops/sec, 13.27MB/sec (avg: 5.41MB/sec), 33697.20 (avg: 28244.50) msec l
[RUN #1 101%, 605 secs] 32 threads:     3220451 ops,   17586 (avg:    5319) ops/sec, 17.91MB/sec (avg: 5.42MB/sec), 32579.34 (avg: 28256.33) msec l
[RUN #1 101%, 605 secs] 32 threads:     3232107 ops,   27590 (avg:    5334) ops/sec, 28.11MB/sec (avg: 5.43MB/sec), 30713.82 (avg: 28265.20) msec l
[RUN #1 101%, 606 secs] 32 threads:     3360321 ops,  504366 (avg:    5544) ops/sec, 513.61MB/sec (avg: 5.65MB/sec), 5200.80 (avg: 27385.17) msec l
[RUN #1 101%, 606 secs] 32 threads:     3498114 ops,  797708 (avg:    5769) ops/sec, 812.35MB/sec (avg: 5.88MB/sec), 2321.96 (avg: 26397.91) msec l
[RUN #1 101%, 606 secs] 32 threads:     3548337 ops, 1855164 (avg:    5852) ops/sec, 1.84GB/sec (avg: 5.96MB/sec), 21243.88 (avg: 26324.96) msec la
[RUN #1 101%, 606 secs] 32 threads:     3548337 ops,       0 (avg:    5852) ops/sec, 0.00KB/sec (avg: 5.96MB/sec),  -nan (avg: 26324.96) msec laten
[RUN #1 101%, 606 secs] 32 threads:     3548337 ops,       0 (avg:    5852) ops/sec, 0.00KB/sec (avg: 5.96MB/sec),  -nan (avg: 26324.96) msec laten
[RUN #1 101%, 606 secs] 32 threads:     3548337 ops,       0 (avg:    5852) ops/sec, 0.00KB/sec (avg: 5.96MB/sec),  -nan (avg: 26324.96) msec laten
[RUN #1 101%, 606 secs] 32 threads:     3548337 ops,       0 (avg:    5852) ops/sec, 0.00KB/sec (avg: 5.96MB/sec),  -nan (avg: 26324.96) msec laten
[RUN #1 101%, 606 secs] 32 threads:     3548337 ops,       0 (avg:    5852) ops/sec, 0.00KB/sec (avg: 5.96MB/sec),  -nan (avg: 26324.96) msec laten[

(...continues higher than 101%)

I'll see if I can repro with --debug

Contributor

DrEsteban commented Oct 5, 2024

What's also annoying is that there's no way to get memtier to exit without losing all my results when it gets into this state... (I'm outputting to a JSON file.) Even if I try to let it close gracefully with SIGTERM, it won't write the results it gathered to the file, so the whole run is lost 😢
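One common pattern that would address this (a sketch of a possible approach, not memtier's current behavior; all names here are hypothetical) is to have the SIGTERM handler only set a flag, and let the main loop flush the partial results before exiting:

```cpp
#include <atomic>
#include <cassert>
#include <csignal>
#include <cstdio>

// The handler does nothing but set an atomic flag; the work loop checks
// it and writes whatever results were gathered instead of dropping them.
std::atomic<bool> g_stop{false};

void handle_term(int) { g_stop.store(true); }

int run_until_stopped() {
    std::signal(SIGTERM, handle_term);
    int ops = 0;
    while (!g_stop.load() && ops < 100000) {
        ++ops;                               // stand-in for benchmark work
        if (ops == 10) std::raise(SIGTERM);  // simulate an external kill
    }
    // Flush partial results (here, a tiny JSON line) instead of losing the run.
    std::printf("{\"ops\": %d, \"partial\": true}\n", ops);
    return ops;
}
```

Because the handler only touches a lock-free atomic, this stays async-signal-safe, and the JSON write happens in normal context on the way out.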

@YaacovHazan
Collaborator

@DrEsteban Thanks for raising it. On which version did you run the test? Can you provide the other arguments in the command line?
And, as you said, a repro with --debug would be great.

Regarding your second comment: we have previously seen a request to dump the results even when the test fails; I think we can combine it with yours.

@DrEsteban
Contributor

# memtier_benchmark --version
memtier_benchmark 2.1.0

I'm running the test against an Azure Enterprise Redis cluster with:

memtier_benchmark --server=<server> --port=10000 --authenticate=<key> --hide-histogram --pipeline=10 --clients=256 --threads=32 --data-size=1024 --test-time=600 --key-maximum=3041187 --key-pattern=R:R --ratio=0:1 --distinct-client-seed --run-count=1 --json-out-file=<outfile> --cluster-mode

Haven't had a chance to get the --debug output yet :( But I'll post as soon as I do!

@DrEsteban
Contributor

@YaacovHazan I have a debug log for you! But it's 77MB ☹️ What's the best way to get it to you?

@YaacovHazan
Collaborator

@DrEsteban thanks, do you want to try email or a shared Google Drive?

Contributor

DrEsteban commented Oct 29, 2024

@YaacovHazan I shared these with you via Slack about a week ago, just FYI!
