[BUG] socket_timeout parameter has no effect if the link is broken between redis instance and the client. Outage towards a slot range for several minutes.
#579
Describe the bug
The socket_timeout parameter has no effect when the link between a Redis instance and the client is broken. The result is an outage for the affected slot range that lasts several minutes.
To Reproduce
The async client is used in cluster mode. The Redis cluster has 3 masters and 3 slaves.
Generate continuous traffic.
During the traffic, select a master Redis instance and apply the following iptables rule in the container/VM of that server:
iptables -A OUTPUT -p tcp --sport redis_port -s redis_ip -j DROP
Kill that Redis server (kill -9 redis_server_pid). A new master is then elected for that slot range.
Expected behavior
After socket_timeout elapses, redis-plus-plus discovers the broken connection and the newly elected master, and traffic is redirected to that master.
Actual result: the old connection is still used, TCP packets are retransmitted continuously, and users get no response for the given slot range for several minutes.
Environment:
OS: Rocky Linux 8.2-20.el8.0.1
Compiler: gcc version 8.5.0
hiredis version: hiredis 1.2.0
redis-plus-plus version: 1.3.12
Additional context
Correction proposal:
Introduce a new per-connection property: last_response_received.
Failure detection: whenever a request is sent on a connection, check whether (t_now - last_response_received) is under socket_timeout + TOLERATION (some milliseconds).
When a failure is detected, be careful with CLUSTER SLOTS: do not send it to the problematic master instance, select another node (I have seen issues like that).
Check whether mastership has changed, and reconnect to the new master if needed. Abort the ongoing requests and report the failure to the redis-plus-plus user, who should retransmit the request, possibly after some delay.
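The failure-detection step above could be sketched roughly as follows. This is only an illustration of the proposal, not redis-plus-plus code: the class ResponseWatchdog and its methods are hypothetical names. It also makes one assumption beyond the proposal: the elapsed time is only counted while at least one request is unanswered, so that a connection that was simply idle for a long time is not reported as broken on its next request.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical per-connection bookkeeping; not part of redis-plus-plus.
class ResponseWatchdog {
public:
    using Clock = std::chrono::steady_clock;
    using Millis = std::chrono::milliseconds;

    ResponseWatchdog(Millis socket_timeout, Millis toleration, Clock::time_point now)
        : limit_(socket_timeout + toleration),
          last_response_received_(now),
          first_pending_sent_(now),
          pending_(0) {
        assert(socket_timeout.count() > 0 && "a zero timeout would flag every request");
    }

    // Call when a request is written to the connection.
    void on_request_sent(Clock::time_point now) {
        if (pending_ == 0) {
            // Remember when the connection stopped being idle.
            first_pending_sent_ = now;
        }
        ++pending_;
    }

    // Call when any reply arrives on the connection.
    void on_response_received(Clock::time_point now) {
        if (pending_ > 0) {
            --pending_;
        }
        last_response_received_ = now;
    }

    // True if replies are outstanding and nothing has arrived for longer
    // than socket_timeout + TOLERATION.
    bool is_broken(Clock::time_point now) const {
        if (pending_ == 0) {
            return false;  // idle connections are never flagged (assumption)
        }
        const Clock::time_point reference =
            std::max(last_response_received_, first_pending_sent_);
        return (now - reference) > limit_;
    }

private:
    Clock::duration limit_;
    Clock::time_point last_response_received_;
    Clock::time_point first_pending_sent_;
    std::size_t pending_;
};
```

In such a design the client would call is_broken() before sending each request (or from a periodic timer) and, when it returns true, close the connection, refresh the slot map from a different node, and fail the pending requests back to the caller.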
As a workaround, we started a guard timer and reset the AsyncRedisClient on timeout to force redis-plus-plus to discover mastership changes and broken links. But this solution also caused some issues, see: #577 #578
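For readers unfamiliar with the guard-timer idea, a minimal sketch using only the C++ standard library might look like this. The helper wait_with_guard and its on_expiry callback are hypothetical names, not redis-plus-plus API. The sketch assumes the reply is delivered through a std::future that is fulfilled by another component (a promise or an event loop), and that on_expiry is where the application would recreate its client.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <utility>

// Hypothetical helper: wait for a reply for at most `guard`.
// Returns true and stores the reply in `out` if it arrived in time;
// otherwise calls `on_expiry` (e.g. to reset the client) and returns false.
template <typename T>
bool wait_with_guard(std::future<T>& reply,
                     std::chrono::milliseconds guard,
                     T& out,
                     const std::function<void()>& on_expiry) {
    assert(reply.valid() && "the future must refer to a pending reply");
    if (reply.wait_for(guard) == std::future_status::ready) {
        out = reply.get();  // rethrows if the reply carries an exception
        return true;
    }
    on_expiry();  // guard timer expired: let the caller reset its client
    return false;
}
```

Resetting the client while requests are still in flight is the part that needs care in such a scheme; the sketch leaves that entirely to on_expiry.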