Missing keyspace notifications after master failover #3543

Open
ManelCoutinhoSensei opened this issue Mar 5, 2025 · 3 comments

@ManelCoutinhoSensei

Expected behavior

When a Redis master fails over in a Sentinel-managed cluster, the client should continue receiving keyspace notifications from the new master without requiring any additional configuration.

Actual behavior

After the master fails over, the client no longer receives keyspace notifications from the new master.

Reproduce behavior

  1. Create a Redis Sentinel cluster using something like the following docker-compose.yml:
services:

  # Redis Node #1 (initial master) + Sentinel
  redis-node-1:
    container_name: redis_node_1
    image: bitnami/redis:7.2.4
    environment:
      - ALLOW_EMPTY_PASSWORD=yes
      - REDIS_AOF_ENABLED=no
    ports:
      - 6380:6379
    networks:
      - nw

  redis-node-1-sentinel:
    container_name: redis-node-1-sentinel
    image: bitnami/redis-sentinel:7.2.4
    depends_on:
      - redis-node-1
    environment:
      - REDIS_MASTER_HOST=redis-node-1
      - REDIS_SENTINEL_MASTER_NAME=mymaster
      - REDIS_SENTINEL_DOWN_AFTER_MILLISECONDS=5000
      - REDIS_SENTINEL_FAILOVER_TIMEOUT=10000
      - REDIS_SENTINEL_QUORUM=2
    ports:
      - 36380:26379
    networks:
      - nw

  # Redis Node #2  + Sentinel  
  redis-node-2:
    container_name: redis_node_2
    image: bitnami/redis:7.2.4
    environment:
      - ALLOW_EMPTY_PASSWORD=yes
      - REDIS_REPLICATION_MODE=slave
      - REDIS_MASTER_HOST=redis-node-1
      - REDIS_AOF_ENABLED=no
    ports:
      - 6381:6379
    networks:
      - nw

  redis-node-2-sentinel:
    container_name: redis_node_2_sentinel
    image: bitnami/redis-sentinel:7.2.4
    depends_on:
      - redis-node-2
    environment:
      - REDIS_MASTER_HOST=redis-node-1
      - REDIS_SENTINEL_MASTER_NAME=mymaster
      - REDIS_SENTINEL_DOWN_AFTER_MILLISECONDS=5000
      - REDIS_SENTINEL_FAILOVER_TIMEOUT=10000
      - REDIS_SENTINEL_QUORUM=2
    ports:
      - 36381:26379
    networks:
      - nw

  # Redis Node #3 + Sentinel 
  redis-node-3:
    container_name: redis_node_3
    image: bitnami/redis:7.2.4
    environment:
      - ALLOW_EMPTY_PASSWORD=yes
      - REDIS_REPLICATION_MODE=slave
      - REDIS_MASTER_HOST=redis-node-1
      - REDIS_AOF_ENABLED=no
    ports:
      - 6382:6379
    networks:
      - nw

  redis-node-3-sentinel:
    container_name: redis_node_3_sentinel
    image: bitnami/redis-sentinel:7.2.4
    depends_on:
      - redis-node-3
    environment:
      - REDIS_MASTER_HOST=redis-node-1
      - REDIS_SENTINEL_MASTER_NAME=mymaster
      - REDIS_SENTINEL_DOWN_AFTER_MILLISECONDS=5000
      - REDIS_SENTINEL_FAILOVER_TIMEOUT=10000
      - REDIS_SENTINEL_QUORUM=2
    ports:
      - 36382:26379
    networks:
      - nw

networks:
  nw:
    driver: bridge
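  2. Run this Python script:
import time

import docker
from redis.backoff import ExponentialBackoff
from redis.retry import Retry
from redis.sentinel import Sentinel

sentinel = Sentinel(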
    [("localhost", 36380), ("localhost", 36381), ("localhost", 36382)],
    min_other_sentinels=2,
    encoding="utf-8",
    decode_responses=True,
    socket_keepalive=True,
    socket_timeout=1,
    socket_connect_timeout=1,
    health_check_interval=5,
    retry=Retry(ExponentialBackoff(10, 0.5), 5),
)

redis_conn = sentinel.master_for("mymaster")
redis_conn.config_set("notify-keyspace-events", "KEA")

redis_pub_sub = redis_conn.pubsub()

print("Shut down the master in your preferred way")
# Here the initial master container (redis_node_1) is stopped through the Docker SDK.
client = docker.from_env()
for el in client.containers.list(all=True):
    if "node_1" in el.name:
        el.stop()
        el.wait()

print("Psubscribing")
def my_handler(message):
    print(message)
redis_pub_sub.psubscribe(**{"__keyspace@0__:topic:*": my_handler})
pubsub_thread = redis_pub_sub.run_in_thread(sleep_time=0.01)

# If this second config_set is commented out, notifications will not appear.
#redis_conn.config_set("notify-keyspace-events", "KEA")

pipe = redis_conn.pipeline(transaction=True)
for i in range(0, 100):
    pipe.hset('topic:id', mapping={'age': str(i)})
    pipe.expire('topic:id', 5)
    pipe.execute()
    time.sleep(0.1)

pubsub_thread.stop()

Note that no notifications are printed after the failover unless the second config_set("notify-keyspace-events", "KEA") call (currently commented out) is uncommented so that it runs against the new master.

Additional Comments

A similar issue was reported and fixed in other libraries, such as Redisson (1, 2).

@vladvildanov
Collaborator

@ManelCoutinhoSensei Hi! We will take a look at it soon.

@petyaslavova
Collaborator

Hi @ManelCoutinhoSensei,

The issue is that when you set the configuration, it is applied only to the current master, and Redis does not replicate configuration settings to replicas.

So, when the master fails and a new master is promoted, the new master does not inherit the previous configuration, which means no messages are sent to the Pub/Sub channel.
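One way to confirm this diagnosis is to read `notify-keyspace-events` back from every node. The following is only a minimal sketch: `nodes_without_notifications` and `client_factory` are hypothetical names introduced here, and it assumes each client exposes a `config_get` method that returns a mapping with string keys, as redis-py does with `decode_responses=True`.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Address = Tuple[str, int]


def nodes_without_notifications(
    nodes: Iterable[Address],
    client_factory: Callable[[str, int], object],
) -> List[Address]:
    """Return the nodes whose notify-keyspace-events value is empty."""
    missing = []
    for host, port in nodes:
        client = client_factory(host, port)
        # CONFIG GET returns a mapping such as {"notify-keyspace-events": "AKE"};
        # an empty string means keyspace notifications are disabled on that node.
        settings: Dict[str, str] = client.config_get("notify-keyspace-events")
        if not settings.get("notify-keyspace-events"):
            missing.append((host, port))
    return missing


# Possible usage against the compose file above (host ports are from that file):
#   import redis
#   nodes = [("localhost", 6380), ("localhost", 6381), ("localhost", 6382)]
#   factory = lambda h, p: redis.Redis(host=h, port=p, decode_responses=True)
#   print(nodes_without_notifications(nodes, factory))
```

If the diagnosis holds, running this right after the first `config_set` should list the two replicas and leave out the current master.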

One possible solution in your code is to retrieve all replicas and apply the same configuration before failover to ensure consistency.
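As a rough sketch of that suggestion, the helper below asks Sentinel for the current master and replicas and applies the same setting to each of them. `apply_keyspace_config` and `client_factory` are hypothetical names, not part of redis-py. The `sentinel` argument is assumed to provide `discover_master` and `discover_slaves`, as `redis.sentinel.Sentinel` does. In a Docker setup like the one above, the addresses Sentinel reports may be container-internal and not reachable from the host.

```python
from typing import Callable, List, Tuple

Address = Tuple[str, int]


def apply_keyspace_config(
    sentinel,
    service_name: str,
    client_factory: Callable[[str, int], object],
    flags: str = "KEA",
) -> List[Address]:
    """Set notify-keyspace-events on the master and on every replica Sentinel reports."""
    # discover_master returns one (host, port) pair; discover_slaves returns a list of them.
    nodes = [tuple(sentinel.discover_master(service_name))]
    nodes.extend(tuple(addr) for addr in sentinel.discover_slaves(service_name))

    configured = []
    for host, port in nodes:
        client = client_factory(host, port)
        client.config_set("notify-keyspace-events", flags)
        configured.append((host, port))
    return configured


# Possible usage with the Sentinel object from the script above:
#   import redis
#   apply_keyspace_config(
#       sentinel, "mymaster",
#       lambda h, p: redis.Redis(host=h, port=p, decode_responses=True),
#   )
```

A value set with CONFIG SET lives only in the running server, so a node that restarts starts again from its configuration file and would need the setting re-applied.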

I'm not sure whether the client should handle this differently from the server.
@vladvildanov, what would be the correct client behavior in this scenario?

@ManelCoutinhoSensei
Author

Hi @petyaslavova,

I agree that this is the reason for the issue, but I'm not entirely sure whether this is what should happen.

As I mentioned, other libraries have encountered the same problem and implemented fixes, but let's consider a simple example with 2 replicas where the user implements your solution and replicates the configuration.

What happens when Redis A fails and Redis B takes over? We are faced with one of the following situations:
a) Hoping that if another failure occurs, it affects A again—because if B fails instead, there will be no properly configured Redis instance.
b) Needing to detect the failover and manually refresh the configuration, which undermines the goal of having a seamless, transparent failover process.
c) Continuously refreshing the configuration on all instances to keep them up to date—an approach that is inefficient and undesirable.

For these reasons, unless I'm missing something, I believe a Redisson-like approach would be a more suitable solution.
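To make option (b) concrete, here is a minimal sketch of what automated handling could look like. It is not Redisson's implementation and not an existing redis-py feature. It relies on Sentinel publishing a `+switch-master` message with the payload `<master name> <old ip> <old port> <new ip> <new port>`. `NotificationConfigGuard`, `parse_switch_master`, and the `apply_config` callback are hypothetical names introduced for illustration.

```python
from typing import Callable, Optional, Tuple, Union

Address = Tuple[str, int]


def parse_switch_master(payload: Union[str, bytes]) -> Tuple[str, Address, Address]:
    """Split '<name> <old-ip> <old-port> <new-ip> <new-port>' into its parts."""
    if isinstance(payload, bytes):
        payload = payload.decode("utf-8")
    name, old_ip, old_port, new_ip, new_port = payload.split()
    return name, (old_ip, int(old_port)), (new_ip, int(new_port))


class NotificationConfigGuard:
    """Hypothetical helper: re-applies a setting whenever the master changes."""

    def __init__(self, service_name: str, apply_config: Callable[[Address], None]):
        self.service_name = service_name
        self.apply_config = apply_config

    def on_switch_master(self, message: dict) -> Optional[Address]:
        """Pub/Sub handler; returns the new master address if it was handled."""
        name, _old, new = parse_switch_master(message["data"])
        if name != self.service_name:
            return None  # failover of a different monitored master
        self.apply_config(new)
        return new


# Possible wiring, assuming the `sentinel` and `redis_conn` objects from the script above:
#   guard = NotificationConfigGuard(
#       "mymaster",
#       lambda addr: redis_conn.config_set("notify-keyspace-events", "KEA"),
#   )
#   sentinel_pubsub = sentinel.sentinels[0].pubsub()
#   sentinel_pubsub.subscribe(**{"+switch-master": guard.on_switch_master})
#   sentinel_pubsub.run_in_thread(sleep_time=0.1)
```

This keeps the refresh event-driven rather than periodic, which avoids the cost described in option (c). It still depends on at least one reachable Sentinel connection to deliver the event.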

But let's see what @vladvildanov says.
