[API Proposal]: Cache Synchronization for Hybrid Caching in Multi-Node Environments #5517

Open
IbrahimMNada opened this issue Oct 14, 2024 · 8 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-caching-hybrid untriaged

Comments

@IbrahimMNada

IbrahimMNada commented Oct 14, 2024

Background and motivation

In a multi-replica environment utilizing hybrid caching (in-memory and out-of-process), cache desynchronization between nodes can occur because there is no built-in mechanism to synchronize in-memory caches across nodes behind a load balancer. This results in inconsistent cache states, reducing the reliability of the system.

This proposal addresses the problem by introducing an event-driven mechanism to ensure cache synchronization across nodes.

Problem Context

Hybrid caching involves two main components:

  1. Out-of-process cache: This ensures a single source of truth, making cache invalidation simple and effective across nodes.
  2. In-memory cache: While useful for quick access, it poses challenges in multi-node environments due to the lack of cross-node communication when caches are reset.

When a cache is reset in one node, other nodes do not get notified, leading to cache desynchronization across the system.

Problem Statement

The current hybrid caching model does not offer a built-in mechanism to notify all nodes about an in-memory cache reset, resulting in inconsistent cache states between nodes in a multi-node environment.

API Proposal

Proposed Solution

Overview

Introduce a Publisher-Subscriber model using webhooks, event queues, or other notification mechanisms to propagate cache reset events to all nodes using the hybrid cache. This model will allow one node (the Publisher) to notify other nodes (Subscribers) when a cache reset happens, ensuring synchronization of the in-memory cache across all nodes.

Key Features

  1. Webhook-based/Callback mechanism: Each node registers as a Subscriber to receive notifications when cache resets happen. The node initiating the reset acts as the Publisher.

  2. Retry strategy: In case of a failure in notifying a node, the system retries the notification, ensuring robustness in cache synchronization.

  3. Multi-provider support: While webhooks are the default, the design allows support for other messaging systems like event queues, SignalR, etc.
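The three features above could be sketched roughly as follows. This is only an illustration under stated assumptions: `CacheResetSender` and `CacheResetPublisher` are hypothetical names that do not exist in Microsoft.Extensions.Caching.Hybrid, the transport is reduced to a delegate so that a webhook, queue, or SignalR sender could be plugged in, and the exponential backoff is one possible retry strategy rather than a decided one.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical transport hook: a webhook POST, a queue publish, a SignalR call, etc.
// It is expected to throw when delivery to the subscriber fails.
public delegate Task CacheResetSender(Uri subscriber, string cacheKey, CancellationToken ct);

// Hypothetical publisher; not part of Microsoft.Extensions.Caching.Hybrid.
public sealed class CacheResetPublisher
{
    private readonly CacheResetSender _sender;
    private readonly int _maxAttempts;
    private readonly TimeSpan _baseDelay;
    private readonly List<Uri> _subscribers = new List<Uri>();
    private readonly object _gate = new object();

    public CacheResetPublisher(CacheResetSender sender, int maxAttempts = 3, TimeSpan? baseDelay = null)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        _sender = sender ?? throw new ArgumentNullException(nameof(sender));
        _maxAttempts = maxAttempts;
        _baseDelay = baseDelay ?? TimeSpan.FromMilliseconds(100);
    }

    public void Register(Uri subscriber)
    {
        lock (_gate)
        {
            if (!_subscribers.Contains(subscriber)) _subscribers.Add(subscriber);
        }
    }

    public void Unregister(Uri subscriber)
    {
        lock (_gate) { _subscribers.Remove(subscriber); }
    }

    // Notifies every registered subscriber and returns those that could not
    // be reached after all attempts.
    public async Task<IReadOnlyList<Uri>> NotifyAllAsync(string cacheKey, CancellationToken ct = default)
    {
        Uri[] snapshot;
        lock (_gate) { snapshot = _subscribers.ToArray(); }

        var failed = new List<Uri>();
        foreach (var subscriber in snapshot)
        {
            if (!await TrySendWithRetryAsync(subscriber, cacheKey, ct)) failed.Add(subscriber);
        }
        return failed;
    }

    private async Task<bool> TrySendWithRetryAsync(Uri subscriber, string cacheKey, CancellationToken ct)
    {
        for (var attempt = 1; attempt <= _maxAttempts; attempt++)
        {
            try
            {
                await _sender(subscriber, cacheKey, ct);
                return true;
            }
            catch (Exception) when (!ct.IsCancellationRequested)
            {
                if (attempt == _maxAttempts) return false; // attempts exhausted
                // Exponential backoff: baseDelay, then 2x, 4x, ...
                await Task.Delay(_baseDelay * Math.Pow(2, attempt - 1), ct);
            }
        }
        return false;
    }
}
```

Returning the unreachable subscribers instead of throwing leaves it to the caller to decide whether a node that missed a reset should be retried later, logged, or unregistered.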

API Changes

  1. Add a CacheResetNotification class:

    • Encapsulates the logic for broadcasting cache reset events to other nodes.
    public class CacheResetNotification
    {
        public void NotifyAllSubscribers(string cacheKey);
        public void RegisterSubscriber(IList<Uri> subscriberUris);
        public void UnregisterSubscriber(IList<Uri> subscriberUris);
    }

API Usage

services.AddHybridCache(options =>
{
    // UsePublisherSubscriberModel and AddWebhookSubscriber are the proposed APIs; they do not exist yet.
    options.UsePublisherSubscriberModel()
           .AddWebhookSubscriber(new List<Uri> { new Uri("https://node1/reset") });
});

Alternative Designs

Polling Mechanism: This was dismissed due to inefficiency and increased load on the nodes.

Risks

We need to consider the risk of network flooding or excessive requests between nodes. To mitigate it, we could define a sync period between nodes, a key-count threshold, or something similar.
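One way to picture the sync period and key-count threshold is a small batcher that collects reset keys and only releases them for broadcast once either limit is crossed. This is a minimal sketch, not a proposed API: `ResetBatcher`, its parameters, and the injected clock are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: coalesces reset keys so that nodes exchange
// one batched notification instead of one message per key.
public sealed class ResetBatcher
{
    private readonly int _maxKeys;
    private readonly TimeSpan _maxAge;
    private readonly Func<DateTimeOffset> _clock;
    private readonly HashSet<string> _pending = new HashSet<string>();
    private readonly object _gate = new object();
    private DateTimeOffset _oldest;

    public ResetBatcher(int maxKeys, TimeSpan maxAge, Func<DateTimeOffset> clock)
    {
        if (maxKeys < 1) throw new ArgumentOutOfRangeException(nameof(maxKeys));
        _maxKeys = maxKeys;
        _maxAge = maxAge;
        _clock = clock ?? throw new ArgumentNullException(nameof(clock));
    }

    // Records a key. Returns the batch to broadcast when the key-count
    // threshold or the sync period is reached, otherwise an empty collection.
    public IReadOnlyCollection<string> Add(string key)
    {
        lock (_gate)
        {
            var now = _clock();
            if (_pending.Count == 0) _oldest = now;
            _pending.Add(key); // repeated keys collapse into one entry
            if (_pending.Count >= _maxKeys || now - _oldest >= _maxAge) return Drain();
            return Array.Empty<string>();
        }
    }

    // Releases whatever is pending, e.g. from a timer or at shutdown.
    public IReadOnlyCollection<string> Flush()
    {
        lock (_gate) { return Drain(); }
    }

    private string[] Drain()
    {
        var batch = new string[_pending.Count];
        _pending.CopyTo(batch);
        _pending.Clear();
        return batch;
    }
}
```

In production the clock would typically be `() => DateTimeOffset.UtcNow`; injecting it keeps the thresholds testable. Note that the age check only runs when a key is added, so a periodic call to `Flush` would still be needed to bound the delay during quiet periods.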

@rsalus

rsalus commented Oct 15, 2024

This would be a nice feature, but I'd prefer to use something like Redis pub/sub directly assuming that's what we are using for the L2 cache.

@IbrahimMNada
Author

IbrahimMNada commented Oct 16, 2024

But that would couple everyone to Redis.

I mean, we could offer it as an optional provider, but we need a default that requires no extra setup; that's why I suggested webhooks.

We could have multiple providers, including but not limited to:
1. Redis pub/sub
2. Event buses
3. A SignalR hub

@IbrahimMNada
Author

Hello, @mgravell

Could you please give us your invaluable input on this? To be honest, I'm eager to help.

@bielu

bielu commented Oct 24, 2024

@IbrahimMNada I think @rsalus meant the messaging system here, which could be done in multiple ways: as an additional cache record type in permanent storage that is read as messages, as a real messaging pattern, or as a message queue pattern. There are a lot of ways it can be approached.

Personally, I don't think using webhooks is a good approach here.

@IbrahimMNada
Author

@bielu I agree with you, webhooks might not be optimal here, but I think we need some royalty-free approach that is independent of any third-party libraries.

Of course we will have add-ons, for example Redis pub/sub, but we need a default for those who are not using any Redis/message queue in their applications.

And going to shared storage every time we want to read from the in-memory cache negates the value of having an in-memory cache.

So we might build a simple file-system-based message broker that notifies other nodes of a cache entry that needs to be evicted.

@bielu

bielu commented Oct 25, 2024

"and by going to a shared storage every time we wanted to read from the in memory cache this negates the value of having in memory cache."

That is not what I suggested. I suggest having a new value holding sync instructions, which is checked on a background thread rather than read from the source every time. In other words, build the message broker by reusing the source cache.
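A rough sketch of that idea, under assumptions that are mine rather than the commenter's: the shared (L2) cache exposes an ordered log of sync instructions as `(Sequence, Key)` pairs, each node remembers the last sequence number it applied, and a background loop checks the log on an interval. `SyncLogPoller` and both delegates are hypothetical names; `PeriodicTimer` requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical background poller: reads sync instructions stored in the
// shared cache and evicts the matching entries from the local in-memory cache.
public sealed class SyncLogPoller
{
    private readonly Func<long, IReadOnlyList<(long Sequence, string Key)>> _readSince;
    private readonly Action<string> _evictLocal;
    private long _lastSeen;

    // readSince: returns the instructions with a sequence number greater than the argument.
    // evictLocal: removes one key from this node's in-memory cache.
    public SyncLogPoller(
        Func<long, IReadOnlyList<(long Sequence, string Key)>> readSince,
        Action<string> evictLocal)
    {
        _readSince = readSince ?? throw new ArgumentNullException(nameof(readSince));
        _evictLocal = evictLocal ?? throw new ArgumentNullException(nameof(evictLocal));
    }

    // Applies all unseen instructions once and returns how many were applied.
    // Intended to be called from a single loop, not concurrently.
    public int PollOnce()
    {
        var entries = _readSince(_lastSeen);
        foreach (var (sequence, key) in entries)
        {
            _evictLocal(key);
            if (sequence > _lastSeen) _lastSeen = sequence;
        }
        return entries.Count;
    }

    // Polls on a fixed interval until the token is cancelled.
    public async Task RunAsync(TimeSpan interval, CancellationToken ct)
    {
        using var timer = new PeriodicTimer(interval);
        try
        {
            while (await timer.WaitForNextTickAsync(ct)) PollOnce();
        }
        catch (OperationCanceledException)
        {
            // Normal shutdown path.
        }
    }
}
```

Reads from the in-memory cache never touch the shared store in this sketch; only the background loop does, so the staleness window is bounded by the polling interval.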

@rsalus

rsalus commented Oct 25, 2024

"and by going to a shared storage every time we wanted to read from the in memory cache this negates the value of having in memory cache."

"so we might do a simple (file-system) based message broker will notify other node of a cache that need to be evicted"

pub/sub is pretty darn efficient and is the standard solution for a variety of implementations. see FusionCache for example.

I doubt a file-system based broker would be feasible given the limitations imposed by IO. webhook might be doable but would be rather inefficient (not to mention the security implications).

marc had this to say in another thread.

@IbrahimMNada
Author

IbrahimMNada commented Oct 27, 2024

Okay, let's agree on something here:

Regardless of the cache invalidation method you choose, it should be decoupled from the hybrid cache itself, and it should be extensible, as you always do.

So I suggest, for starters, that we introduce events to notify any concerned parties.

What we could do is introduce a new property in HybridCacheOptions called Events that contains three properties:

OnGet => Action/Func<> returns (Key, CacheItem)
OnSet => Action/Func<> returns (Key, CacheItem)
OnDelete => Action/Func<> returns (Key)

In Startup.cs the consumer can then do whatever they like: send an integration event, use pub/sub, publish a MediatR notification; the options are endless.
Based on this, we could easily build extensions to handle various providers.

If you think this is a good place to start, please tell me. I'm ready to implement it.
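To make the shape of the three events concrete, here is a minimal sketch. It deliberately does not touch the real `HybridCacheOptions`: `CacheEvents` and `EventedLocalCache` are hypothetical stand-ins, the cached value is typed as `object` in place of the `CacheItem` mentioned above, and the `EvictFromRemote` method is an extra assumption added to show how a node could apply a remote eviction without re-announcing it.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical event bag mirroring the proposed OnGet / OnSet / OnDelete hooks.
public sealed class CacheEvents
{
    public Action<string, object> OnGet { get; set; } = (key, value) => { };
    public Action<string, object> OnSet { get; set; } = (key, value) => { };
    public Action<string> OnDelete { get; set; } = key => { };
}

// Hypothetical in-memory cache that raises the events; a stand-in for the
// local tier of a hybrid cache, not the real implementation.
public sealed class EventedLocalCache
{
    private readonly ConcurrentDictionary<string, object> _items = new ConcurrentDictionary<string, object>();
    private readonly CacheEvents _events;

    public EventedLocalCache(CacheEvents events)
    {
        _events = events ?? throw new ArgumentNullException(nameof(events));
    }

    // Raises OnGet only on a cache hit.
    public bool TryGet(string key, out object value)
    {
        if (!_items.TryGetValue(key, out value)) return false;
        _events.OnGet(key, value);
        return true;
    }

    public void Set(string key, object value)
    {
        _items[key] = value;
        _events.OnSet(key, value);
    }

    // Raises OnDelete only when an entry was actually removed.
    public bool Delete(string key)
    {
        if (!_items.TryRemove(key, out _)) return false;
        _events.OnDelete(key);
        return true;
    }

    // Applies an eviction announced by another node without raising OnDelete,
    // so that nodes do not echo the same invalidation back and forth.
    public void EvictFromRemote(string key) => _items.TryRemove(key, out _);
}
```

A consumer would assign `OnDelete` (and possibly `OnSet`) to whatever publishes the invalidation, for example a Redis pub/sub message or a MediatR notification, and call `EvictFromRemote` from the handler that receives it on the other nodes.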


4 participants