How it works

Esta Nagy edited this page Jan 19, 2021 · 2 revisions

Caching flow in general

The flow of using a cache is rather simple when caching itself is easy to achieve. We insert a caching layer between our service code and the dependency (the origin of the data) and let the caching code try the cache first. Depending on the use case, and on how well we understood and implemented our cache, most calls will end up as either cache MISS or cache HIT cases.

Cache MISS

The cache MISS flow is the more complicated of the two: in addition to returning the data from the origin, it must also populate the cache with that fresh data.

[Diagram: Cache MISS flow]

Cache HIT

The cache HIT flow, by contrast, is simpler and cheaper because we can avoid calling the origin.

[Diagram: Cache HIT flow]
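The two flows above can be sketched in a few lines. This is only a minimal illustration, not code from Cache-Only: the function name `get_with_cache`, the plain `dict` used as the cache, and the `origin` callable are all assumptions made for the example.

```python
from typing import Callable, Dict, Hashable, Tuple


def get_with_cache(
    key: Hashable,
    cache: Dict[Hashable, object],
    origin: Callable[[Hashable], object],
) -> Tuple[object, str]:
    """Return (value, outcome), where outcome is "HIT" or "MISS".

    Hypothetical helper: the cache is tried first, the origin is
    only called when the key is not cached yet.
    """
    if key in cache:
        # HIT: the origin is not called at all.
        return cache[key], "HIT"
    # MISS: fetch from the origin, then populate the cache with the fresh data.
    value = origin(key)
    cache[key] = value
    return value, "MISS"
```

Calling this twice with the same key should produce a MISS followed by a HIT, with the origin invoked only once.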

As you can imagine, our goal is to have more HIT cases than MISS cases: we don't do caching just for the sake of caching, we expect some benefit in return. This benefit can be:

  • cheaper operation,
  • less load on the origin service,
  • faster responses,
  • or more reliable service (since we are less dependent on the origin).
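To see why the HIT ratio matters for the benefits listed above, here is a rough back-of-the-envelope model. It assumes that every call pays for a cache lookup and that only MISS cases additionally pay for the origin call. The function name and the numbers are hypothetical, and the cost of writing to the cache on a MISS is ignored.

```python
def expected_cost(hit_ratio: float, cache_cost: float, origin_cost: float) -> float:
    """Average cost per call (for example in milliseconds) under a simplified model."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # Every call tries the cache first; only the misses reach the origin.
    return cache_cost + (1.0 - hit_ratio) * origin_cost


# Hypothetical numbers: 2 ms cache lookup, 40 ms origin call, 75% HIT ratio.
print(expected_cost(0.75, 2.0, 40.0))  # 12.0
```

With a HIT ratio of 0 the same model gives 42 ms per call, which is worse than calling the origin directly, so a cache that rarely hits only adds overhead.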

Cache-Only

Cache-Only takes a slightly different approach, so the flow needs some adjustment, but ultimately it uses the same building blocks. It deals with bulk requests and bulk responses: a single request and response pair contains information about multiple (possibly all) entities needed for a single service invocation, instead of calling the origin service N times. This introduces a number of new things we need to address:

  • We need to know the partition size (the maximum N value which can be accepted by the origin in a single request).
  • We must decide how we want to react to certain situations (e.g., if we see a MISS).
  • We need to know how to split and merge requests and responses between the bulk and single request cases.
  • We can have new cases (partial HIT or partial MISS, depending on your preference).
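The points above can be illustrated with three small helpers. This is a sketch under assumptions, not the Cache-Only API: the names `partition`, `merge` and `classify`, and the idea of representing a bulk request as a list of keys and a bulk response as a `dict`, are invented for the example.

```python
from typing import Dict, Hashable, List, Sequence


def partition(keys: Sequence[Hashable], partition_size: int) -> List[List[Hashable]]:
    """Split a bulk request into chunks the origin can accept (at most partition_size keys)."""
    if partition_size < 1:
        raise ValueError("partition_size must be at least 1")
    return [list(keys[i:i + partition_size]) for i in range(0, len(keys), partition_size)]


def merge(responses: Sequence[Dict[Hashable, object]]) -> Dict[Hashable, object]:
    """Merge several partial responses into a single bulk response."""
    merged: Dict[Hashable, object] = {}
    for response in responses:
        merged.update(response)
    return merged


def classify(keys: Sequence[Hashable], cache: Dict[Hashable, object]) -> str:
    """Tell whether a bulk request is a full HIT, a full MISS, or PARTIAL."""
    found = sum(1 for key in keys if key in cache)
    if found == len(keys):
        return "HIT"  # an empty request also lands here, since nothing is missing
    if found == 0:
        return "MISS"
    return "PARTIAL"
```

For example, `partition([1, 2, 3, 4, 5], 2)` yields three chunks, the last one containing a single key.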

The partial MISS case can look like the following diagram:

[Diagram: Cache-Only partial MISS]

As you can see, there is quite a lot of looping. Let's see why.

What will Cache-Only do?

While processing a request, Cache-Only should perform the following steps:

  1. Split the request into small parts (for example, each containing a request for a single entity).
  2. Try the cache with each of these parts (depending on our strategy).
  3. After evaluating which parts were found and which are missing, form bulk requests no larger than the partition size to get the missing parts from the origin.
  4. Call the origin with each partition.
  5. Split the responses into small parts and update the cache with the fresh data (depending on our strategy).
  6. Merge all partition responses into a single response.
  7. Return the single response.

As you can see, this handles both partitioning and caching on your behalf, so your service implementation can focus on what it needs to solve with the data.
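The steps listed above can be sketched as a single function. This is a simplified illustration and not the library's implementation: `fetch_bulk`, the `dict` cache and the `origin_bulk` callable (which takes a list of keys and returns a `dict` of results) are assumptions. It only fetches the missing parts, which is roughly the behaviour described for the OPTIMISTIC strategy later on this page.

```python
from typing import Callable, Dict, Hashable, List, Sequence


def fetch_bulk(
    keys: Sequence[Hashable],
    cache: Dict[Hashable, object],
    origin_bulk: Callable[[List[Hashable]], Dict[Hashable, object]],
    partition_size: int,
) -> Dict[Hashable, object]:
    """Hypothetical bulk fetch that tries the cache first and partitions origin calls."""
    if partition_size < 1:
        raise ValueError("partition_size must be at least 1")
    # 1. Split the request into single-entity parts (dropping duplicate keys).
    unique_keys = list(dict.fromkeys(keys))
    # 2. Try the cache with each part.
    found = {key: cache[key] for key in unique_keys if key in cache}
    missing = [key for key in unique_keys if key not in cache]
    # 3. Form bulk requests no larger than the partition size.
    partitions = [
        missing[i:i + partition_size] for i in range(0, len(missing), partition_size)
    ]
    for part in partitions:
        # 4. Call the origin with each partition.
        response = origin_bulk(part)
        # 5. Split the response into small parts and update the cache.
        for key, value in response.items():
            cache[key] = value
            found[key] = value
    # 6-7. Merge everything into a single response, in request order.
    #      Keys the origin did not return are simply absent from the result.
    return {key: found[key] for key in unique_keys if key in found}
```

With key 1 already cached, a request for keys 1 to 4 and a partition size of 2 would call the origin twice, once with `[2, 3]` and once with `[4]`. Repeating the same request afterwards would not call the origin at all.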

Cache refresh strategies

Cache-Only relies on these strategy implementations to keep cache usage flexible and customizable for your use case.

The source code can be consulted if needed, but the following table gives a quick summary to help you choose.

| Name | Description |
|------|-------------|
| CACHE_ONLY | Only calls the origin service when explicit cache refresh is performed. The items not found in cache will be simply skipped. |
| OPTIMISTIC | Only calls the batch service for entities when they weren't found in the cache. |
| OPPORTUNISTIC | Calls the batch service with the maximum amount of request items when we must call the service anyway in order to refresh some additional items on top of the missed ones. |
| PESSIMISTIC | If any of the items were not found in the cache, it won't even try the rest of the cached items and will call the origin service for all of the items. This can reduce the overhead spent on caching when we know cache miss occurrences are likely signaling a larger number of items missing. |
| NEVER_CACHE | Never uses cache for read or write. Useful when you want to use only the request partitioning and response merging functionality. |
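To compare the strategies side by side, the sketch below computes which keys each one would send to the origin for a given request. It is one reading of the descriptions above, not the library's code: the `Strategy` enum, the `keys_to_fetch` function and the way OPPORTUNISTIC pads the last partition with already cached keys are all assumptions.

```python
from enum import Enum
from typing import Dict, Hashable, List, Sequence


class Strategy(Enum):
    CACHE_ONLY = "CACHE_ONLY"
    OPTIMISTIC = "OPTIMISTIC"
    OPPORTUNISTIC = "OPPORTUNISTIC"
    PESSIMISTIC = "PESSIMISTIC"
    NEVER_CACHE = "NEVER_CACHE"


def keys_to_fetch(
    strategy: Strategy,
    keys: Sequence[Hashable],
    cache: Dict[Hashable, object],
    partition_size: int,
) -> List[Hashable]:
    """Return the keys that would be requested from the origin (hypothetical helper)."""
    if partition_size < 1:
        raise ValueError("partition_size must be at least 1")
    if strategy is Strategy.NEVER_CACHE:
        return list(keys)  # the cache is not consulted at all
    missing = [key for key in keys if key not in cache]
    cached = [key for key in keys if key in cache]
    if strategy is Strategy.CACHE_ONLY:
        return []  # missing items are skipped, the origin is not called
    if strategy is Strategy.OPTIMISTIC:
        return missing
    if strategy is Strategy.PESSIMISTIC:
        return list(keys) if missing else []
    if strategy is Strategy.OPPORTUNISTIC:
        if not missing:
            return []
        # Assumption: fill the unused room of the last partition with cached keys
        # so they get refreshed by a call we have to make anyway.
        spare = -len(missing) % partition_size
        return missing + cached[:spare]
    raise ValueError(f"unknown strategy: {strategy}")
```

For a request with keys 1 to 5, keys 1 and 2 cached, and a partition size of 2, OPTIMISTIC would fetch `[3, 4, 5]`, while OPPORTUNISTIC would fetch `[3, 4, 5, 1]` so that the second partition is full as well.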