Journal of Distributed Software Engineering, Architecture and Design
Common Service Caching Patterns
<p class="wp-block-paragraph">The word “cache” implies hiding something away. In a technology context this means “some service hides some data for a period of time (minutes to years)”.</p>
<p class="wp-block-paragraph">A cache is a bit of information we stash away to better serve the clients using our service. Some of the concerns cache implementers face are:</p>
<ul class="wp-block-list"><li>Cache expiry and invalidation</li><li>Mechanism for refreshing a cache – push or pull</li><li>Cache sizing</li><li>Data security if the cache is long-lived or contains sensitive data</li></ul>
<p class="wp-block-paragraph">These factors help determine the caching strategy we want to apply. In general, attributes related to space, time and refresh mechanism influence the strategy, so we can talk about a near or far cache, a cache-aside or source-of-record cache, and push or pull refresh.</p>
<p class="wp-block-paragraph">A caching implementation holds this information in memory (volatile) or offloads it to a persistent store. When the cache does not hold the true copy of the information, invalidation and refreshing become an issue.</p>
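<p class="wp-block-paragraph">As a minimal sketch of the cache-aside read path (in Python, with a hypothetical <code>load_from_store</code> standing in for the system of record):</p>

```python
# Cache-aside: the service checks its local copy first and only
# falls back to the source of record on a miss.
cache = {}

def load_from_store(key):
    # Hypothetical call to the system of record (database or remote API)
    return f"value-for-{key}"

def get(key):
    if key in cache:                  # cache hit: no network call
        return cache[key]
    value = load_from_store(key)      # cache miss: go to the source
    cache[key] = value                # stash the copy for next time
    return value
```

<p class="wp-block-paragraph">Note the cache here is not the source of truth – which is exactly why the invalidation question below matters.</p>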
<h4 class="wp-block-heading">Why cache in our application?</h4>
<p class="wp-block-paragraph">We cache information for high availability and responsiveness. Mobile applications, for example, use local storage on the device instead of invoking remote APIs, to save on network calls.</p>
<p class="wp-block-paragraph">Caching is therefore a choice of Availability over Consistency under Partition (the CAP theorem). Note, we shall soon see that some caches are built as Operational Data Stores (ODS) – meaning they are consistent.</p>
<figure class="wp-block-image size-large is-resized"><img src="https://alok-mishra.com/wp-content/uploads/2020/08/screen-shot-2020-08-26-at-9.55.49-pm.png?w=542" alt="" class="wp-image-1528" width="398" height="495" /><figcaption>Example of two microservices for a Portal channel serving straight from the system of record</figcaption></figure>
<p class="wp-block-paragraph">Caches make applications feel snappy and responsive, but they can serve stale information if not refreshed.</p>
<h2 class="wp-block-heading">Cache types and patterns</h2>
<ul class="wp-block-list"><li><strong>Persistent vs Volatile</strong>: Does it remain when the power is turned off?</li><li><strong>Cache-aside vs Operational data</strong>: Is it the source of truth or does it periodically refresh this information from another source</li></ul>
<figure class="wp-block-image size-large"><img src="https://alok-mishra.com/wp-content/uploads/2020/08/screen-shot-2020-08-26-at-10.03.56-pm.png?w=464" alt="" class="wp-image-1530" /><figcaption>Cache Storage types</figcaption></figure>
<h2 class="wp-block-heading">Invalidating a cache</h2>
<p class="wp-block-paragraph">Cache invalidation is famously one of the hardest problems in Computer Science. Knowing when to clear your copy of the data is key, especially when you are not the master of the information.</p>
<p class="wp-block-paragraph">Some of the invalidation strategies are</p>
<ul class="wp-block-list"><li>Events from the master: Subscribe to refresh events or full updates from the master system</li><li>Periodic refreshes: Use a timer to refresh your cache (especially if there are known update cycles in the master data system)</li><li>Explicit invalidation: Use an API to clear your cache</li></ul>
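<p class="wp-block-paragraph">Two of these strategies can be sketched in a few lines of Python – a TTL acts as the refresh timer, and an <code>invalidate</code> method is the explicit API (events from a master system would simply call it). This is an illustrative sketch, not a production cache:</p>

```python
import time

class TtlCache:
    """A tiny cache with periodic expiry (TTL) and explicit invalidation."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.entries = {}  # key -> (value, stored_at)

    def put(self, key, value):
        self.entries[key] = (value, time.monotonic())

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if time.monotonic() - stored_at > self.ttl:
            # Periodic refresh: the entry is too old, force a reload
            del self.entries[key]
            return None
        return value

    def invalidate(self, key):
        # Explicit invalidation: called via an API or a master-system event
        self.entries.pop(key, None)
```

<p class="wp-block-paragraph">A miss (expired or invalidated entry) returns <code>None</code>, signalling the caller to reload from the master system.</p>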
<figure class="wp-block-image size-large"><img src="https://alok-mishra.com/wp-content/uploads/2020/08/screen-shot-2020-08-26-at-10.08.56-pm.png?w=992" alt="" class="wp-image-1532" /><figcaption>Cache implementation patterns</figcaption></figure>
<h2 class="wp-block-heading">Know when to cache ’em</h2>
<ol class="wp-block-list"><li><strong>Availability</strong>: It is 2020 and your users want a response now! Drop-downs need to be snappy, lookups in O(1) and fewer network calls across the pond</li><li><strong>Reliability</strong>: You like your consumers, but not enough to let them smash the heck out of your core business systems. Self-service portals and mobile apps backed by APIs are more vulnerable to scripted attacks, and if your service is lazy and always goes to the system of record, you are risking an outage</li><li><strong>De-coupling</strong>: Separation of concerns – Command vs Query. You want to isolate the process that accepts a request to create/update data from the one that reads it, to reduce coupling between the two contexts. This ensures that a sudden rush of users trying to read the state of a transaction does not block the creation of new transactions. For example, order booking can continue even if there is a flood of requests for order queries</li></ol>
<figure class="wp-block-image size-large is-resized"><img src="https://alok-mishra.com/wp-content/uploads/2020/08/screen-shot-2020-08-26-at-9.56.05-pm.png?w=856" alt="" class="wp-image-1526" width="509" height="416" /><figcaption>Applying Caching on Queries</figcaption></figure>
<h2 class="wp-block-heading">Know when not to cache</h2>
<ol class="wp-block-list"><li><strong>Consistency</strong>: The application needs to serve the current true state of the information (how much money do I have now vs my transaction history)</li><li><strong>Cache Key Complexity</strong>: <em>Searches</em> are hard to cache because they generate a large and complex set of search keys for the results. For example, consider implementing a cache for a <em>type-ahead</em> search where each word typed is a callout to a search API. The result set for each word would require a large memory footprint and is notoriously hard to size for. A better approach is to cache only the individual items (resources) returned in the result set, and not the search result array itself</li><li><strong>Ambiguity</strong>: This relates to consistency. If you do not know when to refresh your cache, especially if you are not the master system or if the information changes in real time, then look for other solution options. For example, a website has a system that updates a user’s account balance in real time (betting and gambling?) and the Account API is looking to scale to hundreds or thousands of user requests per second (Melbourne Cup day?) – would you cache the user’s account (money) information, or look at some other strategy (streaming, HTTP push)?</li></ol>
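<p class="wp-block-paragraph">The “cache the resources, not the search array” idea from point 2 can be sketched as follows (the search function and product data here are hypothetical placeholders for a real search API):</p>

```python
# Instead of caching the result array keyed by the search string
# (an unbounded, hard-to-size key space), cache each returned
# resource under its own id.
resource_cache = {}

def search_products(term):
    # Hypothetical search API call; the result array itself is NOT cached
    results = [{"id": 1, "name": "widget"}, {"id": 2, "name": "widget pro"}]
    for item in results:
        resource_cache[item["id"]] = item  # cache individual resources
    return [item["id"] for item in results]  # hand back lightweight ids

def get_product(product_id):
    # Individual lookups hit a bounded, well-keyed cache in O(1)
    return resource_cache.get(product_id)
```

<p class="wp-block-paragraph">The cache key space is now the set of resource ids – bounded and predictable – rather than every search string a user might type.</p>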
<h2 class="wp-block-heading">Summary</h2>
<p class="wp-block-paragraph">One of the API anti-patterns is going straight to the system-of-record data, especially when retrieving data in public-facing web applications. The best way to serve information is to understand where it sits on the spectrum of static to dynamic, and then implement a solution that serves the data with the highest achievable degree of consistency.</p>
<p class="wp-block-paragraph">Static, non-changing resources are served by CDNs. The next layer relies on your APIs and how effectively you implement caching. I hope this post gave you a taste of the types of caches and strategies. There is certainly a lot more to caching than I have covered here – the internet is a giant knowledge cache, happy searching!</p>
Alok brings experience in engineering and architecting distributed software systems from over 20 years across industry and consulting. His posts focus on Systems Integration, API design, Microservices and Event driven systems, Modern Enterprise Architecture and other related topics