
Introduction to caching

Caching is a fairly big topic, but for the purposes of this chapter, we are looking at server-side caching. That means your web servers, cache servers and other infrastructure - i.e. not your users' computers - performing the caching.

The kind we will pay the most attention to here is caching data in a dedicated cache server, such as Redis. This is very beneficial for any website that serves dynamic content, as it relieves a lot of the strain on your other infrastructure - especially your database server - when it comes to expensive work like running search queries.

Although databases are made to be extremely efficient, large and complex search queries can take considerable time. On its own that isn't a big deal - a query will rarely seem slow when you run it in the console outside of a huge production database - but if, for example, your users are running a ton of searches and you don't cache them, it can quickly bring your database server to its knees.

The objective: caching data that doesn't need to be up-to-date

For example: let's suppose a user on your website searches for "huge anime tiddies". Your database goes and searches a gigantic table of data for information matching that query, and returns the result to the user.
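To make that flow concrete, here's a minimal sketch of the uncached search. The table name, columns and sample rows are all made up for illustration - a real table would hold millions of rows, which is exactly what makes the `LIKE` scan expensive:

```python
import sqlite3

# Hypothetical miniature version of the site's database. In production
# this table would be huge and live on a dedicated database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE videos (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO videos (title) VALUES (?)",
    [
        ("huge anime tiddies vol. 1",),
        ("cooking with gas",),
        ("huge anime tiddies vol. 2",),
    ],
)

def search(term):
    # A LIKE query like this forces the database to scan the table -
    # the expensive work the rest of this chapter is about avoiding.
    cur = conn.execute(
        "SELECT title FROM videos WHERE title LIKE ?", (f"%{term}%",)
    )
    return [row[0] for row in cur.fetchall()]
```

Every call to `search` repeats the full scan, even when the answer hasn't changed - which is the problem the next sections address.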

The user is not going to care whether searching for "huge anime tiddies" at, say, 17:59 on a Monday returns a different result than searching at 17:58 that same Monday. For one thing, they have no way of knowing whether anybody uploaded new content matching that term in that timeframe; more importantly, up-to-the-minute accuracy just isn't necessary here.

When you have a few users, this doesn't matter: the load is small. When you have, say, a million users - and 10,000 of them are searching for anime tiddies at any given time - the load is huge. Your database server is performing a ton of searches, likely returning exactly the same results, yet repeating all the same expensive computations and scanning the entire contents of the tables involved in the query every time.

Instead, it makes sense to cache the results of the query for a given time period, depending on how up-to-date the information needs to be. If you cache it for 15 minutes, then the query is only run once every 15 minutes at most; anyone searching for "huge anime tiddies" when there is already a cached result gets the cached result. Unlike your database server, the cache is not searching for anything or performing any computation; it's just returning the result it already has. This is much more efficient for your website, resulting in faster searches and less strain on your servers.