Caching Search Results

If you implement search on your website, you will need to cache the results somewhere temporarily. How you do this depends on how search is implemented; the two main approaches are explained below.

Option 1: Using a pre-built search solution

SaaS (software-as-a-service) providers often offer search features that you can integrate into your website. These usually have caching built in, in which case you won't need to think about this. If they don't, you will likely need to configure them to use your own caching server for this purpose.

Option 2: Doing it yourself

To cache the results of search queries yourself, you'll need to manage the entries in your cache server directly.

When a user runs some query you want to cache, you should store the results of that query - in some minimised form - in your caching server, with an expiry time. For example, if a user searches your website for "huge anime tiddies", you should add a corresponding cache entry that expires in some fairly short period (e.g. 15 minutes). 

In Redis, a simple example would look something like this:

# Set the field "PAGE:0:25" of hash "SEARCH:ARTWORKS:HUGE ANIME TIDDIES" to the JSON value.
# We have used a naming convention of "PAGE:OFFSET:LIMIT" for the page here.
HSET "SEARCH:ARTWORKS:HUGE ANIME TIDDIES" "PAGE:0:25" "<the search results in JSON form>"
# Set the hash to expire in 15 minutes.
# The NX option (available in Redis 7.0 and later) tells Redis to do nothing if an expiry time is already set, i.e. not to overwrite it.
EXPIRE "SEARCH:ARTWORKS:HUGE ANIME TIDDIES" 900 NX

We're storing the search results for 15 minutes, and telling Redis not to overwrite the expiry time if it's already been set. This may seem unintuitive, but search results in particular require some extra management, as explained in the next section.

Caching multiple pages of search results

When a user searches for "huge anime tiddies", there might well be 50,000 results. Your website is obviously not going to return and display all 50,000 entries on the first page of results; it will perhaps show the first 25 results, and when the user scrolls down or clicks to the next page, show the next 25 results.

Since your database is going to be performing searches that return a specific "page" in this context (using LIMIT and OFFSET clauses), we need to use a hash in Redis so that each cached page is stored separately. Otherwise, a user browsing to the next page of results would overwrite the cached page with the one they had just loaded. This means the hash "SEARCH:ARTWORKS:HUGE ANIME TIDDIES" is going to have one or more fields, each representing a page of search results.
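As a rough illustration of the naming scheme above, the hash key and page field could be built with two small helpers. The function names are hypothetical, and collapsing whitespace before upper-casing the query is an assumption (the examples here only show that keys are upper-cased); "mountain sunsets" is just a placeholder query.

```python
def search_hash_key(scope: str, query: str) -> str:
    """Build the hash key for one search, e.g. SEARCH:ARTWORKS:<QUERY>."""
    # Assumption: collapse runs of whitespace so that equivalent queries
    # share one cache entry, then upper-case as in the examples above.
    normalised = " ".join(query.split()).upper()
    return f"SEARCH:{scope.upper()}:{normalised}"


def page_field(offset: int, limit: int) -> str:
    """Build the field name for one page, using the PAGE:OFFSET:LIMIT convention."""
    return f"PAGE:{offset}:{limit}"


key = search_hash_key("artworks", "  mountain   sunsets ")
field = page_field(0, 25)
print(key, field)  # SEARCH:ARTWORKS:MOUNTAIN SUNSETS PAGE:0:25
```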

If we didn't use the NX option to prevent re-applying the expiry to the hash, then every time another page of the same search was cached within the 15-minute window, the expiry of the whole hash would be pushed back. For a popular query the hash might never expire, and users would see a lot of out-of-date results. What we want is that, 15 minutes after the hash has been created, it expires, and any subsequent search gets a fresh result from the database server.
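To make the difference concrete, here is a small pure-Python model of the expiry behaviour. It does not talk to Redis; `first_expiry` and its parameters are names invented for this illustration, and it only models the rule "EXPIRE resets the deadline unless NX is given and a deadline already exists".

```python
def first_expiry(expire_calls, ttl, nx):
    """Return the time (in seconds) at which the key first expires.

    expire_calls: times at which EXPIRE is issued on the key.
    ttl: the TTL passed to each EXPIRE call.
    nx: whether each call uses the NX option.
    """
    deadline = None
    for t in sorted(expire_calls):
        if deadline is not None and t >= deadline:
            break  # the key had already expired before this call
        if deadline is None or not nx:
            deadline = t + ttl  # without NX, every call pushes the deadline back
    return deadline


# A new page of the same search is cached every 10 minutes.
calls = [0, 600, 1200, 1800]
print(first_expiry(calls, ttl=900, nx=True))   # 900: expires 15 minutes after creation
print(first_expiry(calls, ttl=900, nx=False))  # 2700: the deadline keeps sliding
```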

In Redis terms, EXPIRE applies to the whole key, so when the hash expires, all fields within it expire together. (Redis 7.4 and later can expire individual hash fields with HEXPIRE, but that isn't what we want here.) Whole-hash expiry is good in this case, because otherwise it'd make for a weird situation where some result pages were cached and others weren't, leading to a very confusing browsing experience.

Displaying the cached results

When a user searches, you want to check if there is a cached result before querying the database.

In this example, let's suppose a user gets some coffee after browsing through huge anime tiddies, then comes back and clicks through to page 4 of the search results. With 25 results per page, they are requesting entries 75-99 (counting from 0), i.e. LIMIT 25 OFFSET 75 in your corresponding SQL query.
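The page-to-offset arithmetic can be written down as a short helper. This is only a sketch: `page_window` is a hypothetical name, and it assumes pages are numbered from 1 with a fixed page size.

```python
def page_window(page: int, per_page: int = 25) -> tuple[int, int]:
    """Return (offset, limit) for a 1-based page number."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    offset = (page - 1) * per_page
    return offset, per_page


offset, limit = page_window(4)
print(offset, limit)  # 75 25, i.e. LIMIT 25 OFFSET 75
```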

To check for a cached entry in Redis, you would use:

# Get the value of field "PAGE:75:25" from the hash "SEARCH:ARTWORKS:HUGE ANIME TIDDIES".
HGET "SEARCH:ARTWORKS:HUGE ANIME TIDDIES" "PAGE:75:25"

This will return either a JSON string (the search results) if a cached entry exists, or nil (usually surfaced as null or None by client libraries) if it doesn't. If you get a result, decode the JSON and use it to show results to the user; if not, perform the database query and add its result to the cache.
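Put together, the lookup-then-fallback flow might look something like the following in Python. This is a minimal sketch, not a definitive implementation: `get_search_page` and `fetch_from_db` are hypothetical names, and `client` is assumed to behave like a redis-py client (`hget`, `hset`, and an `expire` that accepts `nx=True`, which recent redis-py versions provide and which needs Redis 7.0 or later on the server).

```python
import json

CACHE_TTL_SECONDS = 900  # 15 minutes, matching the Redis example


def get_search_page(client, hash_key, field, fetch_from_db, ttl=CACHE_TTL_SECONDS):
    """Return one page of results, from the cache if possible."""
    cached = client.hget(hash_key, field)
    if cached is not None:
        # Cache hit: decode the stored JSON and skip the database.
        return json.loads(cached)

    # Cache miss: run the (caller-supplied) database query.
    results = fetch_from_db()
    client.hset(hash_key, field, json.dumps(results))
    # nx=True corresponds to EXPIRE ... NX: only set the expiry if the
    # hash doesn't already have one, so the 15-minute window never slides.
    client.expire(hash_key, ttl, nx=True)
    return results
```

With redis-py installed, `client` could be something like `redis.Redis(decode_responses=True)`, and `fetch_from_db` a function that runs your paginated SQL query and returns a JSON-serialisable list. Note that the HSET and EXPIRE here are two separate commands; if that matters for your setup, they could be sent in a single transaction.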