+++
date = '2024-11-13T14:24:21+01:00'
draft = true
title = 'Hard Problem: Invalidating the browser cache'
+++

I had a bit of an issue with my website recently.

I pushed some changes incorporating images for the first time (I know, very swish), and everything seemed to be working just fine, but when I loaded the production site in Firefox, the images were not styled. Stranger still, they were styled when I loaded the same page in Chrome.

The experienced computer touchers amongst you will be saying "this is obviously a cache problem", and you're right, it is obviously a cache problem. Pressing Ctrl + Shift + R (which forces Firefox to bypass the cache and do a full reload) proved this thesis, and solved the problem handily for me, on my machine. But what about other people's machines?

Invalidating cached HTML

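The best way to deal with this problem is to tell the browser not to reuse a cached copy of our HTML without checking with the server first. The reliable way to do that is for the server to send a Cache-Control: no-cache response header with each HTML file, but that means touching the server configuration. A commonly suggested shortcut is to add the following meta tag to index.html, and any other HTML files we don't want cached. Be aware that modern browsers largely ignore this tag, so treat it as a fallback at best.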

  <meta http-equiv="pragma" content="no-cache" />
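Since caching policy is ultimately carried by the Cache-Control response header, it can be worth checking what the server actually sends. The sketch below pulls that header out of a raw set of response headers. The function name cache_control and the sample headers are made up for illustration; against a real site you would pipe in the output of something like curl -sI with your own URL instead.

```shell
#!/usr/bin/env bash

# Print the Cache-Control value found in raw HTTP response headers on stdin.
# (cache_control is a hypothetical helper name, not part of any tool.)
cache_control() {
  # Strip carriage returns, then match the header name case-insensitively.
  tr -d '\r' | awk -F': *' 'tolower($1) == "cache-control" { print $2 }'
}

# Hypothetical sample headers, standing in for a real server response.
policy="$(printf 'HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nCache-Control: no-cache\r\n' | cache_control)"
printf '%s\n' "$policy"
```

If nothing is printed for your own pages, the server is not sending a Cache-Control header for them and the browser falls back on its own heuristics.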

Invalidating cached CSS

A quick Google search revealed that the best way to invalidate the browser cache is by changing the URL of the file you're telling it to load. So we would change this:

<link rel="stylesheet" href="css/defaults.css" />

to this:

<link rel="stylesheet" href="css/defaults-2.css" />

and the browser would recognize this as a new file and load it from the server. Problem solved! Of course, you would have to change the file name too...

mv css/defaults.css css/defaults-2.css

... and this would get tedious very quickly. Furthermore, it's going to make a mess of your version history if, as far as Git is concerned, you're deleting the CSS file and writing a new one with every deployment. Surely there's a better way?

Using a query

Of course there is. Look at this:

<link rel="stylesheet" href="css/defaults.css?v=2"/>

As we're requesting the file via HTTP, we can append a query string. The browser treats the URL with a different query as a different resource, even though the file on disk hasn't moved. Awesome. Not awesome enough though. I'm too lazy to do this every time I push a commit, and, being human, I'll probably forget at a critical moment. This can only mean one thing. It's time to bash (🤣) out a quick build script!

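#!/usr/bin/env bash
# Use the current commit id as the cache-busting version.
COMMIT="$(git rev-parse HEAD)"
# Rewrite every "css?v=<version>" in index.html (needs GNU sed for -i and \w).
sed -i "s/css?v=\w*/css?v=${COMMIT}/g" index.html

Let's talk real quick about what's happening here:

COMMIT="$(git rev-parse HEAD)" gets the commit id from Git and assigns it to the variable COMMIT.

Then, sed -i "s/css?v=\w*/css?v=${COMMIT}/g" index.html does a find and replace on index.html. The regular expression css?v=\w* matches the literal text 'css?v=' (in sed's default basic syntax the ? is an ordinary character) plus any run of letters, digits, or underscores (everything until the next quote mark, basically), and the replacement swaps in 'css?v=' followed by the commit id. The flag -i tells sed to edit the file in place, and the g tells it to replace every match on a line rather than just the first. Both -i without a suffix and \w are GNU sed behaviour, so the sed that ships with macOS needs a slightly different invocation. Note also that the href must already contain ?v= for the pattern to find anything.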
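A quick way to check a substitution like this before pointing it at index.html is to run it on a single sample line. This is only a sketch: the commit id and the sample line are placeholders, the pattern assumes the link already carries a ?v= query, and \w assumes GNU sed.

```shell
#!/usr/bin/env bash

# Placeholder values standing in for the real commit id and a line of index.html.
COMMIT="abc1234"
line='<link rel="stylesheet" href="css/defaults.css?v=2" />'

# In sed's basic regex syntax "?" is a literal question mark;
# \w (a GNU extension) matches letters, digits, and underscores.
result="$(printf '%s\n' "$line" | sed "s/css?v=\w*/css?v=${COMMIT}/g")"
printf '%s\n' "$result"
```

The old version 2 is replaced by the placeholder id and the rest of the tag is left alone, because the match stops at the closing quote mark.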

Now, whenever we run the script on a new commit, any CSS imports in index.html that already carry a ?v= query will be changed to something like this:

<link rel="stylesheet" href="css/styles.css?v=ab10c24280844c10c10c1adfb8b85b03b316f72b" />

Pretty neat, huh?

There's just one thing bugging me: surely I do actually want the CSS to be cached sometimes. Caching exists for a reason, and I don't want to sacrifice performance. As it stands, every commit changes the query, so browsers re-download the CSS even when the stylesheet itself hasn't changed. Maybe I can modify the build script so that it only updates the CSS imports when the CSS files have changed... Sounds like a topic for another blog post!
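A minimal sketch of one way that could work: derive the version from the contents of the CSS file instead of the commit id, so the query only changes when the file does. The function name css_version and the 12-character truncation are made up for illustration, and sha256sum is assumed to be available (it ships with GNU coreutils).

```shell
#!/usr/bin/env bash

# Print a short version string derived from a file's contents.
# (css_version is a hypothetical helper; assumes sha256sum is installed.)
css_version() {
  sha256sum "$1" | cut -c1-12
}

# Demo on a throwaway file rather than the real stylesheet.
demo="$(mktemp)"
printf 'body { margin: 0; }\n' > "$demo"
v1="$(css_version "$demo")"
v2="$(css_version "$demo")"   # file unchanged, so the version is the same
printf 'body { margin: 1em; }\n' > "$demo"
v3="$(css_version "$demo")"   # file edited, so the version changes
rm -f "$demo"

printf '%s %s %s\n' "$v1" "$v2" "$v3"
```

In the build script, the result of css_version css/defaults.css could then stand in for the commit id. Another option, if you would rather stay with Git, is git log -1 --format=%H -- css/, which prints the id of the last commit that touched anything under css/.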