Browser Cache and Edge Cache Explained
You updated it? Well, yes, but not really.
08 Feb 2022
182 lines
Does the following sound familiar?
- "The new feature is live."
- "I can't see it."
- "Have you cleared your browser cache?"
- "Yes. It's still not visible."
- "Hmmm…"
This can happen when there are multiple layers of caching in play and one of them has not expired yet. If you've used something like Cloudflare CDN, you've likely run into this.
Let's say your website has a logo in the header and it appears on all pages. When users land on your website and start browsing, they'll be downloading the same logo on every page load. This inefficiency increases bandwidth costs and load times.
When sending the logo to the user, your server can also provide the Cache-Control response header with an expiration value like max-age=604800. This will tell the browser to store the logo locally, on the user's hard drive, for 604800 seconds, which is 7 days. Until that time passes, whenever your website references the logo through the same URL, the user's browser will use its cached version. After that, the browser will request the logo from your server once again and cache it for another 7 days.
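As a minimal sketch of the arithmetic and the freshness rule described above, the snippet below builds the header value and checks whether a cached copy is still fresh. The function names are hypothetical, and the check deliberately ignores the other directives and headers that real browsers also take into account.

```python
from datetime import timedelta

# 7 days expressed in seconds, the unit that max-age expects.
SEVEN_DAYS = int(timedelta(days=7).total_seconds())


def cache_control(max_age_seconds: int) -> str:
    """Build a Cache-Control response header value (hypothetical helper)."""
    return f"max-age={max_age_seconds}"


def is_fresh(age_seconds: int, max_age_seconds: int) -> bool:
    """A cached response counts as fresh while its age is below max-age."""
    return age_seconds < max_age_seconds


print(cache_control(SEVEN_DAYS))            # max-age=604800
print(is_fresh(3 * 24 * 3600, SEVEN_DAYS))  # True: 3 days old, served from cache
print(is_fresh(8 * 24 * 3600, SEVEN_DAYS))  # False: 8 days old, requested again
```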
If your audience is mainly European, it would probably make sense to put your server in London, for example. But if your website starts receiving traffic from America, those requests would have to cross an entire ocean, which can make them several times slower due to latency. With a CDN, you have a system of servers spread throughout the world that proxy user traffic to your server and solve this performance issue.
Among the hundreds of servers in a CDN, the geographically closest one to a user is called an edge and is the one that this user's browser directly communicates with:
Response flow from server to user over a CDN
While browser caching helps with many requests by the same user, it doesn't solve the issue of many users making requests, because each user has their own browser with their own cache. To fix this, you need a shared cache.
With a CDN, all user requests are funneled through the closest edge server. If that server caches the responses, it'll be able to serve those users without having to bother your own server. This means that it no longer matters if you have 1 user or 10,000, because once that first response for a resource is served and cached, the load shifts from your server to the CDN, which can serve the other 9,999 by itself.
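The effect can be illustrated with a toy simulation. This is only a sketch: `Origin` and `Edge` are hypothetical stand-ins for your server and a single edge server, and the cache here never expires, which a real CDN's would.

```python
class Origin:
    """Stand-in for your own server; counts how often it is contacted."""

    def __init__(self) -> None:
        self.hits = 0

    def fetch(self, url: str) -> bytes:
        self.hits += 1
        return f"content of {url}".encode()


class Edge:
    """Stand-in for one edge server with a cache shared by all its users."""

    def __init__(self, origin: Origin) -> None:
        self.origin = origin
        self.cache = {}  # maps URL to cached response body

    def get(self, url: str) -> bytes:
        # Only the first request for a URL reaches the origin.
        if url not in self.cache:
            self.cache[url] = self.origin.fetch(url)
        return self.cache[url]


origin = Origin()
edge = Edge(origin)
for _user in range(10_000):
    edge.get("/logo.png")

print(origin.hits)  # 1
```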
Cloudflare caches only static assets by default, such as images, fonts, scripts, etc. The HTML containing the page content is not cached, because if you have an e-commerce store, for example, each user will have their own shopping cart, and if you cache that, the next user will see the previous one's items. On the other hand, an image is expected to be the same for every user and can safely be cached.
Sites without shopping carts and other dynamic content, such as blogs, can opt in to cache everything. This makes it possible for an edge to serve entire pages all by itself, without having to contact your server, as long as everything is cached. This way, you can have a very weak server, yet handle thousands of requests per second, because the load is mainly on the CDN.
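The decision an edge makes could be sketched roughly like this. The extension set is a small illustrative subset chosen for this example, not Cloudflare's actual list, and `edge_should_cache` with its `cache_everything` flag is a hypothetical name.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Illustrative subset only; the real default list is longer and may differ.
STATIC_EXTENSIONS = {".png", ".jpg", ".svg", ".woff2", ".css", ".js"}


def edge_should_cache(url: str, cache_everything: bool = False) -> bool:
    """Decide whether a response for this URL would be stored at the edge."""
    if cache_everything:
        # Opt-in mode for sites without per-user content, such as blogs.
        return True
    # Default mode: look only at the file extension of the URL path.
    suffix = PurePosixPath(urlsplit(url).path).suffix.lower()
    return suffix in STATIC_EXTENSIONS


print(edge_should_cache("https://example.com/logo.png"))   # True
print(edge_should_cache("https://example.com/cart"))       # False
print(edge_should_cache("https://example.com/blog/post",
                        cache_everything=True))            # True
```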
The hard part in caching is picking a time to live (TTL), which determines when an entry is stale and should be discarded.
Images and other assets can be cached for longer periods, because even if you pick a TTL that is far too long, you can change the URL of the resource and trigger a fresh request-response cycle.
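One common way to change the URL is to append a fingerprint of the file's content, so the URL changes only when the file does. The sketch below assumes the caches in question treat the query string as part of the cache key; `versioned_url` and the `v` parameter are hypothetical names.

```python
import hashlib


def versioned_url(path: str, content: bytes) -> str:
    """Append a short content hash so changed files get a new URL."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    return f"{path}?v={digest}"


old = versioned_url("/logo.png", b"old logo bytes")
new = versioned_url("/logo.png", b"new logo bytes")

# Different content yields a different URL, so caches see a new resource.
print(old != new)  # True
```

Putting the hash into the file name itself (for example `logo.<hash>.png`) is an alternative that does not depend on how a cache handles query strings.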
However, if you've chosen to cache the HTML content of your pages as well, the cached HTML becomes a problem too, since changing the image URL means changing the HTML that references it. So you'd have to change the page URL as well.
This is where things get tricky, because even if your web framework can easily change all links to the updated page, every other site on the internet also has to do it. That's impossible, so you have to pick a window of time that you're comfortable with and in which users may see outdated content.
If you have the following setup:
- Browser cache TTL: 30 minutes
- Edge cache TTL: 1 minute
…there are several possible outcomes:
- If a change happened over 30 minutes ago, both caches would have expired and it's guaranteed that the user will see the updated version.
- If a change happened over 1 minute ago, only the edge cache is guaranteed to have expired. If a user has seen the page within the last 30 minutes and it is still cached by their browser, they can see the change by clearing their browser cache, opening an incognito window, or doing a hard reload.
- If a change happened less than 1 minute ago, both caches may still be considered fresh. In that case, users will not be able to see the update, even if they clear their browser cache, because the closest edge server would still return the old version. The only solution here is for the website owner to prematurely force the cache to become stale by purging it.
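The three cases above can be sketched as a small function. It models only the worst case for the example TTLs in this setup, and the name `who_sees_change` and its return strings are made up for illustration.

```python
BROWSER_TTL = 30 * 60  # 30 minutes, in seconds
EDGE_TTL = 60          # 1 minute, in seconds


def who_sees_change(seconds_since_change: int) -> str:
    """Worst-case visibility of a change for the example TTLs above."""
    if seconds_since_change > BROWSER_TTL:
        # Both caches have expired.
        return "everyone"
    if seconds_since_change > EDGE_TTL:
        # The edge is up to date, but a browser copy may still be fresh.
        return "users without a fresh browser copy"
    # The edge may still serve the old version to everybody.
    return "nobody, unless the edge cache is purged"


print(who_sees_change(45 * 60))  # everyone
print(who_sees_change(10 * 60))  # users without a fresh browser copy
print(who_sees_change(30))       # nobody, unless the edge cache is purged
```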
Browser cache helps with subsequent requests by the same user and makes your website feel snappy, because your assets don't have to travel over the internet.
Edge cache helps with requests by a geographically close group of users and can greatly reduce server load while improving response times in distant countries.
Pick a lower TTL to reduce the chance of outdated content, or a higher TTL for better cache effectiveness, but be ready to change URLs, open incognito windows, and purge cache.