
// Posted by Umur Inan

// Category Backend

// Posted on April 29, 2026

The Cache-Control Header You're Probably Ignoring

Most developers set max-age and call it done. The directives that matter for CDN behavior, revalidation, and stale content are all sitting there unused.


I deployed a change to our homepage copy. The deploy finished in two minutes. I sent the link to a colleague for a quick review before the announcement. She saw the old version. I told her to hard refresh. Still old. I checked the server - the new content was there. I checked the CDN - it was serving the old content, happily, with a remaining TTL of 81,000 seconds. About twenty-two and a half hours. I had set it that way myself, months ago, and completely forgotten.

That is the thing about HTTP caching. You set a header, it works, and then you forget it exists until it does something surprising. Most developers I know have exactly one Cache-Control strategy: set max-age to something and ship it. Which is fine, until you realize you have no idea what your CDN is doing, your users are seeing stale HTML, and your deploy pipeline is not the bottleneck - your cache is.

Here is what the header actually does, and the parts most people skip.

There Are Two Audiences

Cache-Control talks to two different caches at the same time: the browser cache and any intermediate caches between the server and the browser, which mostly means CDNs and reverse proxies. The directives you set apply to both, unless you say otherwise.

Why does this matter? Because the right policy for a browser cache and the right policy for a CDN are often different. Your users' browsers can cache their own session data just fine. Your CDN should not cache it at all - it would serve one user's session to another user, which is a serious problem.

public and private are the directives that draw this line.

Cache-Control: private, max-age=3600

Browsers can cache the response for an hour. Every CDN and proxy in between gets told not to cache it at all. Use this for authenticated responses, user-specific data, anything personalized.

Cache-Control: public, max-age=3600

This tells everyone - browser and CDN alike - to cache for an hour. Use this for content that is the same for every user: static assets, public API responses, marketing pages.

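Most developers skip public and private entirely and just set max-age. By the HTTP caching spec, a response with max-age and no private directive is already storable by shared caches; public mostly matters for responses to requests that carry an Authorization header. In practice, CDNs layer their own defaults on top of the spec, and those defaults differ from one CDN to the next. You are relying on a default you probably have not read.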
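A minimal sketch of making that choice explicit instead of leaving it to a default. The function name and its parameters are my own for illustration, not part of any framework:

```python
def cache_control_for(personalized: bool, max_age: int) -> str:
    """Build a Cache-Control value that always states its audience.

    Hypothetical helper: 'personalized' means the response differs per user
    (authenticated pages, session data), so shared caches must not store it.
    """
    scope = "private" if personalized else "public"
    return f"{scope}, max-age={max_age}"


if __name__ == "__main__":
    print(cache_control_for(personalized=True, max_age=3600))
    print(cache_control_for(personalized=False, max_age=3600))
```

Routing every response through one helper like this means nobody can ship a bare max-age by accident.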

max-age and s-maxage Are Not the Same

If you want different TTLs for the browser and the CDN, s-maxage is what you want. It sets the TTL for shared caches (CDNs, proxies) and overrides max-age for those caches.

Cache-Control: public, max-age=60, s-maxage=86400

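Browsers cache for one minute, so each user's browser goes back to the CDN at most a minute after its last fetch. The CDN caches for a day, so your origin server is not hammered by traffic. The catch is that the browser only ever gets what the CDN holds: a change reaches users within a minute only if you purge the CDN on deploy. Without a purge, the CDN can keep serving the old copy for up to a day.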

This combination is underused. A lot of teams either cache everything the same way or skip CDN caching entirely because they are worried about stale content. s-maxage gives you the traffic reduction of long CDN caching without forcing long TTLs on browsers.
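To make the override rule concrete, here is a small Python sketch of how a cache might pick its TTL from the header. It is a simplification under stated assumptions: the function names are mine, and real caches also consider Expires, Age, and heuristics that this ignores.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: value or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if value else None
    return directives


def freshness_lifetime(header: str, shared: bool) -> Optional[int]:
    """Return the TTL in seconds for a shared cache (CDN) or a private one (browser)."""
    directives = parse_cache_control(header)
    # s-maxage applies only to shared caches and wins over max-age there.
    if shared and directives.get("s-maxage") is not None:
        return int(directives["s-maxage"])
    if directives.get("max-age") is not None:
        return int(directives["max-age"])
    return None  # no explicit lifetime in the header


if __name__ == "__main__":
    header = "public, max-age=60, s-maxage=86400"
    print("browser:", freshness_lifetime(header, shared=False))
    print("cdn:", freshness_lifetime(header, shared=True))
```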

no-cache Does Not Mean Do Not Cache

This one trips people up constantly. no-cache does not mean the response is uncached. It means the cache must revalidate the response with the origin server before serving it. The response gets cached. It just doesn't get served without checking first.

Cache-Control: no-cache

The browser caches the response. On the next request, the browser sends a conditional request (If-None-Match or If-Modified-Since) asking "has this changed?" The server responds with either a 304 Not Modified (browser uses its cached copy, no body sent over the wire) or a fresh 200 with new content. You save bandwidth but not round trips.

If you actually want the response to never be stored anywhere, you want no-store.

Cache-Control: no-store

Nothing is cached. Every request goes to the origin. Use this for sensitive data - financial details, medical records, anything you cannot risk being stored on a user's disk or an intermediate proxy.

Most of the time when developers say they want no-cache, they actually want no-store. And most of the time when they want no-store, they could get away with a short TTL and save themselves some traffic.
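The difference is easier to see as two separate questions: may the cache store the response, and must it check with the origin before reusing it? A rough Python sketch, with a function name of my own choosing and only these two directives considered:

```python
from typing import Tuple


def storage_policy(header: str) -> Tuple[bool, bool]:
    """Return (may_store, must_revalidate_before_reuse) for a Cache-Control value.

    Simplified on purpose: only no-store and no-cache are examined.
    """
    names = {part.strip().partition("=")[0].strip().lower() for part in header.split(",")}
    may_store = "no-store" not in names
    # no-cache still allows storing; it only forbids reuse without revalidation.
    must_revalidate_first = may_store and "no-cache" in names
    return may_store, must_revalidate_first


if __name__ == "__main__":
    for value in ("no-cache", "no-store", "public, max-age=60"):
        print(value, "->", storage_policy(value))
```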

stale-while-revalidate: The One Worth Using

stale-while-revalidate is the directive I see used least often and find most useful in practice. It tells caches that once max-age expires, they can serve the stale response for a little longer while fetching a fresh one in the background.

Cache-Control: public, max-age=300, stale-while-revalidate=60

For five minutes, serve the cached response immediately. After five minutes, the cache is stale. For the next sixty seconds, keep serving the stale response but kick off a background refresh. The user gets a fast response. The cache gets updated. Nobody waits.

Without this, the request that arrives right after max-age expires waits for a full round trip to the origin. With it, that request still gets the cached copy (just slightly stale) while the update happens behind the scenes. For content that changes infrequently and where up to a minute of extra staleness is acceptable - blog posts, product pages, documentation - this is a solid default. Support varies between CDNs, so check that yours honors the directive before relying on it.
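The timeline above boils down to three windows. A small Python sketch of that decision, assuming the cache already knows the age of its stored copy in seconds (the function and the state names are mine):

```python
def cache_state(age: int, max_age: int, stale_while_revalidate: int) -> str:
    """Classify a cached response by its age in seconds.

    'fresh'                  -> serve from cache, no origin contact
    'stale-while-revalidate' -> serve from cache, refresh in the background
    'expired'                -> fetch from the origin before responding
    """
    if age < max_age:
        return "fresh"
    if age < max_age + stale_while_revalidate:
        return "stale-while-revalidate"
    return "expired"


if __name__ == "__main__":
    # Cache-Control: public, max-age=300, stale-while-revalidate=60
    for age in (120, 330, 400):
        print(age, cache_state(age, max_age=300, stale_while_revalidate=60))
```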

Versioned Assets Should Be Immutable

If you are hashing your asset filenames (and you should be), you have a different problem than staleness. The filename changes when the content changes. app.a3f2bc.js is never going to become stale - if the JavaScript changes, it gets a new hash, a new filename, a new URL. The old file is just gone.

For these, the right header is:

Cache-Control: public, max-age=31536000, immutable

One year. immutable tells the browser that this response will never change, so skip the revalidation requests on reload. Browsers that do not support immutable simply ignore it and fall back to max-age. The asset stays in cache until evicted. Every returning user gets it from their disk instantly, with zero network traffic.

The mistake is using long TTLs on files without content hashing. You set max-age=31536000 on app.js and deploy a fix. Users with cached copies get the old JavaScript for up to a year. Always hash the filename, then make it immutable.
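Most bundlers do the hashing for you. If you ever need to do it by hand, a sketch along these lines works. The digest algorithm, the 8-character length, and the function name are arbitrary choices of mine for illustration:

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, length: int = 8) -> str:
    """Insert a content hash before the extension: app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


if __name__ == "__main__":
    # Same bytes always give the same name; any change gives a new URL.
    print(hashed_name("static/app.js", b"console.log('v1');"))
    print(hashed_name("static/app.js", b"console.log('v2');"))
```

Because the name is derived from the bytes, a redeploy with no changes keeps the same URL and returning users keep their cached copy.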

ETags: Getting 304s Instead of 200s

ETags are the server's way of giving a response a fingerprint. When the browser has a cached response with an ETag, it sends that tag back on the next request in an If-None-Match header, in effect asking whether the content has changed. If it has not, the server responds with 304 Not Modified and an empty body, and the browser uses its cached copy.

// Server sends on first response:
ETag: "abc123"
Cache-Control: no-cache

// Browser revalidates with:
If-None-Match: "abc123"

// Server responds if unchanged:
HTTP/1.1 304 Not Modified

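This matters most for large responses where the round trip is cheap but the payload is not. A 200KB API response that has not changed costs only a few headers to send when you have ETags. Many web frameworks and servers can generate ETags automatically based on response content, so this is often free if you just let the framework do it. Keep in mind that a content-based ETag saves bandwidth, not server work: the server usually still builds the response in order to hash it.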
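If your framework does not handle this for you, the server-side logic is short. Here is a framework-free Python sketch, assuming a content-hash ETag; the function names and the (status, headers, body) return shape are my own, not any library's API:

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the response body (truncated SHA-256)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload), answering 304 when the client's tag matches."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match is not None:
        # The header may list several tags; a W/ prefix marks a weak validator.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        candidates = [tag[2:] if tag.startswith("W/") else tag for tag in candidates]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""
    return 200, headers, body


if __name__ == "__main__":
    status, headers, _ = respond(b"hello", None)
    print(status, headers["ETag"])
    print(respond(b"hello", headers["ETag"])[0])    # unchanged content
    print(respond(b"changed", headers["ETag"])[0])  # content changed
```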

A Practical Template

This is roughly how I think about Cache-Control by content type:

# HTML pages - always revalidate, use ETags for efficiency
Cache-Control: no-cache

# Public API responses
Cache-Control: public, max-age=60, stale-while-revalidate=30

# User-specific API responses
Cache-Control: private, max-age=300

# Versioned static assets (hashed filenames)
Cache-Control: public, max-age=31536000, immutable

# Sensitive data
Cache-Control: no-store

Serving HTML pages with no-cache means users always get fresh content on navigation, and ETags keep revalidation fast when nothing has changed. Marking versioned assets immutable means zero network traffic for returning users. Adding stale-while-revalidate to public API responses means consistent response times without hammering your origin.

The Cache-Control header is not complicated. It is a set of knobs that most people set once, forget about, and then wonder why their deployments take an hour to reach users. Spend ten minutes checking what your responses actually send in production. You will probably find something surprising.

Umur Inan

Principal Software Engineer

Backend engineer focused on JVM systems, distributed architecture, and the failure modes that only show up in production. I write about what I learn building and breaking things at scale.
