
// Posted by Umur Inan

// Category Backend

// Posted on April 28, 2026

Why Your Service Slows Down at 9am Every Day

Your service slows every morning for the same five reasons: JVM warmup, cold caches, pool growth, clustered crons, deployment timing. Here's how to fix each.

By Umur Inan · 5 min read

You deploy at 8:50am on a Tuesday. Metrics look normal. You close your laptop and go get coffee. At 9:04, Slack goes off. "The app is slow." You check CPU: fine. Memory: fine. Error rate: below threshold. But p99 latency is sitting at four times its usual value. By 9:15, it's back to normal. You shrug, file it as a morning load spike, and move on.

It happens again the following Monday. Same time. Same pattern. Same resolution: wait ten minutes, it fixes itself.

This is not a load spike. Load spikes are random. This is predictable, reproducible, and has specific causes. There are five things that cause it, and they almost always occur together.

The JVM Isn't Ready Yet

When your Java process starts, the JVM runs your code in the interpreter: it executes bytecode directly, without compiling it to native machine code. Interpreted code is slow, roughly 2–20× slower than compiled code depending on the code path.

The JIT compiler works in the background, using profiling data about which methods are called most often. Once a method crosses an invocation threshold, the C1 compiler produces lightly optimized native code. Later, the C2 compiler applies aggressive optimizations such as inlining, loop unrolling, and escape analysis. C2-compiled code is the fast code you see in benchmarks.

On a typical Spring Boot application, you're looking at two to five minutes before the most important code paths are fully compiled. During those minutes, every request is slower than steady state. If your service deploys at 8:50 and traffic picks up at 9:00, users are hitting the application at the worst possible moment.

The simplest fix is synthetic warmup traffic. Before the pod joins the load balancer, send it a sample of representative requests, just enough to trigger JIT compilation on your hot paths. In Spring Boot, an ApplicationReadyEvent listener can make internal calls to the most critical endpoints at startup. For the most latency-sensitive services, GraalVM native image compiles everything ahead of time and eliminates JIT warmup entirely, at the cost of a more complex build pipeline.
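A minimal sketch of the warmup loop in plain Java, with no Spring dependency. The class name, the stand-in hot paths, and the iteration count of 10,000 are all assumptions for illustration; the number of calls actually needed depends on the JVM version and flags, so treat it as a value to tune against your own latency charts.

```java
import java.util.List;
import java.util.concurrent.Callable;

/** Replays representative calls so hot paths are exercised before real traffic arrives. */
final class JitWarmup {

    /**
     * Invokes every hot path `iterations` times and returns the number of calls that succeeded.
     * Failures are skipped rather than rethrown, so one broken path does not block startup.
     */
    static int run(List<Callable<?>> hotPaths, int iterations) {
        int succeeded = 0;
        for (int i = 0; i < iterations; i++) {
            for (Callable<?> path : hotPaths) {
                try {
                    path.call();
                    succeeded++;
                } catch (Exception e) {
                    // Warmup is best effort: skip the failed call and keep going.
                }
            }
        }
        return succeeded;
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins for calls into critical endpoints or services.
        List<Callable<?>> hotPaths = List.of(
                () -> Integer.toString(42).hashCode(),
                () -> String.join(",", "a", "b", "c"));
        int calls = run(hotPaths, 10_000);
        System.out.println("warmup calls completed: " + calls);
    }
}
```

In a Spring Boot service, a call like `JitWarmup.run(...)` would sit inside the ApplicationReadyEvent listener, with each Callable wrapping a request to one of the critical endpoints.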

The Cache Is Empty

Over the course of yesterday, your application cached thousands of database records: user data, configuration, product listings, permission sets, all with TTLs of one to four hours. By 9am, some entries have expired. Or you deployed overnight, which wiped the in-memory cache entirely.

The first users send requests that miss the cache. Each miss goes to the database. The database handles one query fine. But when hundreds of users arrive in the first few minutes, hundreds of misses hit simultaneously. Query times go up. The application waits. Latency rises.

For an in-memory cache like Caffeine, warm it proactively at startup: an @EventListener(ApplicationReadyEvent.class) handler can pre-populate the hottest entries before the instance takes real traffic. For Redis, add TTL jitter, a random offset of a few minutes on each entry, so expirations spread over time instead of clustering at the same moment. The worst case is a fresh overnight deployment where the in-memory cache is empty and the Redis TTLs also expire around the same time. I've seen this take p99 from 200ms to 4 seconds for fifteen minutes straight.
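TTL jitter is small enough to sketch in a few lines. The helper below is an illustration, not part of any cache client: the class and method names are made up, and the one-hour base with a five-minute jitter window are example values.

```java
import java.time.Duration;
import java.util.concurrent.ThreadLocalRandom;

final class TtlJitter {

    /** Returns baseTtl plus a uniformly random offset between zero and maxJitter (inclusive). */
    static Duration withJitter(Duration baseTtl, Duration maxJitter) {
        long jitterMillis = ThreadLocalRandom.current().nextLong(maxJitter.toMillis() + 1);
        return baseTtl.plusMillis(jitterMillis);
    }

    public static void main(String[] args) {
        Duration ttl = withJitter(Duration.ofHours(1), Duration.ofMinutes(5));
        // Pass ttl to whatever cache client is in use when writing the entry.
        System.out.println("TTL for this entry: " + ttl.toSeconds() + "s");
    }
}
```

Because every entry gets its own offset, entries written in the same burst expire across a five-minute window instead of in the same second.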

The Connection Pool Isn't Full

HikariCP starts with a minimum number of idle connections. Some teams configure minimumIdle lower than maximumPoolSize to save database connections during quiet periods. The pool shrinks overnight and hasn't grown back when morning traffic arrives. The first hundred requests all need connections. Each new connection requires a TCP handshake, authentication, and setup. That overhead adds latency to the first requests that trigger pool growth.

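The fix: set minimumIdle equal to maximumPoolSize so HikariCP keeps a fixed-size pool and fills it in the background right after startup, not on demand. Keep initializationFailTimeout at a positive value (the default is 1 ms) so startup fails fast if the pool cannot obtain an initial connection. That setting only checks for one initial connection; it does not wait for the whole pool to fill. You pay the connection cost around startup instead of spreading it across the first wave of production requests.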
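As a sketch, those settings can be expressed as plain properties using HikariCP's property names. The JDBC URL, the pool size of 20, and the 5000 ms timeout are hypothetical values, and the helper class is illustrative only.

```java
import java.util.Properties;

final class PoolSettings {

    /** Builds HikariCP-style properties for a fixed-size pool of the given size. */
    static Properties fixedSizePool(String jdbcUrl, int poolSize) {
        Properties props = new Properties();
        props.setProperty("jdbcUrl", jdbcUrl);
        props.setProperty("maximumPoolSize", Integer.toString(poolSize));
        // Equal to maximumPoolSize: the pool is not allowed to shrink during quiet periods.
        props.setProperty("minimumIdle", Integer.toString(poolSize));
        // Positive value in milliseconds: fail startup if no initial connection can be obtained.
        props.setProperty("initializationFailTimeout", "5000");
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical URL and pool size. With HikariCP on the classpath, these
        // properties can be handed to new HikariConfig(props).
        Properties props = fixedSizePool("jdbc:postgresql://db.example.internal:5432/app", 20);
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```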

Also watch for the deployment overlap problem. When rolling out a new version, new instances start while old instances drain. For a few minutes you have twice the usual application count, each with its own pool. If that pushes you above the database's max_connections, new connections fail, right when morning traffic arrives.
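The overlap is easy to check with arithmetic before it bites. The numbers below are hypothetical: 10 instances, a pool of 20 each, and a database max_connections of 300.

```java
final class ConnectionBudget {

    /** Worst-case connections during a rollout: old and new instances each hold a full pool. */
    static int peakConnections(int steadyInstances, int surgeInstances, int maxPoolSize) {
        return (steadyInstances + surgeInstances) * maxPoolSize;
    }

    static boolean fits(int peakConnections, int dbMaxConnections) {
        return peakConnections <= dbMaxConnections;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: 10 instances, pools of 20, database max_connections of 300.
        int steady = peakConnections(10, 0, 20);   // 200
        int rollout = peakConnections(10, 10, 20); // 400
        System.out.println("steady state fits: " + fits(steady, 300));
        System.out.println("rollout fits: " + fits(rollout, 300));
    }
}
```

With these numbers the steady state fits and the rollout does not, so either the surge has to be capped or the pools have to be smaller.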

Everything Scheduled at 9am

Cron expressions that fire at the top of the hour are everywhere. Configuration cache refresh: 0 0 9 * * ?. Overnight order batch: 0 0 9 * * ?. Report generator: 0 0 9 * * MON. Three different jobs, three different teams, all scheduled at exactly 9:00am.

At 9:00:00, all three fire simultaneously. Configuration refresh does a bulk read of every entry. Batch job processes ten thousand order records. Reporting runs a slow analytical query across six months of data. From quiet to heavily loaded in an instant, and that load overlaps perfectly with the arrival of the first morning users.

The fix is staggering: spread jobs across a time window. Five jobs that logically need to run "in the morning" can run at 8:57, 9:02, 9:07, 9:12, and 9:17. No user will notice. The database will. For cron-triggered jobs, that means changing the expression itself. In Spring Boot, @Scheduled with initialDelay can also spread fixedRate and fixedDelay jobs across a startup window; initialDelay does not apply to cron expressions. For jobs that genuinely must fire at a specific time, pre-compute the expensive database work ahead of time and have the scheduled job only publish the pre-computed result.
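To make the staggering concrete, here is a small sketch that generates one daily cron expression per job in the six-field Spring style used above. The helper is hypothetical; in practice you would simply write the resulting expressions into each job's configuration.

```java
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.List;

final class CronStagger {

    /** Returns one daily cron expression per job, spaced stepMinutes apart starting at `first`. */
    static List<String> dailyCrons(LocalTime first, int stepMinutes, int jobCount) {
        List<String> crons = new ArrayList<>();
        for (int i = 0; i < jobCount; i++) {
            LocalTime t = first.plusMinutes((long) i * stepMinutes);
            // Fields: second minute hour day-of-month month day-of-week
            crons.add(String.format("0 %d %d * * ?", t.getMinute(), t.getHour()));
        }
        return crons;
    }

    public static void main(String[] args) {
        // Five jobs starting at 8:57, five minutes apart.
        dailyCrons(LocalTime.of(8, 57), 5, 5).forEach(System.out::println);
    }
}
```

The first two expressions come out as `0 57 8 * * ?` and `0 2 9 * * ?`, matching the 8:57 and 9:02 slots.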

The Deployment Timing Problem

Deploying during a low-traffic window means the new instances start cold and stay cold until traffic arrives. When that window is just before the workday, JIT compilation, cache population, and connection pool growth all happen right before the highest-traffic period of the day. An application that deployed at 3am has had six hours to warm up before the workday starts. One that deployed at 8:45am has fifteen minutes.

One approach is explicit warmup after deployment: a readiness probe that only passes after a warmup endpoint returns successfully. The warmup endpoint sends internal requests to critical paths, populates caches, and establishes connections. The pod doesn't join the load balancer until that work is done.
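The gating logic behind such a probe can be sketched without any framework. The class below is an assumption about how you might structure it, not a Spring or Kubernetes API: the warmup steps are placeholders, and the status codes are what a readiness endpoint would return to the orchestrator.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

final class ReadinessGate {

    private final AtomicBoolean warmedUp = new AtomicBoolean(false);

    /** Runs every warmup step in order; the gate opens only if all of them succeed. */
    boolean warmUp(List<Runnable> steps) {
        try {
            steps.forEach(Runnable::run);
            warmedUp.set(true);
        } catch (RuntimeException e) {
            // Stay not-ready so the orchestrator keeps the instance out of rotation.
            warmedUp.set(false);
        }
        return warmedUp.get();
    }

    /** HTTP status a readiness endpoint would return: 503 until warmup has completed. */
    int probeStatus() {
        return warmedUp.get() ? 200 : 503;
    }

    public static void main(String[] args) {
        ReadinessGate gate = new ReadinessGate();
        System.out.println("before warmup: " + gate.probeStatus());
        // Hypothetical steps: replay requests, load cache keys, touch the connection pool.
        gate.warmUp(List.of(() -> {}, () -> {}));
        System.out.println("after warmup: " + gate.probeStatus());
    }
}
```

Failing closed is the design choice here: if a warmup step throws, the instance keeps reporting 503 and never receives traffic in a half-warmed state.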

A better approach for high-stakes services is canary deployments. Route two to five percent of real traffic to the new instance while it warms up. The canary handles enough real requests to trigger JIT compilation and cache population, but it's a small enough slice that slow responses during warmup are barely visible in aggregate metrics.

What I Do Now

Stagger your cron jobs. Pick specific times instead of round hours. A one-line change to a cron expression takes fifteen seconds and permanently removes that source of contention.

Set minimumIdle equal to maximumPoolSize in HikariCP. Your database has to be sized for every instance's maximumPoolSize whether those connections are open or not. You might as well have them ready.

Warmup logic belongs in your ApplicationReadyEvent handler. Pre-populate the most expensive cache entries. Hit the most critical endpoints internally. Covering the top ten cache keys and the three most expensive database queries is usually enough to flatten the cold-start curve.

Readiness probes need to mean something. A pod is not ready when its HTTP port opens. It's ready when it's actually ready to serve traffic at normal performance. Those are different things.

Watch p99 latency by time of day, not just in aggregate. A dashboard that shows daily average latency will never surface the 9am problem. A chart that overlays multiple days makes it obvious. Once you can see it clearly, you can measure whether your fixes actually worked.
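If your metrics tool can't do this grouping directly, the calculation itself is simple. The sketch below buckets latency samples by hour of day across all days and computes a nearest-rank p99 per bucket; the Sample type and the sample values are made up for illustration.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class LatencyByHour {

    record Sample(LocalDateTime at, long latencyMillis) {}

    /** Nearest-rank percentile over a non-empty list of latencies. */
    static long percentile(List<Long> values, double pct) {
        List<Long> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(pct * sorted.size() / 100.0);
        return sorted.get(Math.max(rank, 1) - 1);
    }

    /** Groups samples by hour of day (0-23) across all days and returns the p99 per hour. */
    static Map<Integer, Long> p99ByHour(List<Sample> samples) {
        Map<Integer, List<Long>> byHour = new TreeMap<>();
        for (Sample s : samples) {
            byHour.computeIfAbsent(s.at().getHour(), h -> new ArrayList<>()).add(s.latencyMillis());
        }
        Map<Integer, Long> result = new TreeMap<>();
        byHour.forEach((hour, latencies) -> result.put(hour, percentile(latencies, 99)));
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical samples from two days: slow around 09:00, normal at 14:00.
        List<Sample> samples = List.of(
                new Sample(LocalDateTime.of(2026, 4, 27, 9, 4), 820),
                new Sample(LocalDateTime.of(2026, 4, 28, 9, 6), 790),
                new Sample(LocalDateTime.of(2026, 4, 27, 14, 0), 210),
                new Sample(LocalDateTime.of(2026, 4, 28, 14, 30), 190));
        p99ByHour(samples).forEach((hour, p99) ->
                System.out.printf("%02d:00  p99=%dms%n", hour, p99));
    }
}
```

Grouping by hour of day rather than by timestamp is what lets the 9am bucket stand out: each day's spike lands in the same row instead of being averaged into the rest of the day.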

The 9am slowdown is not mysterious. JIT warmup, cold caches, growing connection pools, clustered cron jobs, and badly timed deployments all land at once, because that's when traffic returns. Each cause is fixable in isolation. Fix all of them and mornings stop being eventful.

Umur Inan

Principal Software Engineer

Backend engineer focused on JVM systems, distributed architecture, and the failure modes that only show up in production. I write about what I learn building and breaking things at scale.
