Postgres 18 shipped asynchronous I/O, and the headline numbers are striking: sequential scans two to three times faster, vacuum finishing in a fraction of the time. The release notes are not exaggerating. The benchmarks are real. And for most of the slow queries you actually have, none of it will move the needle.
That gap is the whole point of this post. Teams are going to upgrade to 18 expecting their p99 to drop, because async I/O sounds like the kind of thing that makes everything faster. It is not. It makes one specific thing faster, and whether that thing is your bottleneck is a question you can answer before you upgrade.
What actually shipped
Before 18, a backend reading pages off disk worked one block at a time. Ask the kernel for a page, wait for it, get it, ask for the next. Synchronous. A single backend kept essentially one read outstanding at a time, leaning on OS readahead to paper over the latency.
Version 18 adds an asynchronous I/O subsystem for the read path. A new setting, io_method, controls it, with three modes:
SHOW io_method; -- 'worker' by default
-- io_method can only be set at server start, so a plain SET fails.
-- Change it in postgresql.conf or with ALTER SYSTEM, then restart.
ALTER SYSTEM SET io_method = 'io_uring'; -- Linux only, needs a recent kernel
SHOW effective_io_concurrency; -- 16 in PG18, was 1 before

sync is the old behavior. worker is the new default: dedicated I/O worker processes pull requests from a shared queue and do the reads while backends keep working. io_uring hands the requests straight to the Linux kernel interface with no worker hop. Alongside it, effective_io_concurrency jumped from a default of 1 to 16, which is the number of read requests the executor will keep in flight at once.
The subsystem covers sequential scans, bitmap heap scans, and vacuum. Write-path I/O is still synchronous in 18. This is a read-side feature.
What async I/O actually does
The job it does is latency hiding. A single read from network-attached cloud storage can take hundreds of microseconds, sometimes more than a millisecond. Issue those reads one at a time and a large scan spends almost all of its wall-clock time waiting on the storage instead of using it. Async I/O lets one backend keep many reads in flight, so the device is already working on requests 2 through 16 while request 1 comes back.
Notice what that requires to pay off: a lot of pages to read, and storage that gets faster when you ask it for many things at once. When both are true, throughput climbs and the scan finishes in a fraction of the time. When either is false, there is nothing for the feature to hide.
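The two conditions can be put into a few lines of arithmetic. What follows is a deliberately crude sketch, not a model of Postgres internals: the function name, the 500 microsecond latency, and the device parallelism figures are all assumptions chosen for illustration, and it ignores OS readahead, bandwidth limits, and CPU cost, so the ratios it prints are upper bounds rather than benchmark predictions.

```python
def scan_seconds(pages, latency_us, in_flight, device_parallelism):
    """Toy estimate of the time spent waiting on reads during a scan.

    pages              -- number of cold pages that must come from storage
    latency_us         -- assumed latency of one read, in microseconds
    in_flight          -- reads the backend keeps outstanding at once
    device_parallelism -- how many reads the device can usefully overlap
    """
    # Overlap is limited by the backend, by the device, and by how many
    # pages there are to read in the first place.
    overlap = max(1, min(in_flight, device_parallelism, pages))
    return pages * latency_us / overlap / 1_000_000


LATENCY_US = 500  # assumed figure for network-attached storage

# Many cold pages on a device that rewards concurrency: a large gap.
print(scan_seconds(1_000_000, LATENCY_US, 1, 32))
print(scan_seconds(1_000_000, LATENCY_US, 16, 32))

# Same scan on a device that cannot overlap requests: no gap at all.
print(scan_seconds(1_000_000, LATENCY_US, 1, 1))
print(scan_seconds(1_000_000, LATENCY_US, 16, 1))

# Five pages: whatever the ratio, the time saved is a couple of milliseconds.
print(scan_seconds(5, LATENCY_US, 1, 32) - scan_seconds(5, LATENCY_US, 16, 32))
```

The third case is the one to keep in mind: a query that reads a handful of pages can show an impressive ratio and still save an amount of time nobody will notice.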
Where it genuinely helps
The clear wins are large reads over data that does not fit in cache. A sequential scan over a table bigger than shared_buffers. A bitmap heap scan touching thousands of scattered pages. A vacuum walking pages it has not seen in a while.
The size of the win tracks your storage. On cloud network-attached disks, the EBS class of device, per-request latency is high and you usually have bandwidth you were never using, so overlapping reads buys you a lot. A sequential scan there can genuinely run two to three times faster. Run the same scan against a local NVMe drive, which already has low latency and deep hardware queues, and the improvement shrinks to something you might not notice. Same Postgres, same query, different ceiling, because the bottleneck the feature removes was only large on one of them.
The slow query you actually have
Async I/O accelerates one thing: pulling a lot of cold data pages off slow storage during a scan. Now count how many of your actual slow queries are that.
Missing index. The query is slow because it sequential-scans a million rows to return five. Async I/O makes the wrong plan run faster. The right fix is the index that turns it into a lookup of five pages, and reading five pages was never I/O-bound to begin with.
N+1. The ORM fires one query for the list and one more for every row, a thousand tiny round trips, each reading a handful of pages that are probably already cached. Async I/O does not merge separate queries into one. Every individual query was already fast. The cost lives in the thousand round trips, which is a query-count problem.
Cache hits. If the pages a query needs are already in shared_buffers, there is no disk read to make asynchronous. A working set that fits in RAM gets exactly nothing from this feature, no matter how many cores or how fast the disk.
Lock waits. A statement blocked on a row lock is not waiting on storage. It is waiting on another transaction to commit. Faster reads do not shorten that wait by a microsecond.
Stale statistics. The planner picked a nested loop where a hash join belonged, because the row estimates were a release old. Async I/O speeds up the reads inside a bad plan. It does not choose a better one. ANALYZE does.
Five common shapes of slow query, and the new subsystem addresses none of them. It addresses the sixth: a query that honestly has to drag many cold pages off a slow device.
The operational fine print
A few things worth knowing before you plan around it.
io_uring is Linux only, requires a Postgres build compiled with liburing support, and depends on a recent enough kernel; some environments disable it outright. Worker mode runs everywhere, which is why it is the default. If you are on a managed provider, check whether io_method is even exposed and which mode ships, because not every platform offers io_uring.
Worker mode adds I/O worker processes. They show up in pg_stat_activity:
SELECT backend_type, wait_event_type, wait_event
FROM pg_stat_activity
WHERE backend_type = 'io worker';

They are not client connections and do not count against max_connections, but they are real processes on the box, so account for them when you reason about memory and scheduling.
And effective_io_concurrency = 16 is a default, not a verdict. High-latency cloud storage may want it higher; a single spinning disk wants it much lower. The number that helps your workload is the one you measured, not the one you inherited.
How to know if it is your lever
Before you upgrade for this, look at where the time actually goes. Sample your wait events. If IO / DataFileRead is the dominant wait on the queries that hurt, you are reading cold pages off disk and async I/O can help you.
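One way to do the sampling is to poll pg_stat_activity on a timer and tally what the active backends were waiting on. The sketch below covers only the tallying half. The query string is what one might run once a second through whatever driver is already in use, and the function name and the "CPU (no wait)" label are assumptions made for this example.

```python
from collections import Counter

# Run this repeatedly (for example once a second) and collect the rows.
SAMPLE_SQL = """
SELECT wait_event_type, wait_event
FROM pg_stat_activity
WHERE state = 'active' AND backend_type = 'client backend'
"""


def dominant_wait(samples):
    """Rank wait events by their share of the collected samples.

    samples -- iterable of (wait_event_type, wait_event) rows; both values
               are None when the backend was running rather than waiting.
    Returns a list of (label, share) pairs, largest share first.
    """
    counts = Counter(
        "CPU (no wait)" if wait_type is None else f"{wait_type} / {event}"
        for wait_type, event in samples
    )
    total = sum(counts.values())
    return [(label, n / total) for label, n in counts.most_common()]
```

If the top entry is IO / DataFileRead by a wide margin, the read path is worth attention. If it is a Lock event or no wait at all, the time is going somewhere the new subsystem does not reach.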
The sharper test is one query at a time:
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;
-- 'Buffers: shared read=NNN' is pages fetched from outside shared_buffers (disk, or the OS page cache)
-- 'Buffers: shared hit=NNN' is already in cache

A plan with a large shared read count on a big scan is I/O-bound, and that is exactly what async I/O was built for. A plan that is mostly shared hit, or returns a handful of rows, has its bottleneck somewhere upstream of the disk, and no I/O method will touch it.
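To check more than a couple of queries, the two counters can be pulled out of the text output mechanically. This is a minimal sketch with hypothetical function names. It assumes the default text format of EXPLAIN and takes the first Buffers line, which belongs to the top plan node and already includes its children.

```python
import re

# Matches e.g. "Buffers: shared hit=96 read=8238"; either counter may be absent.
BUFFERS_RE = re.compile(r"Buffers: shared(?: hit=(\d+))?(?: read=(\d+))?")


def top_level_buffers(plan_text):
    """Return (hit, read) from the first Buffers line, or None if there is none."""
    for line in plan_text.splitlines():
        match = BUFFERS_RE.search(line)
        if match:
            return int(match.group(1) or 0), int(match.group(2) or 0)
    return None


def read_fraction(hit, read):
    """Share of page accesses that missed shared_buffers (0.0 if no accesses)."""
    total = hit + read
    return read / total if total else 0.0
```

A fraction near 1.0 on a plan that touches many thousands of pages is the profile the feature targets. A fraction near 0.0, or a high fraction over a few dozen pages, points elsewhere.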
Postgres 18's async I/O is some of the best engineering in the release, and on the workload it targets it delivers everything the benchmarks promise. The mistake is assuming your workload is that workload. Upgrade for it when your wait events say you are pulling cold pages off slow storage. For the missing index, the N+1, and the lock, the fix is the same as it was ten versions ago.