
Virtual Threads Scale. Structured Concurrency Keeps Them Correct.

Virtual threads made fan-out cheap but not correct. The failed subtask that never cancels its siblings, and how structured concurrency fixes the lifetimes.

The fan-out that scaled and lied

Our profile endpoint needed five things at once: the user record, their recent orders, loyalty points, active promotions, and a recommendation block. We had been fetching them one after another, and it was slow. When virtual threads landed we did the obvious thing and fanned the five calls out across an executor backed by virtual threads. Each request now ran its five downstream calls in parallel, latency dropped by more than half, and the change looked like a clean win.

It was a clean win until the recommendation service had a bad afternoon. One of the five calls started failing, and the endpoint kept returning profiles anyway, just without recommendations, which seemed fine. Then we noticed the box was running far more live threads than its in-flight requests could account for. The failing call was throwing, we were reading the results we cared about and moving on, and the other calls for that request were still running to completion with nobody waiting on them. We had built a fan-out that scaled beautifully and leaked quietly.

Virtual threads gave us scale, and only scale

It helps to be precise about what Loom actually solved. A virtual thread is cheap. You can have a million of them, and when one blocks on I/O the JVM unmounts it from its carrier so the carrier goes and runs another. That is genuinely transformative for the kind of work a backend does, which is mostly waiting on other services and databases. The bounded thread pool you used to guard like a scarce resource simply stops being scarce.
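To make the cost claim concrete, here is a minimal sketch, assuming Java 21 or later. The class and method names are mine, and Thread.sleep stands in for a blocking downstream call. It starts one virtual thread per task and waits for all of them to finish.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class VirtualThreadScaleDemo {

    /** Runs `tasks` blocking tasks, one virtual thread each, and returns how many completed. */
    static int runBlockingTasks(int tasks, Duration blockFor) {
        AtomicInteger completed = new AtomicInteger();
        // Each submit starts a new virtual thread; close() waits for all of them.
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    Thread.sleep(blockFor);   // hypothetical stand-in for a downstream call
                    completed.incrementAndGet();
                    return null;
                });
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = runBlockingTasks(10_000, Duration.ofMillis(100));
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " tasks finished in " + elapsedMs + " ms");
    }
}
```

Because every sleeping task gives up its carrier, the total wall time should land far closer to one sleep than to ten thousand of them, though the exact figure depends on the machine.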

That is the easy half, and it is the half that gets the headlines. Throughput went up, the pool tuning went away, and a blocking call stopped being a sin. None of it said anything about what happens when one of five parallel calls fails, or when the caller gives up, or when you need to cancel the whole group at once. Cheap threads made concurrency affordable. They did nothing to make it correct.

What a raw fan-out gets wrong

The trouble with submitting tasks to an executor is that the results have no relationship to each other or to you. Each one is an independent future floating in the pool, and the language has no idea they were meant to live and die together as a single unit of work.

var futures = calls.stream()
        .map(executor::submit)
        .toList();

var results = new ArrayList<Result>();
for (var f : futures) {
    results.add(f.get());   // the first failure throws ExecutionException here; nothing cancels the rest
}

Walk through what happens when the third call fails. The get() on that future throws, you fall out of the loop, and the first two results are discarded along with the two calls you had not waited on yet, which keep running. Nothing cancels them. If the caller upstream had already timed out and walked away, every one of these tasks is now orphaned work hammering a downstream service for a response nobody will ever read. There is no parent holding the group, so there is nothing to pull the plug.
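The leak is easy to reproduce in isolation. This is a sketch, not our production code: the five calls are hypothetical stand-ins built from Thread.sleep, and the class name is made up for the demo. The third call fails, the caller falls out of the loop, and the last two calls still run to completion.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class OrphanedFanOutDemo {

    /** Returns how many slow calls ran to completion even though the caller had failed. */
    static int countOrphans() {
        AtomicInteger ranToCompletion = new AtomicInteger();

        Callable<String> fast = () -> "ok";
        Callable<String> failing = () -> {
            Thread.sleep(50);
            throw new IllegalStateException("recommendations unavailable");
        };
        Callable<String> slow = () -> {
            Thread.sleep(300);                 // still working long after the failure
            ranToCompletion.incrementAndGet();
            return "late";
        };

        ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor();
        List<Future<String>> futures = new ArrayList<>();
        for (Callable<String> call : List.of(fast, fast, failing, slow, slow)) {
            futures.add(executor.submit(call));
        }

        try {
            for (Future<String> f : futures) {
                f.get();                       // the third future throws ExecutionException
            }
        } catch (ExecutionException | InterruptedException e) {
            // The caller gives up here. Nothing has told the two slow calls to stop.
        }

        executor.close();   // demo only: wait for the stragglers so we can count them
        return ranToCompletion.get();
    }

    public static void main(String[] args) {
        System.out.println("calls that outlived the failure: " + countOrphans());
    }
}
```

The count comes back as 2: both slow calls finish their work for a caller that stopped listening at the 50 ms mark. In a real service the executor is long-lived, so there is not even a close() to wait on them.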

Structured concurrency makes lifetimes a tree

Structured concurrency fixes this by giving the group a single owner and a single scope. You open a scope, fork the subtasks into it, and join. The scope owns every task you forked, and none of them outlive the block you opened them in.

try (var scope = StructuredTaskScope.open()) {
    Subtask<User> user = scope.fork(() -> fetchUser(id));
    Subtask<List<Order>> orders = scope.fork(() -> fetchOrders(id));

    scope.join();   // wait for both; if either fails, the other is cancelled and join() throws
    return new Profile(user.get(), orders.get());
}

Now the shape of the code matches the shape of the work. The two subtasks are children of the scope, the scope cannot be left until both have finished or been cancelled, and the try-with-resources block guarantees cleanup on every path out. A fan-out is no longer a scattering of independent futures. It is a subtree with a root that is responsible for it.

Error propagation you do not hand-wire

With the default scope, the first subtask to fail shuts the scope down, which cancels the siblings still running and raises the failure out of join(). The half-built profile from the war story is no longer even expressible: either every child succeeded and you hold a whole result, or one failed, the rest were cancelled, and you are handling an exception. No path is left where you quietly return partial data because you forgot to check one future.

This is the same instinct I have argued for elsewhere: make failures impossible to ignore. The unstructured version made the wrong thing easy: drop a result, miss an error, leak a task. The structured version makes the right thing the default and the wrong thing hard to even write.

Cancellation that reaches the leaves

Because the scope owns the tree, cancellation flows down it. When one subtask fails and the scope shuts down, the interrupt reaches every sibling, not just the one you happened to be reading. Put a deadline on the scope and the whole fan-out honors it together, instead of each call timing out on its own schedule while the rest run on. When the request finishes, by success or failure or timeout, nothing it started is still alive. That one guarantee, that no work outlives the request that spawned it, is most of what made our leak impossible to reproduce once we moved over.
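The mechanism underneath is the interrupt, and you can watch it work without any preview API. The sketch below is an approximation, not a structured scope: it assumes Java 21 or later and uses ExecutorService.invokeAll with a timeout, which cancels whatever has not finished when the shared deadline passes. The tasks and names are hypothetical.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class GroupDeadlineDemo {

    record Outcome(long cancelled, int interrupted) {}

    static Outcome run() throws InterruptedException {
        AtomicInteger interrupted = new AtomicInteger();

        Callable<String> fast = () -> "user";
        Callable<String> slow = () -> {
            try {
                Thread.sleep(10_000);            // far longer than the deadline
                return "recommendations";
            } catch (InterruptedException e) {
                interrupted.incrementAndGet();   // the deadline reached this leaf
                throw e;
            }
        };

        List<Future<String>> futures;
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            // One deadline for the whole group; unfinished tasks are cancelled on return.
            futures = executor.invokeAll(List.of(fast, slow, slow), 500, TimeUnit.MILLISECONDS);
        }   // close() waits for the interrupted tasks to unwind

        long cancelled = futures.stream().filter(Future::isCancelled).count();
        return new Outcome(cancelled, interrupted.get());
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());
    }
}
```

Cancellation is cooperative in either model. A task that catches InterruptedException and carries on regardless will outlive its deadline, so leaf code has to let the interrupt propagate for the guarantee to hold.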

It is still preview, and that is fine

One honest caveat belongs here. Virtual threads are final and have been since Java 21, so you can lean on them in production today. Structured concurrency is still a preview feature, which means it wants --enable-preview and its API has moved between releases. Java 21 gave you a ShutdownOnFailure scope you constructed directly; Java 25 redesigned it around the open() factory and a Joiner that decides how results combine. The concept held stable across all of it. The exact method names did not, so wrap your fan-outs behind a thin helper and check the API for the JDK you actually run.
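One way to build that thin helper, offered as a sketch rather than a recommendation. FanOut and allOrCancel are hypothetical names. The body here uses only final Java 21 API, a virtual-thread executor plus hand-wired cancellation, so it runs without --enable-preview. On a JDK where the structured API suits you, the body could be swapped for a scope without touching any caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper: the one place that knows which concurrency API is in use. */
final class FanOut {
    private FanOut() {}

    /**
     * Runs every call on its own virtual thread. Returns results in input order,
     * or cancels the remaining calls and rethrows as soon as one call fails.
     */
    static <T> List<T> allOrCancel(List<Callable<T>> calls)
            throws InterruptedException, ExecutionException {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            var completion = new ExecutorCompletionService<T>(executor);
            List<Future<T>> futures = new ArrayList<>();
            for (Callable<T> call : calls) {
                futures.add(completion.submit(call));
            }
            try {
                // Consume in completion order so the first failure is seen right away,
                // not only when we reach it in submission order.
                for (int i = 0; i < futures.size(); i++) {
                    completion.take().get();   // throws ExecutionException on failure
                }
            } catch (InterruptedException | ExecutionException e) {
                futures.forEach(f -> f.cancel(true));   // interrupt every sibling
                throw e;
            }
            List<T> results = new ArrayList<>();
            for (Future<T> f : futures) {
                results.add(f.get());   // all done by now, so this does not block
            }
            return results;
        }   // close() waits for cancelled tasks to finish unwinding
    }
}
```

A call site passes a list of Callable and gets back either every result in input order or a single exception, with no futures in sight. That narrow surface is what makes the later swap cheap.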

What I actually do

I reach for virtual threads without a second thought now, anywhere a request fans out to blocking I/O, because cheap threads are an unambiguous improvement. The moment a piece of work has more than one subtask, I put it inside a structured scope instead of scattering futures, so a failure cancels the siblings and the whole group lives and dies inside one block.

The framing that stuck with me is that virtual threads answered a question about cost, and structured concurrency answers a different question about correctness. Loom made it cheap to start a thousand threads. It is structure, not the threads, that keeps those thousand threads from becoming a thousand ways to leak work, swallow an error, and return a result that is quietly half wrong.


Umur Inan

Principal Software Engineer

Backend engineer focused on JVM systems, distributed architecture, and the failure modes that only show up in production. I write about what I learn building and breaking things at scale.
