
// Posted by Umur Inan

// Category Backend

// Posted on April 7, 2026

Event Sourcing Sounds Better Than It Is

Event sourcing promises auditability, time travel, and decoupled systems. The operational complexity arrives later, and most teams are not ready for it.


I remember the first time event sourcing clicked for me. I was watching a conference talk, and the speaker was walking through the concept with a banking example. Every deposit, every withdrawal, every transfer, stored as an immutable event. The current balance is just a projection. You can replay the entire history. You can reconstruct the state of any account at any point in time. You can add new projections without touching the original data. It was elegant. It was powerful. It felt like the right way to build software.

Then I actually built a system with it.

That was a few years ago. I've since worked on two event-sourced systems in production, one I helped design and one I inherited, and I've watched several more from a close enough distance to understand what happens after the conference talk ends and the real users show up. The honest summary: event sourcing is genuinely useful for a narrow set of problems, and genuinely painful for everything else.

What Event Sourcing Actually Is

The idea is simple. Instead of storing the current state of your data, you store a sequence of events that produced that state. An order isn't a row in the orders table with a status column. It's a stream of events: OrderPlaced, PaymentConfirmed, ItemShipped, OrderDelivered. Current state is derived by replaying those events in order.
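
As a minimal sketch of that replay, assuming a hypothetical `Event` record and made-up status names (neither comes from a real library), deriving the current state is a fold over the stream:

```python
from dataclasses import dataclass, field


# Hypothetical event record; the field names are assumptions for illustration.
@dataclass(frozen=True)
class Event:
    type: str
    data: dict = field(default_factory=dict)


def apply(state: dict, event: Event) -> dict:
    """Return the next order state after applying a single event."""
    if event.type == "OrderPlaced":
        return {"order_id": event.data["order_id"], "status": "placed"}
    if event.type == "PaymentConfirmed":
        return {**state, "status": "paid"}
    if event.type == "ItemShipped":
        return {**state, "status": "shipped"}
    if event.type == "OrderDelivered":
        return {**state, "status": "delivered"}
    return state  # unknown event types leave the state untouched


def replay(events) -> dict:
    """Derive the current state by applying events in order."""
    state: dict = {}
    for event in events:
        state = apply(state, event)
    return state


stream = [
    Event("OrderPlaced", {"order_id": "o-1"}),
    Event("PaymentConfirmed"),
    Event("ItemShipped"),
]
current = replay(stream)
```

Nothing here stores a status anywhere. Replaying a prefix of the stream yields the state the order had at that point.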

Usually this is paired with CQRS (Command Query Responsibility Segregation), where the write side processes commands and appends events, and the read side maintains separate, denormalized projections optimized for querying. Communication between the two sides happens through those events.

On paper, this gives you several things: a complete audit log by default, the ability to rebuild state from scratch, temporal queries (what did this look like last Tuesday?), and loose coupling between different parts of your system. Those are real benefits. The problem is the cost of getting them.

The Complexity You Don't See in the Blog Posts

Most writing about event sourcing focuses on the happy path. Here's what the blog posts often leave out.

Schema Evolution Is a Nightmare

In a traditional system, if you need to add a field to an order, you add a column, write a migration, and you're done. All existing rows get the new column. Your code works against one consistent schema.

In an event-sourced system, you can't change past events. They're immutable. So when your event schema needs to change, and it will, you have a few options, none of them pleasant. You can version your events and handle multiple versions in your projection logic. You can write a migration that replays history and transforms old events into new formats. You can use upcasting, where old events are transformed on read. Each approach has tradeoffs. All of them add code and cognitive load. And all of them mean that your projection logic has to handle events from multiple eras of your application's history indefinitely.
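
To make the upcasting option concrete, here is a rough sketch. The event shape, the version numbers, and the v1-to-v2 change (a float `total` becoming `total_cents` plus a `currency`) are all invented for illustration and not taken from any real system:

```python
# Hypothetical schema change: OrderPlaced v1 stored {"total": 12.5},
# v2 stores {"total_cents": 1250, "currency": "USD"}.
def order_placed_v1_to_v2(data: dict) -> dict:
    # Assumption: every v1 order was in USD.
    return {"total_cents": round(data["total"] * 100), "currency": "USD"}


# One upcaster per (event type, from-version) step.
UPCASTERS = {("OrderPlaced", 1): order_placed_v1_to_v2}
LATEST_VERSION = {"OrderPlaced": 2}


def upcast(event: dict) -> dict:
    """Bring a stored event up to the latest version on read.

    The stored event is never modified; a new dict is returned.
    """
    etype, version, data = event["type"], event["version"], event["data"]
    while version < LATEST_VERSION.get(etype, version):
        data = UPCASTERS[(etype, version)](data)
        version += 1
    return {"type": etype, "version": version, "data": data}


stored = {"type": "OrderPlaced", "version": 1, "data": {"total": 12.5}}
upcasted = upcast(stored)
```

Projection code then only has to understand the latest version, but the chain of upcasters grows with every schema change and has to be kept for as long as the old events exist.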

I've seen event stores with events going back four years, with five different versions of the same event type, and projection code full of version checks and format conversions. It's not a disaster, but it's not pretty either. Every new developer who joins has to understand not just how events work today, but how they worked in 2021, 2022, and 2023 too.

Debugging Is Harder, Not Easier

One of the selling points of event sourcing is auditability. And it's true that having a complete history of events is useful. But when something goes wrong, working backwards from a broken projection through a stream of events is not obviously easier than looking at a row in a database and a stack trace.

The question "why does this order show the wrong status?" in a traditional system is answered by looking at the orders table and the relevant logs. In an event-sourced system, it's answered by loading the event stream, replaying it, and figuring out which event or projection handler produced the wrong result. Problem in the projection logic? You need to understand how the projection was built. Missing event? You need to figure out why it wasn't appended. Event ordering? You need to understand your consistency guarantees.

The audit trail is useful when you need to understand what happened over time. It's less useful when you're trying to figure out why a specific piece of state is wrong right now.

Eventual Consistency Is Baked In

Because projections are built asynchronously from events, there's always a lag between when a command is processed and when the read model reflects the result. For most read operations, this is fine. For some, it isn't.

"Did my order go through?" is the kind of question users ask immediately after placing an order. If the answer depends on a projection that's a few hundred milliseconds behind the write side, you either accept that the read might be stale, add special-case synchronous handling for those queries, or engineer around it with tricks like read-your-own-writes. None of these are hard problems, but all of them require you to be conscious of the consistency model at every point in your application. That's cognitive overhead that doesn't go away.

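One way to approximate read-your-own-writes is sketched below. It assumes the write side hands back the position of the event it appended, and that the projection tracks the last position it has applied. `Projection` and `read_order` are hypothetical names, and everything is in memory:

```python
import time


class Projection:
    """Toy read model that records how far it has processed the event log."""

    def __init__(self):
        self.position = 0  # position of the last event applied
        self.orders = {}

    def apply(self, position: int, event: dict) -> None:
        self.orders[event["order_id"]] = event["status"]
        self.position = position


def read_order(projection, order_id, min_position, timeout=0.5, poll=0.01):
    """Return the order status once the projection has caught up.

    Returns None if the projection has not reached min_position before the
    timeout; the caller then decides what to do (show "processing", retry,
    or fall back to the write side).
    """
    deadline = time.monotonic() + timeout
    while projection.position < min_position:
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)
    return projection.orders.get(order_id)


projection = Projection()
stale = read_order(projection, "o-1", min_position=1, timeout=0.05)
projection.apply(1, {"order_id": "o-1", "status": "placed"})
fresh = read_order(projection, "o-1", min_position=1)
```

Every caller that needs a fresh read has to carry that position around, which is the kind of consistency awareness described above.
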
Event Stores Need Care

An event store is not a regular database. It's append-only. It needs to support efficient streaming by aggregate ID. It needs to handle high write volumes without becoming a bottleneck. Whether you're using EventStoreDB, Kafka, or a custom solution on top of PostgreSQL, each option has different operational characteristics, different failure modes, and different backup/restore procedures.

The PostgreSQL-based event store I inherited used a single events table with an aggregate_id column and a sequence number. It worked fine until the event table hit 50 million rows, and then query performance started degrading in unexpected ways because the projection rebuilds were doing full table scans. Fixing it required downtime. None of this would have been a problem with a boring relational schema and a few indexes.
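
For illustration only, a single-table event store of that general shape might look like the sketch below. It uses SQLite from Python's standard library as a stand-in for PostgreSQL. The global `position` column and the batched reader are assumptions about one common way to avoid full scans during rebuilds, not a description of the system above or of how it was fixed:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (
    position     INTEGER PRIMARY KEY AUTOINCREMENT, -- global order (assumed column)
    aggregate_id TEXT    NOT NULL,
    sequence     INTEGER NOT NULL,                  -- order within one aggregate
    type         TEXT    NOT NULL,
    payload      TEXT    NOT NULL,
    UNIQUE (aggregate_id, sequence)                 -- also backs per-aggregate reads
);
""")


def append(conn, aggregate_id, sequence, etype, payload):
    """Append one event; a duplicate sequence number raises IntegrityError."""
    with conn:
        conn.execute(
            "INSERT INTO events (aggregate_id, sequence, type, payload) "
            "VALUES (?, ?, ?, ?)",
            (aggregate_id, sequence, etype, payload),
        )


def load_stream(conn, aggregate_id):
    """Read one aggregate's events in order, using the unique index."""
    return conn.execute(
        "SELECT sequence, type, payload FROM events "
        "WHERE aggregate_id = ? ORDER BY sequence",
        (aggregate_id,),
    ).fetchall()


def read_batch(conn, after_position, limit=1000):
    """Read the next batch in global order, for incremental projection rebuilds."""
    return conn.execute(
        "SELECT position, aggregate_id, type, payload FROM events "
        "WHERE position > ? ORDER BY position LIMIT ?",
        (after_position, limit),
    ).fetchall()


append(conn, "order-1", 1, "OrderPlaced", "{}")
append(conn, "order-2", 1, "OrderPlaced", "{}")
append(conn, "order-1", 2, "PaymentConfirmed", "{}")
```

The unique constraint doubles as a simple optimistic concurrency check: two writers appending the same sequence number to one aggregate cannot both succeed.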

Snapshots Are an Operational Tax

When an aggregate accumulates thousands of events, replaying all of them on every load becomes slow. The standard solution is snapshotting: periodically save the current state alongside the event stream so you can load the snapshot and only replay events since then.

Snapshots work. But they add another schema to maintain, another migration concern, and another piece of logic to keep consistent with your projection code. When a snapshot is stale or corrupt, debugging it means understanding both the snapshot format and the events that were applied after it. It's not a hard problem, but it's another layer of complexity that you take on because of event sourcing, not because of your domain.
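
A minimal sketch of loading from a snapshot, using the bank-account example from the top of the post. The snapshot shape (`sequence` plus `state`) and the event names are assumptions for illustration:

```python
def apply(balance: int, event: dict) -> int:
    """Apply one account event to the running balance."""
    if event["type"] == "Deposited":
        return balance + event["amount"]
    if event["type"] == "Withdrawn":
        return balance - event["amount"]
    return balance


def load(events, snapshot=None):
    """Rebuild the balance, starting from a snapshot when one is available.

    events:   list of (sequence, event) pairs in ascending sequence order
    snapshot: {"sequence": n, "state": balance} or None
    Returns (balance, last_sequence, number_of_events_replayed).
    """
    if snapshot is not None:
        balance, last_seq = snapshot["state"], snapshot["sequence"]
    else:
        balance, last_seq = 0, 0
    replayed = 0
    for seq, event in events:
        if seq <= last_seq:
            continue  # already folded into the snapshot
        balance = apply(balance, event)
        last_seq = seq
        replayed += 1
    return balance, last_seq, replayed


events = [
    (1, {"type": "Deposited", "amount": 100}),
    (2, {"type": "Withdrawn", "amount": 30}),
    (3, {"type": "Deposited", "amount": 5}),
]
snapshot = {"sequence": 2, "state": 70}

full = load(events)                   # replays all three events
from_snapshot = load(events, snapshot)  # replays only event 3
```

Both paths have to agree on the result. If `apply` changes after a snapshot was written, the snapshot may encode state the current code would no longer produce, which is where the maintenance cost shows up.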

The Problems It Solves Well

I want to be fair here. Event sourcing is genuinely good at some things.

If your domain naturally produces a stream of events, such as financial transactions, audit trails for compliance, workflow state machines, or collaborative editing, then event sourcing maps well onto the problem. You're not forcing an event model onto data that doesn't naturally have one. The events are the thing. The projections are secondary.

Strict regulatory requirements for auditability that you'd otherwise implement as a bolted-on audit log? Event sourcing gives you that for free. Your event stream is the audit trail. You don't need to maintain a separate system for it.

For systems where multiple teams need to build different views of the same data, and those views change independently, event sourcing lets each team build and evolve their projections without touching the write side. That's genuinely useful when it applies.

And if you're building a system where temporal queries are a core feature, not a nice-to-have but something users actually need regularly, event sourcing makes those queries natural instead of requiring you to engineer time-travel into a system that wasn't designed for it.
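
In an event-sourced model, a temporal query is the usual replay with a cutoff. The sketch below assumes each event carries a timestamp; the field names, timestamps, and status mapping are made up for illustration:

```python
from datetime import datetime

# Hypothetical mapping from event type to the status it produces.
STATUS_AFTER = {
    "OrderPlaced": "placed",
    "PaymentConfirmed": "paid",
    "ItemShipped": "shipped",
    "OrderDelivered": "delivered",
}


def status_as_of(events, as_of):
    """Return the order status as it was at `as_of`, or None if no event yet.

    Assumes `events` is sorted by timestamp in ascending order.
    """
    status = None
    for event in events:
        if event["at"] > as_of:
            break  # everything from here on happened later
        status = STATUS_AFTER.get(event["type"], status)
    return status


events = [
    {"type": "OrderPlaced", "at": datetime(2026, 3, 30, 9, 0)},
    {"type": "PaymentConfirmed", "at": datetime(2026, 3, 30, 9, 5)},
    {"type": "ItemShipped", "at": datetime(2026, 4, 1, 14, 0)},
    {"type": "OrderDelivered", "at": datetime(2026, 4, 3, 11, 30)},
]

earlier = status_as_of(events, datetime(2026, 3, 31))
latest = status_as_of(events, datetime(2026, 4, 5))
```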

The pattern that emerges: event sourcing is a good fit when the events are the primary artifact and the state is derived from them. It's a bad fit when the state is the primary artifact and you're adding events as an audit mechanism.

What Most Teams Actually Need

Most applications don't need full event sourcing. They need two things that event sourcing bundles together and sells as a package.

The first is an audit log. If you need to know who changed what and when, add an audit log. It's a separate table with a foreign key to the affected entity, a timestamp, a user ID, and a JSON column with the before and after state. It takes a few hours to build, it's easy to query, and it doesn't affect your application's primary data model. You can add it to an existing application without rewriting anything.
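
A rough sketch of such an audit table, using SQLite from Python's standard library as a stand-in for whatever relational database you already run. The table, column, and function names are illustrative assumptions:

```python
import json
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE orders (
    id     INTEGER PRIMARY KEY,
    status TEXT NOT NULL
);
CREATE TABLE order_audit (
    id         INTEGER PRIMARY KEY,
    order_id   INTEGER NOT NULL REFERENCES orders(id),
    changed_at TEXT    NOT NULL,
    user_id    TEXT    NOT NULL,
    change     TEXT    NOT NULL  -- JSON: {"before": ..., "after": ...}
);
""")


def set_status(conn, order_id, new_status, user_id):
    """Update an order and record who changed what, in a single transaction."""
    with conn:  # commits on success, rolls back on any exception
        row = conn.execute(
            "SELECT status FROM orders WHERE id = ?", (order_id,)
        ).fetchone()
        if row is None:
            raise ValueError(f"unknown order {order_id}")
        change = {"before": {"status": row[0]}, "after": {"status": new_status}}
        conn.execute(
            "UPDATE orders SET status = ? WHERE id = ?", (new_status, order_id)
        )
        conn.execute(
            "INSERT INTO order_audit (order_id, changed_at, user_id, change) "
            "VALUES (?, ?, ?, ?)",
            (order_id, datetime.now(timezone.utc).isoformat(), user_id,
             json.dumps(change)),
        )


with conn:
    conn.execute("INSERT INTO orders (id, status) VALUES (1, 'placed')")
set_status(conn, 1, "shipped", "alice")
```

The orders table stays the source of truth. The audit rows are a side record you can query with plain SQL.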

The second is domain events for integration. If you want different parts of your system to react to things that happen, publish domain events. When an order is placed, a payment confirmed, or a user signed up, other parts of the system can subscribe and act. Use Kafka, a transactional outbox pattern, or even polling. You don't need to make your events the source of truth for state. You can store state normally and publish events as a side effect of state changes. This is simpler to operate, simpler to debug, and simpler to evolve.
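
The transactional outbox variant can be sketched as follows, again with SQLite as a stand-in database. The `publish` callback stands in for a real broker client, and the table and function names are assumptions for illustration:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    id     TEXT PRIMARY KEY,
    status TEXT NOT NULL
);
CREATE TABLE outbox (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    type      TEXT    NOT NULL,
    payload   TEXT    NOT NULL,
    published INTEGER NOT NULL DEFAULT 0
);
""")


def place_order(conn, order_id):
    """Store the state change and its domain event in one transaction."""
    with conn:  # both rows commit together, or neither does
        conn.execute(
            "INSERT INTO orders (id, status) VALUES (?, 'placed')", (order_id,)
        )
        conn.execute(
            "INSERT INTO outbox (type, payload) VALUES (?, ?)",
            ("OrderPlaced", json.dumps({"order_id": order_id})),
        )


def relay_once(conn, publish):
    """Publish pending outbox rows in order; returns how many were sent.

    If publish raises, the row stays unpublished and is retried on the next
    run, so subscribers should expect at-least-once delivery.
    """
    rows = conn.execute(
        "SELECT id, type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, etype, payload in rows:
        publish(etype, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


sent = []
place_order(conn, "o-1")
first_run = relay_once(conn, lambda etype, data: sent.append((etype, data)))
second_run = relay_once(conn, lambda etype, data: sent.append((etype, data)))
```

State still lives in the orders table. The outbox only guarantees that an event is recorded whenever the state change commits.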

The combination of a normal relational data model, a simple audit table, and domain events for integration will handle the requirements that most teams cite when they're considering event sourcing. And it will do it with far fewer moving parts to operate and debug.

The Decision Nobody Reverses

The most important thing to understand about event sourcing is that adopting it is largely irreversible. Your event store becomes the canonical source of truth for your data. The projections are derived from it. If you later decide event sourcing was the wrong choice, migrating back to a traditional data model means replaying your entire event history into a relational schema and then switching the write side over. This is possible but painful.

Compare this to most other architectural decisions. Choosing one framework over another, one database over another, even one service topology over another: these can usually be changed incrementally. Event sourcing, because it redefines where truth lives, is much harder to undo.

That's not a reason to never use it. It's a reason to be sure before you commit. And being sure means asking whether you genuinely have the problems event sourcing solves. Not whether event sourcing sounds like a good idea, not whether it worked at the company whose case study you read, but whether the specific problems it addresses are actually the problems you have.

The Talk Was Good Though

Here's the part I want to be honest about: event sourcing is intellectually compelling. The model is coherent. Its tradeoffs are defensible. And for the domains it fits, like financial systems, workflow engines, and collaborative tools, it really does produce cleaner designs than fighting against a mutable state model.

But most teams who adopt it aren't doing it because their domain demands it. They're doing it because the idea is appealing, because someone influential on the team learned about it and got excited, or because it felt like the sophisticated choice. That's how you end up with an e-commerce platform where the shopping cart is an event stream, and the team spends two days debugging why the cart projection disagrees with the inventory projection, and the answer is a race condition in the event consumer that was never obvious from the design.

Architectural decisions made because the idea is exciting often cost more than architectural decisions made because the problem requires it. Event sourcing is not exempt from this. The conference talk was good. The production system is where you find out if you actually needed it.

Tags: Event Sourcing, Architecture, Distributed Systems

Umur Inan

Principal Software Engineer

Backend engineer focused on JVM systems, distributed architecture, and the failure modes that only show up in production. I write about what I learn building and breaking things at scale.

