Microservices with Spring Boot 4

Umur Inan

Preface

I didn’t set out to write a book about microservices. I set out to understand why our system kept breaking in ways that were impossible to debug.

We had services. They talked to each other. They shared databases. They failed together. We called it a microservices architecture, but what we’d actually built was a distributed monolith: all the operational complexity of distributed systems, none of the independence.

The problems weren’t bugs. The code worked. The problems were structural: a deployment that required coordinating three teams because one shared library connected everything, a database that became a bottleneck because four services were writing to the same tables under load, a cascade failure that started with a slow search query and took the entire recommendation engine with it.

The answers were out there — bounded contexts, saga pattern, outbox pattern, database per service. But the books and blog posts either stopped at theory or jumped to “here’s a Kubernetes YAML.” The messy middle — how to actually implement these patterns in Spring Boot, what the trade-offs are, when to skip a pattern entirely — was harder to find.

This book is that messy middle.

What This Book Is

This is a hands-on guide to building production-grade microservices with Spring Boot 4.0.5. Every chapter covers one problem: how services find each other, how they handle distributed data, how they fail safely, how you deploy them without downtime. Every solution comes with real Spring Boot code, real trade-offs, and the failure modes you’ll hit in production.

The goal is not to convince you that microservices are always the right answer. They aren’t. The first chapter makes that argument honestly. The rest of the book assumes you’ve decided to build them anyway, and helps you do it well.

Who This Book Is For

This book is for engineers who have already built things with Spring Boot and want to go deeper into distributed systems. You know how to wire up a REST API, configure Spring Security, and write JPA repositories. You’ve probably heard of Kafka, maybe used it, but you want to understand the patterns that make it work at scale.

You don’t need to have built microservices before. You do need to be comfortable with Java and Spring Boot fundamentals. If you’re not, the Spring Boot 4 book in this series covers that ground.

How to Use This Book

Chapters 1 through 9 build on each other — read them in order. They cover the decision to decompose, how to find service boundaries, and the infrastructure every service needs: discovery, gateway, configuration, resilience, and observability.

Chapters 10 through 14 cover distributed data: the patterns for managing state when it lives in multiple databases across multiple services. Read them in order if data consistency is new to you.

Chapters 15 through 20 cover security, testing, and deployment. These can be read independently once you’ve finished the data chapters.

Chapters 21 through 34 are advanced patterns: migration from a monolith, multi-tenancy, reactive systems, chaos engineering. Read whichever chapters match your immediate problems.

A Note on Spring Boot 4.0.5

All examples use Spring Boot 4.0.5 with Java 21. Java 21 ships virtual threads (Project Loom) as a standard feature, and Spring Boot can run request handling on them when spring.threads.virtual.enabled is set to true, which changes how you think about concurrency in services. We cover that explicitly. The Spring Cloud dependencies used in the examples are all compatible with Spring Boot 4.

The Code

Every chapter has a companion project in the code/ directory. The code/final/ directory contains the complete CinéTrack microservices system: API Gateway, User Service, Catalog Service, Watchlist Service, Review Service, Recommendation Service, Notification Service, and Search Service. You can run the whole system with Docker Compose.

git clone https://github.com/umur/microservices-example
cd microservices-example/final
docker-compose up -d

Go to http://localhost:8080 — you’ll see the API Gateway routing requests to the services behind it. Search for a film, add it to your watchlist, post a review. Eight services, one system.

Acknowledgments

The bugs are mine. The good ideas are everyone else’s, often without attribution because I forgot where I first heard them. If you recognize one of yours, please write to me; I would like to credit you in the next edition.

1 The Microservices Decision

1.1 Overview

“A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable.” (Leslie Lamport, 1987)

A team I know spent three months building microservices for a social media startup. Eight services. Kubernetes. Service mesh. The works. They launched with forty users. Six months later, half the engineers had left, the remaining two spent most of their time debugging cross-service failures, and the founder was seriously considering rewriting everything as a monolith.

They hadn’t built a distributed system because their problem demanded one. They’d built it because microservices felt like the right thing to do in 2024. Like what serious engineers do.

That’s not an unusual story. The pattern repeats constantly: a team adopts microservices before they have the problems microservices solve, and then spends a year living with all the costs and none of the benefits.

This chapter is about making the decision honestly. Not “should we use microservices?” in the abstract, but: does your situation actually warrant the trade-offs? Because the trade-offs are real, they’re heavy, and the industry has spent the last decade systematically underselling them.

We’ll look at the full cost of distributed systems before we talk about the benefits. We’ll look at when the monolith is the correct answer. And we’ll introduce CinéTrack, the system we’ll decompose together throughout this book, and be honest about what that decomposition costs us.

By the end of this chapter, you’ll understand:

The distributed systems tax: the unavoidable costs of network latency, partial failure, operational complexity, and eventual consistency that every microservices deployment pays
When the monolith wins: team size thresholds, traffic patterns, and organizational signals that make a modular monolith the smarter default
Conway’s Law: why your org chart will eventually win against your architecture diagram, and how to use that fact intentionally
Readiness signals: the three concrete problems that tell you decomposition is worth it, and why “our codebase is big” isn’t one of them
CinéTrack decomposed: the eight services we’ll build throughout this book, why each one is a natural boundary, and what decomposition costs us compared to the monolith

1.2 The Distributed Systems Tax

Before we talk about service boundaries and team topologies, let’s talk about what you’re actually signing up for. Microservices come with a tax. You pay it whether or not you benefit from the architecture. Most introductions to microservices bury this. I’m not going to.

1.2.1 Network Latency

An in-process method call takes well under a microsecond. A network call to a service running on the same machine, once you add serialization and HTTP handling, takes roughly one millisecond. That’s at least a 1,000x difference. Over a real network, between availability zones or data centers, you’re looking at 1-5ms on a good day, 50-200ms when things get bumpy.

That number compounds. If loading a user’s watchlist requires calls to the User Service, the Catalog Service, and the Watchlist Service before you can return a response, you’ve stacked those latencies. Three sequential calls at 5ms each add 15ms to every request. Three sequential calls where one is slow add much more.
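To make the arithmetic concrete, here is a minimal sketch of how call latencies stack. The class name, the method names, and the 200ms "slow" figure are illustrative assumptions, not measurements from a real system. It only models the two simplest cases: calls made one after another, and independent calls made at the same time.

```java
import java.util.List;

// Back-of-the-envelope latency model for a request that fans out to several services.
final class LatencyBudget {

    // Sequential calls: each one starts only after the previous one returns, so latencies add up.
    static long sequentialMillis(List<Long> callLatenciesMillis) {
        return callLatenciesMillis.stream().mapToLong(Long::longValue).sum();
    }

    // Independent calls issued concurrently: the slowest call dominates the total.
    static long parallelMillis(List<Long> callLatenciesMillis) {
        return callLatenciesMillis.stream().mapToLong(Long::longValue).max().orElse(0L);
    }

    public static void main(String[] args) {
        List<Long> healthy = List.of(5L, 5L, 5L);    // User, Catalog, Watchlist at 5ms each
        List<Long> oneSlow = List.of(5L, 200L, 5L);  // hypothetical: one service has a bad moment

        System.out.println("healthy, sequential:  " + sequentialMillis(healthy) + "ms");
        System.out.println("one slow, sequential: " + sequentialMillis(oneSlow) + "ms");
        System.out.println("one slow, parallel:   " + parallelMillis(oneSlow) + "ms");
    }
}
```

Running calls in parallel helps only when they don’t depend on each other’s results, and even then the slowest dependency sets the floor.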

In a monolith, watchlistRepository.findByUserId(userId) is one method call followed by one query to PostgreSQL. All the application logic lives in the same JVM: no serialization between services, no HTTP overhead. Your p99 latency on that operation might be 2ms under load.

In a distributed system, that same operation might involve an HTTP call from the Watchlist Service to the Catalog Service to enrich titles with poster URLs, a Redis cache check in the Recommendation Service, and an async write to an audit log. Even on a fast internal network, you’re not getting under 10ms on the happy path.

This isn’t a reason to avoid microservices. It’s a cost you need to know you’re paying.

1.2.2 Partial Failures

In a monolith, a failure is usually binary. The process is up or it isn’t. A database connection pool is exhausted or it isn’t.

In a distributed system, partial failure is the normal operating mode. A service can be up but slow. It can be responding to some requests and dropping others. It can be healthy from the load balancer’s perspective but returning errors for a specific operation.

The real problem isn’t that services fail. It’s that failures propagate. A team I know had a Recommendation Service that started timing out under load, 30 seconds instead of the usual 50ms. Their API Gateway was configured to wait for all service responses before returning, so every page load started timing out too. Their Catalog Service was fine. Their Watchlist Service was fine. But because the Recommendation Service was struggling, their entire product was down.

That’s the cascade failure pattern. One slow service takes down every caller that depends on it, because each caller is holding a thread or a connection while it waits. A monolith is far less exposed to this: an in-process call that fails throws an exception immediately, and no caller sits waiting on a network timeout.

You solve cascade failures with circuit breakers, timeouts, bulkheads, and fallbacks. All of that is real engineering work that doesn’t exist in a monolith. You’re not just choosing an architecture; you’re choosing an entire category of operational concerns.
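As a rough illustration of two of those tools, the sketch below wraps a remote call in a timeout with a fallback, using only the JDK’s CompletableFuture. The class name, the helper methods, and the timings are assumptions made up for the example. A production service would more likely use a resilience library, and would add a circuit breaker and a bulkhead on top.

```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class TimeoutWithFallback {

    // Runs the call asynchronously; returns the fallback if it is too slow or if it fails.
    // Note: the slow call keeps running in the background. This sketch bounds the caller's
    // wait time only; it does not cancel the work or limit concurrency.
    static <T> T callWithTimeout(Supplier<T> remoteCall, Duration timeout, T fallback) {
        return CompletableFuture.supplyAsync(remoteCall)
                .completeOnTimeout(fallback, timeout.toMillis(), TimeUnit.MILLISECONDS)
                .exceptionally(error -> fallback)
                .join();
    }

    // Stand-in for a Recommendation Service call that has become slow.
    static List<String> slowRecommendations() {
        try {
            Thread.sleep(1_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return List.of("Film A", "Film B");
    }

    public static void main(String[] args) {
        List<String> noRecommendations = List.of();
        List<String> result = callWithTimeout(
                TimeoutWithFallback::slowRecommendations, Duration.ofMillis(50), noRecommendations);
        System.out.println("recommendations shown: " + result);
    }
}
```

The design choice worth noticing is the fallback value: the page renders without recommendations instead of not rendering at all.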

1.2.3 Operational Complexity

Deploying a monolith: one artifact, one docker run, one set of logs, one process to monitor.

Deploying eight microservices: eight CI/CD pipelines, eight Docker images, eight Kubernetes deployments, eight sets of logs in different pods, eight health checks, eight sets of environment variables, eight database migrations that need to stay compatible with each other.

And those eight services aren’t independent in practice. They have dependencies. Catalog Service must be up before Recommendation Service can warm its cache. User Service must be up before anything that checks authentication. If you deploy them in the wrong order, things break in non-obvious ways.

Add infrastructure: an API Gateway, a service registry, a distributed tracing system (because you need to correlate requests across eight services), a secrets manager (because eight services each need credentials), and a config service (because application.yml is now eight files that need to stay consistent).

The operational surface area is roughly eight times larger. So is your on-call burden.

1.2.4 Eventual Consistency

In a monolith with a single PostgreSQL database, a transaction is either committed or it isn’t. You can SELECT immediately after an INSERT and see the data. That’s strong consistency, and it’s what most developers expect.

In a microservices architecture, each service owns its own database. That’s not optional; it’s the point. If services share a database, you haven’t built microservices, you’ve built a distributed monolith with all the costs and none of the isolation.

When data lives in multiple databases, keeping it consistent requires synchronization. In CinéTrack, when a user marks a movie as watched, that event needs to update their Watchlist Service record and eventually propagate to the Recommendation Service to update their taste profile. There’s a window, possibly a few hundred milliseconds, possibly a few seconds if Kafka is under load, where those two views are inconsistent.

Eventual consistency means the system will converge to a consistent state, given time. For most features that’s fine. For some it isn’t. Payment systems, inventory counts, anything where two concurrent operations can produce an incorrect total: these require careful design that goes well beyond what a single database transaction gives you.
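The inconsistency window is easier to see in a toy model. The sketch below uses in-memory collections as stand-ins for the two services’ databases and for a broker topic; every name in it is hypothetical, and it ignores the harder problem of making the local write and the publish atomic.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

final class EventualConsistencyDemo {

    // Stand-in for the Watchlist Service's own database.
    private final Set<String> watchedInWatchlist = new HashSet<>();
    // Stand-in for the Recommendation Service's own database.
    private final Set<String> watchedInRecommendations = new HashSet<>();
    // Stand-in for a broker topic: events wait here until the consumer gets to them.
    private final Queue<String> movieWatchedEvents = new ArrayDeque<>();

    // The Watchlist side: the local write is visible at once, the event is only queued.
    void markWatched(String movieId) {
        watchedInWatchlist.add(movieId);
        movieWatchedEvents.add(movieId);
    }

    // The Recommendation side: applies whatever events have arrived so far.
    void consumePendingEvents() {
        String movieId;
        while ((movieId = movieWatchedEvents.poll()) != null) {
            watchedInRecommendations.add(movieId);
        }
    }

    boolean viewsAgree() {
        return watchedInWatchlist.equals(watchedInRecommendations);
    }

    public static void main(String[] args) {
        EventualConsistencyDemo system = new EventualConsistencyDemo();
        system.markWatched("movie-42");
        System.out.println("before consumer runs, views agree: " + system.viewsAgree());
        system.consumePendingEvents();
        System.out.println("after consumer runs, views agree:  " + system.viewsAgree());
    }
}
```

Between markWatched and consumePendingEvents the two views disagree. In a real deployment, the length of that gap depends on broker and consumer load rather than on a method call you control.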

1.2.5 Debugging Is Hard

In a monolith, you can add a breakpoint and step through the entire request. The stack trace tells you exactly where something failed. One log stream, one thread, one process.

In a distributed system, a single user request might touch five services. When it fails, you have five log streams to correlate, timestamps that are only as accurate as your NTP configuration, and no single point of truth about what happened in what order.

Distributed tracing tools like Zipkin or Tempo give you a unified view, but they require instrumentation, they add latency to every request, and they add another piece of infrastructure to operate. Even with perfect tracing, debugging a subtle race condition across three services at 2am is a different experience from debugging the same logic in a monolith.
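Stripped to its core, correlating those log streams means tagging every line with a shared request identifier and stitching the lines back together. The sketch below does that over an in-memory list; the LogLine record, its fields, and the sample data are assumptions for illustration. Real tracing systems order work by parent and child span relationships rather than by raw timestamps, precisely because clocks on different machines drift.

```java
import java.util.Comparator;
import java.util.List;

final class TraceTimeline {

    // One log line, tagged with the ID of the request that produced it.
    record LogLine(String service, String traceId, long timestampMillis, String message) {}

    // Collects the lines for one request across all services, ordered by timestamp.
    static List<LogLine> timeline(List<LogLine> allLines, String traceId) {
        return allLines.stream()
                .filter(line -> line.traceId().equals(traceId))
                .sorted(Comparator.comparingLong(LogLine::timestampMillis))
                .toList();
    }

    public static void main(String[] args) {
        List<LogLine> lines = List.of(
                new LogLine("catalog", "t-1", 1_020, "loaded 12 titles"),
                new LogLine("gateway", "t-2", 1_005, "GET /search"),
                new LogLine("gateway", "t-1", 1_000, "GET /watchlist"),
                new LogLine("watchlist", "t-1", 1_010, "found 12 items"));

        timeline(lines, "t-1").forEach(line ->
                System.out.println(line.timestampMillis() + " " + line.service() + " " + line.message()));
    }
}
```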

Warning

The distributed systems tax is not a reason to never use microservices. It’s a reason to be deliberate. You should pay this tax knowing exactly what you’re getting in return. In sections 1.5 and 1.6 we’ll look at the signals that tell you the trade is worth it.

None of this is meant to discourage you. It’s meant to make the decision honest. The conference talks and the case studies from Netflix and Uber focus on the benefits of microservices, because Netflix and Uber needed microservices and got those benefits. Most teams aren’t Netflix. The next section is about what to do instead.

1.3 When the Monolith Wins

The word “monolith” has somehow become an insult. Engineers use it the way they use “legacy”: as a shorthand for something old and embarrassing that needs to be replaced. That framing is wrong, and it’s cost a lot of teams a lot of time.

A monolith is not a bad architecture. It’s an appropriate architecture for a specific set of circumstances. Those circumstances apply to more teams than you might think.

1.3.1 Team Size

Here’s a number that doesn’t appear in enough microservices discussions: ten. If your team has fewer than ten engineers actively working on the same codebase, microservices will hurt more than they help.

Microservices exist to let teams move independently. Different teams can deploy, scale, and evolve their services without coordinating with each other. That independence is the primary benefit. But if there’s only one team, or two teams of three, that benefit doesn’t exist. You’re paying the full distributed systems tax for an organizational benefit that your organization doesn’t need.

With five engineers, microservices don’t give you deployment independence. They give you five times more things to break, five more CI/CD pipelines to maintain, and a distributed tracing setup that three of your five engineers have never used before.

Shopify grew to hundreds of engineers on a single Rails monolith, and then modularized it rather than breaking it apart. Stack Overflow still runs as one, handling 1.3 billion pageviews per month. Basecamp was famously built and maintained by a small team on Rails. These aren’t failures of imagination. They’re correct engineering decisions.

1.3.2 Traffic

“We’ll need to scale” is one of the most expensive assumptions in software engineering. Scale to what, exactly? Based on what projections?

If your entire application fits comfortably on one reasonably sized server, you don’t have a scaling problem. Vertical scaling, adding more CPU, more RAM, a better instance type, is cheaper and simpler than horizontal scaling for most workloads. PostgreSQL can handle tens of thousands of transactions per second on decent hardware. A well-tuned Spring Boot application on a 32-core machine can handle enormous throughput.

The scenario where microservices’ per-service scaling actually pays off requires one component to need dramatically more capacity than the others, and requires that imbalance to be large enough to justify the decomposition cost. In CinéTrack, that might eventually be true: the Catalog Service gets far more read traffic than the Review Service. But “eventually” is doing a lot of work in that sentence. If you’re not at the scale where the imbalance is measurable and painful, you’re not at the scale where per-service scaling is worth it.

1.3.3 Early-Stage Products

Building a startup on microservices is almost always wrong. Not sometimes wrong. Almost always.

The reason: microservices require you to know your service boundaries before you build. Service boundaries come from understanding your domain deeply. Domain understanding comes from running the product and watching how users actually use it. You don’t have that on day one.

If you start with eight microservices and discover six months later that your initial domain model was wrong, you’re refactoring eight services, eight databases, eight contracts, and eight deployments. If you start with a well-structured monolith and discover the same thing, you’re moving packages and refactoring a few database tables.

Build the monolith. Learn the domain. Extract services when you have concrete evidence that extraction solves a real problem.

1.3.4 The Modular Monolith

The modular monolith deserves more attention than it gets. The idea is simple: structure your monolith with the same domain boundaries you’d use in microservices, but keep it as a single deployable unit.

In a modular monolith, each domain module has its own package hierarchy, its own database schema (or at least its own table prefix), and its own public API: a set of interfaces that other modules can call. Direct field access and cross-module table joins are banned by convention or enforced by architecture tests. The User module never reaches into the Catalog module’s tables.

This gives you most of the organizational benefits of microservices, none of the network complexity, and a clear migration path if you ever need to extract a module into a real service. The module boundaries you’ve maintained become your service boundaries. The migration is a matter of adding network calls where there were method calls.
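Here is a minimal sketch of what such a module boundary can look like in plain Java. All the type names are hypothetical, and everything is squeezed into one file for brevity; in a real project the Catalog implementation would sit in its own package with package-private visibility, and an architecture test would guard the boundary.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Public API of a hypothetical Catalog module: the only type other modules may depend on.
interface CatalogApi {
    Optional<TitleSummary> findTitle(long titleId);

    // The data shape the Catalog module agrees to expose. Its tables stay private.
    record TitleSummary(long id, String title, String posterUrl) {}
}

// Catalog internals. Other modules never reference this class directly.
final class InMemoryCatalog implements CatalogApi {
    private final Map<Long, TitleSummary> titles;

    InMemoryCatalog(Map<Long, TitleSummary> titles) {
        this.titles = Map.copyOf(titles);
    }

    @Override
    public Optional<TitleSummary> findTitle(long titleId) {
        return Optional.ofNullable(titles.get(titleId));
    }
}

// The Watchlist module sees the Catalog only through CatalogApi.
final class WatchlistModule {
    private final CatalogApi catalog;

    WatchlistModule(CatalogApi catalog) {
        this.catalog = catalog;
    }

    // Resolves watchlist entries to display titles without touching Catalog storage.
    List<String> describe(List<Long> titleIds) {
        return titleIds.stream()
                .map(id -> catalog.findTitle(id)
                        .map(CatalogApi.TitleSummary::title)
                        .orElse("(unknown title)"))
                .toList();
    }
}
```

If the Catalog module is later extracted into a service, WatchlistModule keeps the same dependency; only the implementation behind CatalogApi changes from a method call to a network call.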

Sam Newman, who literally wrote the book on microservices, has said that he considers the modular monolith the sensible default for most teams. That’s not a hedge; that’s a strong position from someone who deeply understands the costs.

1.3.5 Anti-Patterns to Name

Distributing for scale you don’t have. Engineering for 1,000x current load on day one is a form of premature optimization. It’s expensive, it’s slow, and the load might never arrive.

Microservices as org chart flattery. This one is subtle. Sometimes teams propose microservices not because the architecture fits the problem, but because “we run microservices” sounds good in hiring materials and engineering blog posts. The architecture becomes a status signal rather than a technical decision. The outcome is a distributed system with all the costs and an org chart that doesn’t actually benefit from the independence.

The distributed monolith. This is the worst outcome: you decomposed your application into services, but they all share a database, or they’re coupled so tightly through synchronous calls that you can’t deploy one without coordinating with all the others. You’ve paid the full distributed systems tax and gotten zero organizational benefit. You can recognize a distributed monolith by the fact that deploying Service A requires deploying Services B, C, and D in a specific order. If that’s your situation, you’d be better off with a real monolith.

A monolith you understand, can deploy in five minutes, and can debug with a single log file is better than a distributed system you can’t operate. The goal is to ship software that works. Choose the architecture that makes that most likely for your team, at your scale, with your actual organizational structure.

The next question, then, is: if you do have the scale and the org structure to justify microservices, how do you find the right boundaries?

1.4 Conway’s Law and Team Topology

In 1967, Melvin Conway wrote a paper, published the following year, that most engineers have heard quoted but few have actually read. His observation was this: organizations that design systems are constrained to produce designs that are copies of their communication structures.

That’s the formal version. The practical version: your architecture will eventually look like your org chart, whether you want it to or not.

1.4.1 Why Conway’s Law Wins

When two teams share ownership of a service, those teams need to coordinate every change. Shared ownership means shared deployments, shared planning meetings, shared code reviews, shared on-call rotations. The coordination overhead grows with every interaction.

Human beings solving coordination overhead do the natural thing: they minimize the surface area of coordination. They agree on an interface and stop talking across it. They build things they can own end-to-end. They stop depending on that other team’s code.

And so the service boundary emerges, not from a careful domain analysis, but from organizational friction. Two teams that don’t communicate well will produce two components that don’t communicate well. Those components become services. The service boundaries mirror the team boundaries.

This isn’t bad. It’s physics. You can’t fight it sustainably. What you can do is use it intentionally.

1.4.2 The Inverse Conway Maneuver

The inverse Conway maneuver flips the direction. Instead of letting your org structure determine your architecture, you design the architecture you want and then organize your teams to match it.

If you’ve decided that Catalog and Watchlist should be independent services with a clean API boundary, you put one team on Catalog and a different team on Watchlist. The organizational separation enforces the technical separation. Those two teams have no reason to reach into each other’s databases or call each other’s private methods because they don’t share a codebase.

This is why microservices and team structure are inseparable conversations. You can’t design a service boundary without asking who owns the service. And you can’t answer that without asking whether a single team can own it end-to-end.

“End-to-end” is important. A team that owns a service should be able to design it, build it, test it, deploy it, and operate it in production without waiting on anyone else. If they can’t, the service boundary is wrong or the team structure is wrong.

1.4.3 Team Topology Patterns

Matthew Skelton and Manuel Pais systematized these ideas in Team Topologies. They describe four team types, and understanding them helps you design both your architecture and your organization.

Stream-aligned teams own a product flow end-to-end. In CinéTrack, a stream-aligned team might own the “discovery and browsing” flow: the Catalog Service, the Search Service, and the Recommendation Service. They handle everything from the API contract to the database schema to the on-call rotation. They deploy independently. They set their own roadmap within product constraints.

Platform teams build the internal infrastructure that stream-aligned teams use. Kubernetes configuration, CI/CD pipelines, observability tooling, secret management: all of this is invisible complexity that every service needs but no stream-aligned team should have to maintain themselves. A platform team owns that shared platform and exposes it as a product to the stream-aligned teams. They measure their success by how little the stream-aligned teams think about infrastructure.

Enabling teams exist temporarily to help stream-aligned teams adopt new practices. If your organization is migrating to distributed tracing, an enabling team might spend a quarter working alongside each stream-aligned team to get instrumentation right, then step back. They transfer knowledge; they don’t own services.

Complicated-subsystem teams own services that require specialized expertise. If your Recommendation Service runs a machine learning model that needs ML engineers, that’s a complicated subsystem. Most teams won’t have one, but recognizing the pattern prevents you from asking your average backend engineers to maintain something that genuinely requires a specialist.

1.4.4 The Concrete Example

Here’s the anti-pattern: one team owns both the Catalog Service and the Watchlist Service. Both services are in the team’s backlog. The team leads decide it’d be convenient if the Watchlist Service could directly query the catalog’s PostgreSQL database rather than going through an HTTP API. It’s faster to build, no network call overhead.

Six months later, the team wants to change the Catalog Service’s database schema. But the Watchlist queries depend on the old table structure. Every schema migration now requires checking for Watchlist dependencies. The “two services” have merged at the data layer. You have a distributed monolith.

Now the opposite: one team owns Catalog, a different team owns Watchlist. The Watchlist team needs movie metadata. They open a ticket asking the Catalog team for an API endpoint. The Catalog team builds /api/catalog/titles/{id} with a defined contract. Both teams agree never to cross that boundary in any other way.

Two months later the Catalog team wants to migrate from PostgreSQL to a hybrid PostgreSQL and Redis setup for read performance. They change their internals, update their endpoint implementation, and deploy. The Watchlist team doesn’t notice. The API contract didn’t change.

That’s the benefit. Clean boundaries enforced by team structure, not by willpower.

Note

Team Topologies by Skelton and Pais is worth reading before you finalize your service decomposition. It’s one of the clearest frameworks for matching organizational structure to technical architecture.

Conway’s Law tells you that your service design and your team design are the same problem. Get the team structure wrong and the architecture will drift toward whatever your teams’ communication patterns are, regardless of what the original design said. The inverse Conway maneuver is the technique for using that force deliberately rather than fighting it.

So: you know the cost of distributed systems, you know when to avoid them, and you know that team structure drives architecture. The last question before we look at CinéTrack: how do you know when your specific situation is ready?

1.5 Microservices Readiness Signals

Most discussions of when to adopt microservices offer vague criteria. “When your monolith becomes hard to manage.” “When teams start stepping on each other’s toes.” These aren’t signals; they’re feelings. Feelings are easy to rationalize.

Here are three concrete signals. When you see them, decomposition is solving a real problem. Before you see them, decomposition is creating problems you don’t have yet.

1.5.1 Signal 1: Independent Deployability Is Blocked

You’re ready to deploy a change to the Review feature. It’s done, tested, reviewed. But it requires a coordinated release with the User team because both touch a shared user_reviews table in the same database, and the migration has to run in a specific order. So you wait. You schedule the release for the next deployment window. Your change sits in a branch for three days.

This is the signal. Independent deployability blocked by shared code or a shared database is the most direct indicator that microservices would give you something real: the ability for each team to deploy on their own cadence without coordinating with others.

Measure it. Track how often a deployment requires more than one team to coordinate. If that number is high and growing, you’re hitting a real organizational scaling limit.

Note the specificity: it’s not “our deployments are slow” or “our codebase is big.” It’s that teams are blocked on each other at the deployment step, repeatedly, in ways that slow down delivery. That’s the signal.
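One simple way to put a number on this is the share of deployments that involved more than one team. The sketch below computes it from a list of deployment records; the Deployment record and its fields are assumptions, and in practice the data would come from your release tooling or change tickets.

```java
import java.util.List;
import java.util.Set;

final class DeploymentCoordination {

    // A deployment and the teams that had to take part in it.
    record Deployment(String id, Set<String> teamsInvolved) {}

    // Fraction of deployments that needed more than one team, between 0.0 and 1.0.
    static double coordinatedShare(List<Deployment> deployments) {
        if (deployments.isEmpty()) {
            return 0.0;
        }
        long coordinated = deployments.stream()
                .filter(deployment -> deployment.teamsInvolved().size() > 1)
                .count();
        return (double) coordinated / deployments.size();
    }

    public static void main(String[] args) {
        List<Deployment> lastSprint = List.of(
                new Deployment("rel-101", Set.of("review")),
                new Deployment("rel-102", Set.of("review", "user")),
                new Deployment("rel-103", Set.of("catalog")),
                new Deployment("rel-104", Set.of("watchlist")));

        System.out.println("coordinated share: " + coordinatedShare(lastSprint));
    }
}
```

The absolute value matters less than the trend. Tracked per month, a rising share is the evidence the paragraph above is asking for.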

What it looks like in CinéTrack: The original monolith had User, Catalog, Watchlist, and Review features all deploying together. Any bug fix in Review required a full regression pass across User and Catalog, because they shared domain logic and a database connection pool. Separate services mean separate deployments.

1.5.2 Signal 2: Data Autonomy Is Violated

A service reaching into another service’s database tables directly is one of the clearest signs that your current boundaries are wrong and that you’re paying the operational cost of coupling without the benefit.

This shows up as cross-service JOIN queries. Service A’s repository layer contains a SELECT * FROM service_b_table JOIN service_a_table WHERE .... Maybe it started as a “temporary” convenience. It never stayed temporary.

Once you have cross-service database access, schema changes in one service break queries in another. Migrations require coordination. You can’t swap Service B’s database technology because Service A is now tightly coupled to its schema.

The right answer is an API call, a Kafka event, or a local denormalized read model. Any of those is better than direct table access, even if each has a cost. Service B’s data is Service B’s data. No one else should be able to read it except through a contract B explicitly publishes.

What it looks like in CinéTrack: The Watchlist Service needs the title and poster URL for each item in a user’s watchlist. In the monolith, it just queried the movies table. In microservices, it calls the Catalog Service’s API. That’s one extra network hop. It’s worth it, because now the Catalog team can evolve their schema independently.
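As a sketch of what that extra hop looks like from the Watchlist side, the code below builds a request for an endpoint such as /api/catalog/titles/{id} with the JDK’s HTTP client types. The base URL, the class name, and the two-second timeout are assumptions for illustration; a Spring Boot service would normally use one of Spring’s HTTP clients and resolve the host through configuration or service discovery.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

final class CatalogRequests {

    // Hypothetical base URL; in practice this comes from configuration or service discovery.
    static final String CATALOG_BASE_URL = "http://catalog-service:8080";

    // Builds (but does not send) the request for one title's public representation.
    static HttpRequest titleById(long titleId) {
        return HttpRequest.newBuilder(URI.create(CATALOG_BASE_URL + "/api/catalog/titles/" + titleId))
                .header("Accept", "application/json")
                .timeout(Duration.ofSeconds(2)) // bound the wait on another service
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = titleById(42);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The Watchlist Service knows the URL shape and the JSON it gets back, and nothing about the tables behind them. That is the whole contract.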

1.5.3 Signal 3: Scaling Mismatch

Your Review Service handles maybe 200 writes per day. Users don’t post that many reviews. Your Catalog Service handles 50,000 reads per day from search, browse, and recommendation queries. In the monolith, they share a JVM, a thread pool, and a database connection pool. To give Catalog enough resources to handle its load, you have to over-provision the entire application.

When one component needs dramatically more capacity than another, and when that imbalance is large enough to matter in your infrastructure costs, per-service scaling pays off. You can run ten instances of the Catalog Service and two instances of the Review Service. You can put the Catalog Service’s database on read replicas without giving Review a read replica it doesn’t need.

The key word is “dramatically.” A 2x difference doesn’t justify the overhead. A 50x difference might. The imbalance also has to be real and measurable, not projected. If you’re scaling preemptively for traffic you’re hoping to get, you’re not responding to a readiness signal; you’re borrowing against an uncertain future.
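For the numbers in this section the arithmetic is quick: 50,000 Catalog requests per day against 200 Review requests per day is a 250x imbalance. The sketch below encodes that check; the class and method names are made up, and the 50x threshold is only the rough figure from the paragraph above, not a rule.

```java
final class ScalingMismatch {

    // How many times more load the busy component carries than the quiet one.
    static double imbalanceRatio(long busyRequestsPerDay, long quietRequestsPerDay) {
        if (quietRequestsPerDay <= 0) {
            throw new IllegalArgumentException("quietRequestsPerDay must be positive");
        }
        return (double) busyRequestsPerDay / quietRequestsPerDay;
    }

    // True when the measured imbalance reaches the threshold you decided matters.
    static boolean worthConsidering(double ratio, double threshold) {
        return ratio >= threshold;
    }

    public static void main(String[] args) {
        double ratio = imbalanceRatio(50_000, 200); // measured values, not projections
        System.out.println("imbalance: " + ratio + "x, worth considering: "
                + worthConsidering(ratio, 50.0));
    }
}
```

Feed it measured traffic only. Plugging in projected numbers reproduces exactly the mistake this section warns against.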

1.5.4 What Isn’t a Signal

“Our monolith is too big.” Size is a proxy metric for the real problem, which is coordination overhead or coupling. A 200,000-line codebase with clear module boundaries and clean ownership is fine. A 20,000-line codebase where every change requires touching twenty files spread across six unmaintained packages is not. Fix the structure. Decompose if and when the three signals above appear.

“We want to use different tech stacks.” You can use Kotlin for one service and Java for another. You can run Go for a performance-sensitive component. That’s a real benefit of microservices. But it’s not a reason to adopt microservices. It’s a side effect. Polyglot diversity in services has a cost too: more toolchain expertise required, more CI/CD configuration, harder onboarding for new engineers. Adopt it when it solves a specific bottleneck, not as a default.

“Microservices are the industry standard.” This is social proof, not technical reasoning. The industry standard for consumer internet companies at scale is microservices. The industry standard for enterprise line-of-business applications is still largely monolithic. And the industry standard for startups with fewer than fifty engineers is increasingly recognized to be modular monoliths, not microservices. “Industry standard” is a shortcut for not thinking about your specific situation.

1.5.5 CI/CD Maturity as a Prerequisite

One more hard requirement: you need mature CI/CD before microservices make sense. Not “we have Jenkins somewhere.” Mature. Each service needs its own automated build, automated test suite, automated deployment to staging, and ideally automated deployment to production on green builds.

Without that, microservices don’t give you deployment independence. They give you eight services that are all harder to deploy manually than one monolith was. The coordination overhead you were trying to eliminate comes back as deployment ceremony.

If your current monolith takes four hours and two engineers to deploy, fix that first. A fast automated deployment pipeline for a monolith is a useful thing in itself, and it’s the foundation you’ll need for per-service deployments anyway.

Once you see the signals, and your CI/CD is ready, the question becomes: where exactly do you draw the boundaries? That’s what CinéTrack will show us.

1.6 Our CinéTrack System

If you worked through the Spring Boot 4 book, you built CinéTrack as a monolith. One application, one database, all features packaged together. It worked. It was deployable in a single command, debuggable in a single log stream, and understandable as a single codebase.

Now we’re going to take it apart.

Not because the monolith was wrong. It was the right starting point. We’re taking it apart because we now understand the domain well enough to draw real boundaries, we have concrete scaling evidence from running it in production, and we have the team structure to support independent ownership. Those are the three conditions that make decomposition worthwhile.

1.6.1 The Eight Services

Here’s what CinéTrack looks like decomposed. For each service, I’ll tell you what it owns, why it’s a natural boundary, and what events it publishes or consumes.

API Gateway runs Spring Cloud Gateway. It’s the single entry point for all external traffic. It handles routing, rate limiting, and JWT validation before requests reach any downstream service. The gateway doesn’t have its own database. It holds no state. It’s a routing and enforcement layer, not a domain service.

User Service owns identity and authentication. It runs Spring Authorization Server backed by PostgreSQL and handles registration, login, profile management, and OAuth2 token issuance. Everything about who a user is and whether they’re authenticated lives here. It publishes a UserRegistered event to Kafka when a new account is created, and a UserDeactivated event when an account is removed. No other service reads the users table directly.

Catalog Service owns movie and TV show metadata. It pulls data from The Movie Database API, caches it in Redis, and persists it to PostgreSQL. Browse, search by genre or release year, and title detail all go through this service. It publishes TitleAdded and TitleUpdated events when its catalog changes. This is the highest-read-volume service in the system: roughly 80% of traffic hits Catalog. That imbalance is one of the concrete reasons to run it as a separate service.

Watchlist Service owns the relationship between users and titles. Adding a title to “want to watch,” marking it as watched, removing it from a list: all Watchlist. It stores user IDs and title IDs in PostgreSQL and enriches them with title metadata from the Catalog Service’s API at read time. It publishes TitleWatched and WatchlistUpdated events, which the Recommendation Service consumes.
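The read-time enrichment described above can be sketched in plain Java. This is a minimal sketch, not CinéTrack’s actual code: WatchlistItem, TitleSummary, EnrichedItem, CatalogClient, and WatchlistView are hypothetical names, and the CatalogClient interface stands in for an HTTP call to the Catalog Service’s API. The placeholder name used when Catalog has no data is also an assumption; a real service might prefer to fail or serve a cached value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical row in the Watchlist Service's own database: IDs only, no title metadata.
record WatchlistItem(long userId, long titleId, boolean watched) {}

// Hypothetical subset of what the Catalog Service's API returns for a title.
record TitleSummary(long titleId, String name, int releaseYear) {}

// Stand-in for an HTTP client to the Catalog Service. In a real service this
// would be a remote call that can fail or time out.
interface CatalogClient {
    Optional<TitleSummary> findTitle(long titleId);
}

// What the Watchlist Service returns to callers: its own data plus Catalog metadata.
record EnrichedItem(long titleId, String name, boolean watched) {}

final class WatchlistView {
    private final CatalogClient catalog;

    WatchlistView(CatalogClient catalog) {
        this.catalog = catalog;
    }

    // Enrich stored items at read time. If Catalog has no data for a title,
    // fall back to a placeholder name rather than failing the whole request
    // (an assumption of this sketch, not a rule from the text).
    List<EnrichedItem> enrich(List<WatchlistItem> items) {
        List<EnrichedItem> result = new ArrayList<>();
        for (WatchlistItem item : items) {
            String name = catalog.findTitle(item.titleId())
                    .map(TitleSummary::name)
                    .orElse("(title unavailable)");
            result.add(new EnrichedItem(item.titleId(), name, item.watched()));
        }
        return result;
    }
}
```

The Watchlist side never sees Catalog’s tables. It holds title IDs and asks the owning service for everything else, which is what keeps the two databases independent.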

Review Service owns written reviews and star ratings. Users write one review per title. Reviews are moderated before they’re publicly visible. The Review Service has its own PostgreSQL database with a reviews table that no other service touches. It publishes ReviewPublished events when a review passes moderation.

Recommendation Service maintains a read-optimized model of each user’s taste profile in Redis. It consumes TitleWatched events from the Watchlist Service and ReviewPublished events from the Review Service to update a user’s preferences. It doesn’t write to any relational database. Its only job is to answer the question: “What should this user watch next?” quickly. This is a CQRS read model: it’s optimized purely for reads, built by processing write events from other services. We’ll explore this pattern in depth in Chapter 10.
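To make the read-model idea concrete before Chapter 10, here is a rough sketch in plain Java. Everything in it is an assumption made for illustration: the event records carry a genre field that the real events may not have, the scoring weights are invented, and a HashMap stands in for Redis. What it does show is the shape of the pattern: write events come in, a precomputed structure is updated, and reads are simple lookups.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical event payloads; the real events may carry different fields.
record TitleWatched(long userId, long titleId, String genre) {}
record ReviewPublished(long userId, long titleId, String genre, int stars) {}

// In-memory stand-in for the Redis taste profile: userId -> (genre -> score).
final class TasteProfileProjection {
    private final Map<Long, Map<String, Integer>> scores = new HashMap<>();

    // Watching a title counts as a weak positive signal for its genre
    // (the weight of 1 is an assumption of this sketch).
    void on(TitleWatched event) {
        add(event.userId(), event.genre(), 1);
    }

    // A review shifts the score by its distance from a neutral 3 stars.
    void on(ReviewPublished event) {
        add(event.userId(), event.genre(), event.stars() - 3);
    }

    // Read side: a lookup only, with no joins and no calls to other services.
    int score(long userId, String genre) {
        return scores.getOrDefault(userId, Map.of()).getOrDefault(genre, 0);
    }

    private void add(long userId, String genre, int delta) {
        scores.computeIfAbsent(userId, id -> new HashMap<>())
              .merge(genre, delta, Integer::sum);
    }
}
```

Because the projection is derived entirely from events, it can be thrown away and rebuilt by replaying them, which is part of why this service does not need strong consistency.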

Notification Service sends emails. Registration confirmation, watchlist reminders, review moderation results: all routed through here. It consumes Kafka events from other services and uses SMTP to deliver email. It has a small PostgreSQL database for delivery tracking and preference settings. It doesn’t expose an HTTP API to other services. It only listens to events.

Search Service runs Elasticsearch and handles full-text search across titles and reviews. The Catalog Service and Review Service publish events that the Search Service consumes to keep its index current. When a user searches for “80s sci-fi with time travel,” that query goes to the Search Service, not to the Catalog Service’s PostgreSQL database. Elasticsearch handles the text relevance ranking that PostgreSQL’s LIKE queries can’t.

1.6.2 Why These Boundaries

None of these boundaries are arbitrary. Each service maps to a domain that has its own data, its own rate of change, and a clear ownership story.

User and Catalog change for completely different reasons. User Service changes when authentication requirements evolve: a new OAuth2 provider, a new profile field, a change to how sessions work. Catalog Service changes when TMDB integration changes or when we add new browsing features. These have nothing to do with each other. Bundling them together would mean deploying authentication changes every time we update how movies display.

Recommendation Service is separate because it has completely different infrastructure requirements. It needs a fast in-memory store, it processes events asynchronously, and it never needs strong consistency. Putting it in the same service as anything that needs relational transactions would be the wrong tool for both problems.

Notification Service is isolated by design. Notifications are fire-and-forget. They should never block a request, they can tolerate significant latency, and they have a delivery tracking concern (did this email actually send?) that doesn’t belong in any other domain. It also needs to stay isolated for security reasons: SMTP credentials and email templates don’t belong in a service that also handles movie metadata.

1.6.3 The Key Events

The services are connected by Kafka events. Here’s the primary flow when a user marks a movie as watched:

User clicks “Mark as Watched” in the client.
The request hits the API Gateway, which validates the JWT.
Watchlist Service receives PATCH /watchlist/items/{titleId}, updates the record, and publishes TitleWatched to Kafka.
Recommendation Service consumes TitleWatched, updates the user’s Redis taste profile.
Notification Service consumes TitleWatched, checks if the user wants “completed” notifications, sends email if configured.

The Watchlist Service doesn’t know the Recommendation Service exists. It publishes an event describing what happened and moves on. The consumers decide what to do with it. This is loose coupling: adding a new consumer of TitleWatched events (say, a social sharing feature) requires no changes to the Watchlist Service.
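The decoupling is easier to see in code. The sketch below is plain Java with an in-memory Topic class standing in for a Kafka topic; Topic, TitleWatchedEvent, and WatchlistPublisher are hypothetical names, and delivery here is synchronous and in-process, which Kafka’s is not. The only thing it is meant to show is the direction of the dependencies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical event payload.
record TitleWatchedEvent(long userId, long titleId) {}

// In-memory stand-in for a Kafka topic: publishers and subscribers
// both depend on the topic, never on each other.
final class Topic<E> {
    private final List<Consumer<E>> subscribers = new ArrayList<>();

    void subscribe(Consumer<E> subscriber) {
        subscribers.add(subscriber);
    }

    void publish(E event) {
        // Unlike Kafka, delivery here is synchronous and in-process.
        for (Consumer<E> subscriber : subscribers) {
            subscriber.accept(event);
        }
    }
}

// The publisher knows the topic and the event shape, nothing else.
final class WatchlistPublisher {
    private final Topic<TitleWatchedEvent> topic;

    WatchlistPublisher(Topic<TitleWatchedEvent> topic) {
        this.topic = topic;
    }

    void markWatched(long userId, long titleId) {
        // Persisting the watchlist record is omitted from this sketch.
        topic.publish(new TitleWatchedEvent(userId, titleId));
    }
}
```

Adding the social sharing consumer is one more call to subscribe on the topic. WatchlistPublisher is not edited, recompiled, or redeployed.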

1.6.4 What Decomposition Costs Us

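Let’s be direct. Compared to the monolith, this architecture is harder to run locally. You’ll need Docker Compose with at least eight service containers, plus containers for Kafka, PostgreSQL, Redis, and Elasticsearch, instead of one application and one database. Integration tests become slower and more complex.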

Debugging a user’s “mark as watched” flow now requires looking at logs in the API Gateway, the Watchlist Service, the Recommendation Service, and the Notification Service, and potentially at Kafka’s consumer lag metrics. In the monolith, you had one stack trace.

Event schema changes require more care. Changing the shape of the TitleWatched event requires coordinating with every consumer. You can’t just change a method signature and let the compiler tell you what broke.
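The point about the compiler can be made concrete with a small sketch. It assumes, purely for illustration, that an event arrives as a generic map of field names to values (roughly what you get after deserializing JSON), and it uses hypothetical names, WatchedFields and TitleWatchedReader. The consumer reads only the fields it needs, an approach often called a tolerant reader; that technique is my addition here, not something this chapter prescribes.

```java
import java.util.Map;

// The fields this consumer needs from a TitleWatched payload (hypothetical shape).
record WatchedFields(long userId, long titleId) {}

final class TitleWatchedReader {
    // Reads only the fields this consumer needs and ignores everything else,
    // so a producer can add new fields without breaking it.
    static WatchedFields read(Map<String, Object> payload) {
        return new WatchedFields(requiredLong(payload, "userId"),
                                 requiredLong(payload, "titleId"));
    }

    // A renamed or removed field is only detected here, at runtime,
    // when an event with the new shape actually arrives.
    private static long requiredLong(Map<String, Object> payload, String field) {
        Object value = payload.get(field);
        if (!(value instanceof Number number)) {
            throw new IllegalArgumentException("missing or non-numeric field: " + field);
        }
        return number.longValue();
    }
}
```

Additive changes pass through this reader untouched. Renaming userId to user_id compiles cleanly on both sides and fails only when the first new-format event is consumed, which is exactly why such changes need coordination.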

We’re accepting those costs because the system we’re building has clear organizational boundaries, documented scaling imbalances, and concrete deployment friction in the monolith that justified the decomposition. Not as an exercise. As the genuine outcome of running the monolith long enough to see those signals.

Throughout the rest of this book, we’ll build each of these services, wire them together, and handle every failure mode that shows up along the way. Chapter 2 starts with Domain-Driven Design: the formal tool for finding service boundaries that won’t turn into distributed monoliths six months later.

1.7 Common Mistakes

Decomposing before deployment pain is real. The most common mistake. A team reads about microservices, looks at their growing codebase, and decides to decompose proactively. But the three readiness signals aren’t there yet. The result: all the operational complexity of microservices with none of the organizational independence, because the team structure didn’t change and the deployment problems weren’t solved by decomposition.
Building a distributed monolith. Services that share a database aren’t microservices. They’re the worst of both worlds: network calls between services, plus all the coupling of a shared data model. Any service that runs a JOIN across another service’s tables is not actually a separate service. Enforce data ownership early. It’s far harder to untangle later.
Nano-services. A UserNameService that handles only the user’s display name, a separate UserEmailService for their email address: these are too small. Every call that could have been an in-process method invocation is now a network round-trip. Services should encapsulate a complete subdomain, not a single responsibility in the single-responsibility-principle sense of a class. A rough heuristic for services that own relational data: if the service has fewer than three or four database tables, it’s probably too small.
Ignoring Conway’s Law. A team draws up a beautiful service diagram with eight well-bounded services, then assigns all eight to the same four engineers. Those engineers, rationally minimizing their own coordination overhead, start reaching across service boundaries. Direct database queries appear. Shared libraries grow until they’re almost a shared codebase. The boundaries dissolve. Architecture follows team structure. If you don’t change the team structure to match the target architecture, the architecture will drift back toward whatever the team’s communication patterns require.
Skipping the modular monolith step. Building microservices before you’ve built a modular monolith means building microservices before you understand your domain boundaries well enough to get them right. The modular monolith is the safe middle ground: you practice drawing boundaries, you maintain them under real feature development, and you discover which boundaries are wrong cheaply, before you’ve encoded them as separate deployable services with separate databases.
Treating microservices as a technical decision rather than an organizational one. The central point, repeated once more: microservices give you independent deployability for independent teams. If you don’t have independent teams, you don’t get the benefit. Microservices are an organizational architecture that happens to have a technical implementation. Choose the org structure first, then let the service boundaries follow.

1.8 Summary

The distributed systems tax is real and non-negotiable. Network latency, partial failures, operational complexity, eventual consistency, and distributed debugging: every microservices deployment pays all of these, always. Know the cost before you commit.
The monolith is a correct choice for most teams at most stages. Fewer than ten engineers, traffic that fits one server, a domain you don’t fully understand yet: all point to a monolith. The modular monolith gives you clean boundaries without the operational overhead.
Conway’s Law is a force of nature, not a guideline. Your architecture will converge toward your team communication structure over time. Use the inverse Conway maneuver deliberately: design the architecture you want, then organize teams to match it, one team per service.
Three concrete signals tell you when decomposition is worth it. Independent deployability blocked by cross-team coupling. Data autonomy violated by cross-service table access. Scaling mismatch where one component needs dramatically more capacity than others. See these signals; then decompose. Before you see them, stay monolithic.
CinéTrack’s eight services each own a clear subdomain with its own data, its own rate of change, and its own scaling characteristics. The decomposition came from evidence, not preference.
Next: Chapter 2 maps CinéTrack’s domain using Domain-Driven Design to find the right service boundaries, and gives you the tools to do the same for your own system.