
Premature Abstraction Is Worse Than Duplication

DRY is right, but not yet. Three identical code blocks are better than one wrong abstraction that fights you for months. Wait until the pattern is obvious.

Early in my career, I was reviewing a pull request from a teammate. He had two functions that were almost identical. They both fetched a user, checked permissions, then called a slightly different service method. Each returned a formatted response. About 80% of the code was the same. I left a comment: "These should be one function with a parameter."

He took my suggestion and merged the change. Three months later, that function had nine parameters, four boolean flags, three conditional branches in the middle, and a 40-line switch statement. It handled user creation, user updates, admin overrides, batch imports, and a guest checkout flow that had nothing to do with users. Nobody could modify it without breaking something else. Nobody could read it without tracing through every branch to understand which path their change would take.

That function was my fault. I asked for it. I saw duplication and reached for DRY like a reflex. And I turned two simple, readable functions into one unmaintainable monster.

DRY Is a Principle, Not a Command

Don't Repeat Yourself is one of the first things we learn as programmers. It sounds absolute. Repetition is waste. Duplication is a bug waiting to happen. If you change one copy and forget the other, you have an inconsistency. The logic is clean and hard to argue with.

The problem is that DRY, as most of us apply it, asks the wrong question. It asks "do these two pieces of code look the same?" when it should ask "do these two pieces of code change for the same reasons?"

Two blocks of code can look identical today and evolve in completely different directions tomorrow. The user creation flow and the guest checkout flow happened to share the same steps when I reviewed that PR. But they served different purposes, had different stakeholders, and would diverge the moment product requirements changed. And they did diverge. Immediately. But by then they were welded together in a single function, and every change to one flow risked breaking the other.

Sandi Metz put it better than I can: "duplication is far cheaper than the wrong abstraction." I didn't understand that sentence until I lived it. Now I think about it almost every day.

How Wrong Abstractions Happen

Nobody sets out to build a bad abstraction. They happen gradually, and the process always follows the same pattern.

Step 1: You see two similar things. Two API endpoints that both validate input, call a service, and return a response. Two React components that both fetch data and render a list. Two SQL queries that both join the same three tables.

Step 2: You extract the common parts. You create a shared function, a base class, a generic component, a utility method. It takes the parts that differ as parameters. It's clean. It's elegant. You're proud of it.

Step 3: A third case shows up. It's mostly like the first two, but with a twist. The shared function almost works, but not quite. You add a parameter. Maybe a boolean flag like includeArchived or skipValidation. It's a small change. The abstraction still feels right.

Step 4: Cases four, five, and six show up. Each one is slightly different. Each one needs another flag, another parameter, another conditional branch. The function signature grows. Internal logic becomes a maze of if-else chains. Simplicity is gone, but you can't go back because everything depends on this shared function now.

Step 5: Nobody wants to touch it. New team members look at the function and can't understand what it does. Bug fixes take three times longer because you have to trace every code path. The abstraction that was supposed to save time is now the biggest time sink on the team.

I've watched this happen at least a dozen times across different projects, different teams, different languages. The pattern is always the same. It starts with good intentions and ends with a function that everyone hates and nobody can replace.

The Boolean Flag Smell

Here's a heuristic I've developed over the years. If your shared function takes a boolean parameter that changes its behavior, you probably have two functions pretending to be one.

public OrderResponse processOrder(Order order, boolean isGuestCheckout) {
    if (isGuestCheckout) {
        // skip user validation
        // use temporary session
        // don't save to order history
    } else {
        // validate user exists
        // use authenticated session
        // save to order history
    }
    // ... shared payment logic ...
    if (isGuestCheckout) {
        // send guest confirmation email
    } else {
        // send registered user email
        // update loyalty points
    }
}

This function does two completely different things depending on a boolean. The "shared" part in the middle is the payment logic, which could be extracted into its own function. Instead, we've jammed two workflows into one method and toggled between them with a flag.

Two separate functions, processGuestOrder and processRegisteredOrder, would each be half the length, twice as readable, and independently modifiable. The shared payment logic becomes a helper they both call. You get code reuse without the coupling.
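Here is a minimal sketch of that split. The Order, PaymentReceipt, and OrderResponse types, the chargePayment helper, and the string "steps" standing in for real side effects are all hypothetical, invented for illustration rather than taken from a real codebase.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration only.
record Order(String id, String email, String userId, long amountCents) {}
record PaymentReceipt(String orderId, long amountCents) {}
record OrderResponse(String orderId, String kind, List<String> steps) {}

final class OrderFlows {
    private OrderFlows() {}

    // The one genuinely shared piece. Both flows call it; neither knows about the other.
    static PaymentReceipt chargePayment(Order order) {
        if (order.amountCents() <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        return new PaymentReceipt(order.id(), order.amountCents());
    }

    static OrderResponse processGuestOrder(Order order) {
        List<String> steps = new ArrayList<>();
        steps.add("use temporary session");
        PaymentReceipt receipt = chargePayment(order);
        steps.add("charged " + receipt.amountCents());
        steps.add("send guest confirmation email");
        return new OrderResponse(order.id(), "guest", steps);
    }

    static OrderResponse processRegisteredOrder(Order order) {
        if (order.userId() == null) {
            throw new IllegalArgumentException("registered orders need a user id");
        }
        List<String> steps = new ArrayList<>();
        steps.add("validate user " + order.userId());
        steps.add("use authenticated session");
        PaymentReceipt receipt = chargePayment(order);
        steps.add("charged " + receipt.amountCents());
        steps.add("save to order history");
        steps.add("send registered user email");
        steps.add("update loyalty points");
        return new OrderResponse(order.id(), "registered", steps);
    }
}
```

Each flow now reads top to bottom with no branching on caller type. Adding loyalty tiers touches only processRegisteredOrder, and a change to payment handling lands in chargePayment for both.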

Every time I see a boolean parameter in a function signature, I ask: "Would this be clearer as two functions?" The answer is yes more often than you'd expect.

The Inheritance Trap

Class inheritance is the most seductive form of premature abstraction, and the hardest to undo.

I worked on a Java codebase that had a BaseController class. Every API controller extended it. It started small: shared error handling, common authentication checks, standard response formatting. Reasonable stuff.

Over two years, BaseController grew to 800 lines. It had methods for pagination, caching, rate limiting, audit logging, feature flags, A/B test assignment, and request tracing. Some controllers used all of these features. Most used two or three. But every controller inherited all of them.

When we needed to change the caching behavior for one endpoint, we had to make sure the change didn't affect the other 40 controllers that inherited the same method. When a new engineer joined and asked "how does authentication work in this controller?", the answer was "look at the base class, but also the base class's base class, and also this interface it implements." The inheritance hierarchy was four levels deep.

Composition would have solved this. Instead of one base class with everything, you'd have small, focused objects: an AuthHandler, a Paginator, a RateLimiter. Each controller uses only what it needs. No inheritance chain to trace. No accidental coupling between unrelated behaviors.
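A rough sketch of what that could look like, assuming deliberately simplified AuthHandler and Paginator classes (a RateLimiter would follow the same shape). The controllers and method signatures are hypothetical, not the ones from that codebase.

```java
import java.util.List;
import java.util.Set;

// Hypothetical collaborators: each does one job and knows nothing about the others.
final class AuthHandler {
    private final Set<String> validTokens;

    AuthHandler(Set<String> validTokens) {
        this.validTokens = validTokens;
    }

    boolean isAuthenticated(String token) {
        return token != null && validTokens.contains(token);
    }
}

final class Paginator {
    private final int pageSize;

    Paginator(int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.pageSize = pageSize;
    }

    // Returns the zero-based page, or an empty list when the index is past the end.
    <T> List<T> page(List<T> items, int pageIndex) {
        long start = (long) Math.max(pageIndex, 0) * pageSize;
        int from = (int) Math.min(start, items.size());
        int to = Math.min(from + pageSize, items.size());
        return items.subList(from, to);
    }
}

// Needs auth and pagination, so it is handed exactly those two.
final class ProductController {
    private final AuthHandler auth;
    private final Paginator paginator;

    ProductController(AuthHandler auth, Paginator paginator) {
        this.auth = auth;
        this.paginator = paginator;
    }

    List<String> list(String token, List<String> products, int pageIndex) {
        if (!auth.isAuthenticated(token)) {
            throw new SecurityException("unauthenticated");
        }
        return paginator.page(products, pageIndex);
    }
}

// Needs neither, so it depends on nothing and inherits nothing.
final class HealthController {
    String status() {
        return "ok";
    }
}
```

The constructor of each controller doubles as documentation: it lists every shared behavior that controller relies on, so "how does authentication work here?" has a one-class answer.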

I've seen this pattern in every language that supports inheritance. Java's AbstractBaseService. Python's mixin chains. Ruby's module includes stacked six deep. The language changes, but the mistake is the same: using inheritance to share code when all you really need is a function call.

When to Abstract

So if early abstraction is dangerous, when is the right time? I use the Rule of Three, but with a caveat.

The classic Rule of Three says: the first time you write something, just write it. The second time, wince at the duplication but leave it. The third time, extract the abstraction.

My version adds a question: are all three cases evolving in the same direction? If the three pieces of duplicate code are in the same domain, serve the same purpose, and change for the same reasons, abstract them. If they happen to look similar but serve different parts of the product, leave them alone.

Here's a real example. I had three API endpoints that all did the same input validation: check that the request body isn't null, validate that required fields are present, parse dates from strings. Same logic, copy-pasted three times. These were all validation. They'd change together if the validation rules changed. I extracted a shared validator. Two years later, it's still clean and unchanged.
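A validator along those lines might look roughly like this. It is a sketch, not the original code: the RequestValidator name, the Map-based request body, and the choice of ISO-8601 dates are all assumptions made for the example.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical shared validator; the body is modeled as a flat String map for simplicity.
final class RequestValidator {
    private RequestValidator() {}

    // Returns a list of problems. An empty list means the body passed.
    static List<String> validate(Map<String, String> body,
                                 List<String> requiredFields,
                                 List<String> dateFields) {
        List<String> errors = new ArrayList<>();
        if (body == null) {
            errors.add("request body is missing");
            return errors;
        }
        for (String field : requiredFields) {
            String value = body.get(field);
            if (value == null || value.isBlank()) {
                errors.add("missing required field: " + field);
            }
        }
        for (String field : dateFields) {
            String value = body.get(field);
            if (value == null || value.isBlank()) {
                continue; // absence is the required-field check's job
            }
            try {
                LocalDate.parse(value); // assumes ISO-8601 dates such as 2024-03-01
            } catch (DateTimeParseException e) {
                errors.add("invalid date in field: " + field);
            }
        }
        return errors;
    }
}
```

Note what the signature does not contain: no flags, and no knowledge of which endpoint is calling. The parts that differ between endpoints are plain data (which fields, which dates), which is a reasonable sign the three cases really were the same thing.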

Same project, different situation. I had three endpoints that all fetched a user, checked a permission, and returned a 403 if unauthorized. Same pattern, same code. But one was for a public API, one was for an internal admin tool, and one was for a batch processing pipeline. The permission models for these three contexts were about to diverge in fundamental ways, and I could already see it in the product roadmap. I left the duplication. Six months later, the public API had OAuth scopes, the admin tool had role-based access, and the batch pipeline had service account tokens. If I'd abstracted the permission check, I would have been ripping it apart six months later.
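To make the divergence concrete, here is a hypothetical sketch of where three such checks could end up. The class names, the Role enum, and the parameter shapes are invented for illustration; the point is only that the inputs no longer share a signature.

```java
import java.util.Set;

// Public API: an OAuth-style check on the scopes granted to the caller's token.
final class PublicApiAccess {
    private PublicApiAccess() {}

    static boolean isAllowed(Set<String> grantedScopes, String requiredScope) {
        return requiredScope != null && grantedScopes.contains(requiredScope);
    }
}

// Admin tool: a role-based check where higher roles include the lower ones.
final class AdminToolAccess {
    enum Role { VIEWER, EDITOR, ADMIN } // declared from least to most privileged

    private AdminToolAccess() {}

    static boolean isAllowed(Role actual, Role minimum) {
        return actual.compareTo(minimum) >= 0;
    }
}

// Batch pipeline: a service-account token checked against a per-job allowlist.
final class BatchPipelineAccess {
    private BatchPipelineAccess() {}

    static boolean isAllowed(String serviceToken, Set<String> tokensAllowedForJob) {
        return serviceToken != null && tokensAllowedForJob.contains(serviceToken);
    }
}
```

A single shared checkPermission would have needed to accept scopes, roles, and service tokens at once, with a flag or type switch to pick between them. Three small functions carry the same logic with no coupling.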

The question isn't "do these look the same?" It's "will these stay the same?"

Signs You Abstracted Too Early

If you're not sure whether an existing abstraction is premature, here are the symptoms I look for.

The function takes more than four parameters. Not a hard rule, but a signal. Long parameter lists usually mean the function is trying to handle too many different cases.

You have boolean flags that change behavior. As I mentioned above, this almost always means you have multiple functions crammed into one.

New requirements force changes to shared code. If every new feature touches the same shared utility, that utility is a bottleneck, not a convenience. Good abstractions are extended, not modified.

You need to understand the callers to understand the function. If you can't read a function in isolation and know what it does, if you need to check who calls it and with what arguments to understand its behavior, the abstraction is leaking.

Tests for the shared code are complicated. If testing the abstraction requires elaborate setup, heavy mocking, or a test matrix covering every combination of parameters, the abstraction is doing too much. Simple functions have simple tests.
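To put a number on that last symptom: if the flags are independent, n boolean flags give 2^n distinct paths to cover. The small sketch below (FlagMatrix is a hypothetical helper, not a library class) enumerates them. For the function from the opening story, with its four boolean flags, that is 16 combinations before the other five parameters are even considered.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that lists every on/off combination of n independent flags.
final class FlagMatrix {
    private FlagMatrix() {}

    static List<boolean[]> combinations(int flagCount) {
        if (flagCount < 0 || flagCount > 20) {
            throw new IllegalArgumentException("flagCount must be between 0 and 20");
        }
        List<boolean[]> rows = new ArrayList<>();
        int total = 1 << flagCount; // 2^flagCount
        for (int mask = 0; mask < total; mask++) {
            boolean[] row = new boolean[flagCount];
            for (int bit = 0; bit < flagCount; bit++) {
                row[bit] = (mask & (1 << bit)) != 0;
            }
            rows.add(row);
        }
        return rows;
    }
}
```

Splitting one four-flag function into focused functions does not remove the underlying cases, but each function then has only its own handful of paths to test.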

The Cost of Duplication Is Lower Than You Think

We treat duplication like a cardinal sin, but in practice, it's cheap. If you have three copies of the same validation logic and you need to update it, you update it three times. That takes an extra five minutes. Modern IDEs can find all usages. Grep can find all copies. It's annoying, not dangerous.

A wrong abstraction costs weeks. You spend time understanding it. Time working around it. Time in code reviews arguing about it. Eventually you spend time replacing it, which means untangling every caller and testing every code path. I've seen wrong abstractions survive for years because the cost of removing them was higher than the cost of suffering through them.

Five minutes of copy-paste versus weeks of refactoring. The math is clear. But it doesn't feel that way in the moment, because duplication feels wrong and abstraction feels smart. That's the trap.

What I Do Now

When I see duplication in code I'm writing or reviewing, I ask myself a series of questions before reaching for the extract-method refactoring.

How many times is this duplicated? If it's twice, I almost always leave it. Two is not a pattern. Two is a coincidence.

Do these cases serve the same purpose? If the duplicate code exists in different domains, different features, or different layers of the application, I leave it. Similar code in different contexts is not the same as repeated code in the same context.

Will these cases change together? If a change to one copy should always be applied to the other copies, that's a strong signal for abstraction. If they might diverge, duplication is cheaper.

Can I name the abstraction? If I can give the shared function a clear, specific name that describes what it does without using words like "handle," "process," or "common," it's probably a real abstraction. If the best name I can come up with is handleCommonLogic or processShared, I don't have an abstraction. I have a grab bag.

I'm not against abstraction. Abstraction is what makes software engineering possible. I'm against premature abstraction, the kind that happens before you understand the problem well enough to know where the real boundaries are. Wait longer than feels comfortable. Let the duplication sting a little. The right abstraction will reveal itself when you have enough examples to see the actual pattern, not the pattern you imagined after seeing two cases.

Your future self will thank you for the duplication. Your future self will curse you for the wrong abstraction. Trust me on this one. I've been both of those future selves.


Umur Inan

Principal Software Engineer

Backend engineer focused on JVM systems, distributed architecture, and the failure modes that only show up in production. I write about what I learn building and breaking things at scale.
