← Back to Blog

Database Partitioning: The Decision You Can't Undo

Range vs hash partitioning, hot spots, and the re-partitioning trap. Partitioning looks like a scaling win until you find out you cannot undo the choice.

We partitioned too early. The table had about fifty million rows, which sounds like a lot until you realize PostgreSQL handles billions without breaking a sweat. But we were worried about growth, so we implemented range partitioning by date. One partition per month. It seemed clean. Logical. Future-proof.

Six months later, we discovered the problem. Our query patterns did not follow the partition key. We were looking up records by user ID, by status, by a dozen other dimensions. But the database was organized by date. So the planner scanned every partition, every time, because it had no way to eliminate them. We had turned a single table scan into thirty partition scans. Performance got worse, not better.

Here is the thing about partitioning: it is a physical storage decision that masquerades as a logical one. Once you commit to a partition scheme, your data lives in those buckets. Moving it means rewriting the table. For a table with hundreds of millions of rows and constant writes, that rewrite takes your application offline for hours. Maybe days.

The decision you make on day one determines your constraints for years. Pick range partitioning by date, and every query better filter on date or you are scanning the world. Hash partitioning costs you ordering: time-series queries need contiguous ranges and you lose them. With list partitioning, you have to know your categories upfront because adding new ones requires DDL.

We tried to fix it. The migration plan was: create a new table with different partitioning, dual-write to both, backfill historical data, switch the reads, drop the old table. Classic. It took three months. Three months of maintaining two write paths, verifying consistency scripts, and explaining to leadership why the database change was still not done.

And that assumes you can even take the downtime. If you are running a 24/7 service with customers worldwide, there is no good time for a multi-hour rewrite. You end up building elaborate online migration schemes: triggers, shadow tables, cutover scripts. The complexity of fixing your partitioning decision rivals the complexity of building the feature in the first place.

So when should you partition? Almost never at the beginning. PostgreSQL, MySQL, modern databases handle surprising scale on single tables with proper indexing. We have tables with a billion rows that perform fine because the queries hit the right indexes and only fetch what they need. The database is not the bottleneck. The queries are.

Partition when you have a specific, measurable problem. Query times are degrading and the explain plan shows it scanning too much dead space. Vacuum or maintenance operations take long enough to matter. Storage costs climb because cold data sits next to hot data and you cannot archive efficiently. Not before. Definitely not because you think you might need it someday.

And when you do partition, think hard about the key. The partition key should match your primary query pattern. If you query by user ID, partition by user ID. For date-bounded reads on recent data, partition by date. The planner can only eliminate partitions when the query includes the partition key with an equality or range filter. Missing that key means scanning everything.

Range partitioning is the most common choice for time-series data. One partition per month, per week, whatever makes sense. The benefit is partition elimination on date queries and easy archival: drop old partitions when they expire. The cost is uneven distribution. Last month gets all the writes. Older partitions sit cold. Hot spots emerge.

Hash partitioning spreads the load evenly. Hash of the key modulo partition count determines where the row lives. No hot spots, but you lose ordering. Date ranges span all partitions. Time-series queries become expensive. You cannot easily archive by dropping old partitions because they are scattered across all hash buckets.

There is also list partitioning for discrete categories, and composite approaches that nest strategies. Each option piles on complexity, planner overhead, and harder schema changes. Free, none of them are.

We ended up with hash partitioning by user ID. It fit our actual access patterns. Queries filter by user, the planner eliminates all but one partition, performance is fine. But getting there cost us six months of engineering time, false starts, and one very stressful cutover weekend.

The lesson is simple. Partitioning is not a performance optimization. It is an architectural commitment. Like choosing a service boundary or a data model, it shapes what you can do for the life of the application. Treat it with the same respect. Measure first. Prove you need it. Then choose the strategy that fits your real queries, not the ones you imagine you might have.

If you are considering partitioning, ask yourself: what is the specific query that is slow? Can you fix it with an index? Can you archive old data? Can you denormalize? Those are reversible. Partitioning is not. Once the data is split across tables, unsplitting it is a project.

We should have waited. Fifty million rows was not the problem we thought it was. The indexes we added would have bought us another year of growth. By the time we actually needed partitioning, we would have understood our query patterns. We would have chosen the right strategy from the start. The migration would have been avoided entirely.

There is a saying: premature optimization is the root of all evil. Premature partitioning is worse. It is premature optimization that requires a migration to undo. Do not be us. Measure, prove, then partition.

Share
X LinkedIn HN
UI

Umur Inan

Principal Software Engineer

Backend engineer focused on JVM systems, distributed architecture, and the failure modes that only show up in production. I write about what I learn building and breaking things at scale.

👁 0 4 min read

Comments (0)