You've seen the error. PG::ConnectionBad: FATAL: sorry, too many clients already. Your app is down, Postgres is refusing new connections, and you're staring at a number in database.yml wondering how you got here.
Here's what most people miss: you probably did the math right. You just did it for the wrong moment in time.
The Math Feels Simple
Every Rails process holds a pool of database connections. You configure this with pool in database.yml, and it caps how many connections a single process can open. The back-of-the-napkin calculation looks like this:
total connections = number of processes × pool size
So if your Postgres max_connections is 100, and you have a handful of processes each with a pool of 5, you're fine. You stay under the limit, you ship it, you move on.
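As a minimal sketch, the napkin formula can be written as a small Ruby helper. The method name and the example of 10 processes are illustrative assumptions, not values from a real configuration.

```ruby
# Napkin math: each process can open at most `pool_size` connections.
# `total_connections` is a hypothetical helper name used for illustration.
def total_connections(process_count:, pool_size:)
  process_count * pool_size
end

max_connections = 100 # the Postgres limit used in the example above
planned = total_connections(process_count: 10, pool_size: 5)

puts "planned ceiling: #{planned} of #{max_connections}"
puts "fits: #{planned <= max_connections}"
```

Note that this gives a ceiling, not a live count: it is the number of connections the processes are allowed to open, which is the figure that matters when everything gets busy at once.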
The problem is that calculation describes a snapshot — a single, stable moment when everything is running and nothing is changing. Production is rarely that moment.
Where the Real Pressure Comes From
Connection pressure doesn't usually come from steady-state traffic. It comes from transitions: deployments, restarts, process crashes, autoscaling events. These are the moments when your process count isn't the number you planned for — it's temporarily higher.
The classic trap is a rolling deployment. You spin up new processes to replace old ones. For a window of time (sometimes just seconds, sometimes longer depending on your setup) both generations are alive simultaneously. Your connection ceiling doubles. If you were running close to the limit already, that's enough to tip you over.
Sidekiq makes this worse. A Sidekiq process with high concurrency (say, 20 threads) can hold 20 connections simultaneously if all threads are doing database work at once. Multiply that by a few workers and the numbers climb fast.
What Happened to Us
We had a Postgres instance with max_connections set to 100. Our setup was:
- 1 web server (Puma) with a connection pool of 30
- 2 Sidekiq workers, each with a pool of 30
That's up to 90 connections at steady state if every pool fills. Ten to spare. Felt fine.
Then we deployed at the worst possible moment.
We were in the middle of onboarding a new client, which meant both Sidekiq workers were running hot. Imports, notifications, data processing jobs — the queues were fuller than they'd ever been in production. Every thread was active and holding a connection.
On a normal day, a deploy would have been fine. Our Sidekiq workers rarely saturate their pools at the same time — most threads are idle or waiting on external calls, not the database. But this wasn't a normal day.
Our platform spins up new processes before draining the old ones — a sensible strategy to avoid downtime. But during the deploy window we had:
- 2 web server processes (old + new): 60 connections
- 4 Sidekiq workers (old pair + new pair), all running at full tilt: 120 connections
That's 180 connections competing for 100 slots. Postgres started rejecting connections. The new processes couldn't boot cleanly. The deploy failed and the app went down.
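The same arithmetic, written out as a short Ruby sketch. The pool sizes and process counts are the ones from this incident; the helper name and the `generations` parameter are hypothetical, introduced only to model old and new processes overlapping.

```ruby
# Pool sizes and process counts from the setup described above.
PROCESSES = {
  puma:    { count: 1, pool: 30 },
  sidekiq: { count: 2, pool: 30 },
}.freeze

MAX_CONNECTIONS = 100

# Worst case: every process fills its pool. `generations` is how many
# copies of the fleet are alive at once (1 normally, 2 mid-deploy).
def worst_case_connections(processes, generations: 1)
  processes.sum { |_name, p| p[:count] * p[:pool] * generations }
end

steady = worst_case_connections(PROCESSES)
deploy = worst_case_connections(PROCESSES, generations: 2)

puts "steady state:  #{steady} (headroom #{MAX_CONNECTIONS - steady})"
puts "deploy window: #{deploy} (over by #{deploy - MAX_CONNECTIONS})"
```

Running the check with `generations: 2` before the deploy would have shown the overshoot of 80 connections that the steady-state number hides.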
The math we'd done was correct — for normal operation on a quiet day. We'd never accounted for a deploy landing during peak load.
How We Fixed It
There are a few levers here, and the right answer is usually to pull more than one.
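Right-size your pool. The default pool of 5 in Rails is actually fine for most apps. A pool of 30 per process is almost always too high, because threads rarely all hit the database at the same time. Bring this number down and you create headroom for transitions. Just don't go below the number of threads that genuinely need a connection at once: a thread that can't check one out waits, and eventually raises ActiveRecord::ConnectionTimeoutError.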
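One common way to keep the pool from drifting upward is to derive it from the number of threads the process runs instead of hard-coding it. The database.yml that Rails generates has traditionally done this by reading RAILS_MAX_THREADS with a fallback of 5. The sketch below mirrors that logic in plain Ruby; the helper name is hypothetical, and for Sidekiq the relevant thread count would be its concurrency setting rather than this variable.

```ruby
# Hypothetical helper: size the pool from the thread count, falling back to 5.
# `env` defaults to the real ENV but accepts any Hash for experimentation.
def pool_size_for(env = ENV)
  Integer(env.fetch("RAILS_MAX_THREADS", "5"))
end

# Worst-case connections for the three processes in this post
# (1 Puma, 2 Sidekiq) if each used a pool of 5 instead of 30.
process_count = 3
pool = pool_size_for({ "RAILS_MAX_THREADS" => "5" })

puts "pool per process:      #{pool}"
puts "steady state ceiling:  #{process_count * pool}"
puts "deploy window ceiling: #{process_count * pool * 2}"
```

With these assumed numbers the deploy window tops out at 30 connections, well under a limit of 100.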
Use a connection pooler. Tools like PgBouncer sit between your app and Postgres and multiplex connections. Your app thinks it has 100 connections; Postgres only sees 20. This is the most robust fix for connection pressure at scale.
Account for peak process count, not steady state. When you set max_connections, think about deploy windows, autoscaling events, and process restarts — not just the number of processes you run on a quiet Tuesday afternoon. A rough rule: plan for 2× your normal process count as your ceiling.
Set max_connections with headroom. Postgres also holds back a few connections for superusers (superuser_reserved_connections, 3 by default), so not all of max_connections is available to your app. Leaving 10–15% of your limit unallocated prevents edge cases from turning into outages.
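The last two rules can be combined into a rough budget check. This is a sketch under stated assumptions: the method names are hypothetical, the 2× multiplier and 15% margin are the rules of thumb from this post, and the default of 3 reserved connections matches the Postgres default for superuser_reserved_connections.

```ruby
# Connections the app may plan for after setting aside reserved slots
# and a safety margin. Integer math keeps the result exact.
def connection_budget(max_connections:, reserved: 3, headroom_percent: 15)
  usable = max_connections - reserved
  usable * (100 - headroom_percent) / 100
end

# True when the peak fleet (steady-state ceiling times the multiplier)
# still fits inside the budget.
def fits?(steady_connections:, peak_multiplier: 2, **budget_opts)
  steady_connections * peak_multiplier <= connection_budget(**budget_opts)
end

puts connection_budget(max_connections: 100)
puts fits?(steady_connections: 90, max_connections: 100) # pools of 30
puts fits?(steady_connections: 15, max_connections: 100) # pools of 5
```

By this measure the original setup was over budget before any deploy started: a limit of 100 leaves 82 plannable connections, and the steady-state ceiling alone was 90.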
The Takeaway
Too many Postgres connections is rarely a traffic problem. It's almost always a process lifecycle problem — something is creating more processes than you planned for, and they all hold connection pools.
Do the math, but do it for the worst moment, not the average one. And if you're regularly running close to your connection limit, PgBouncer is worth the afternoon it takes to set up. The deploy that took us down would have been a non-event with a pooler in front of Postgres.