For 24/7 SaaS projects, "let's stop for 10 minutes to migrate" is not an option. Here are four techniques we actually use to change schema without stopping the service.
1. Expand-contract
The most important pattern. For every schema change:
- Expand: add the new column/table without touching the old.
- Deploy code that writes to both.
- Backfill historical data.
- Deploy code that reads from the new one.
- Contract: drop the old.
Five steps and several deploys instead of one. Also zero downtime.
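The dual-write step is the one people get wrong most often. Here is a minimal sketch of it, assuming a hypothetical migration that splits an old full_name column into new first_name / last_name columns (all names are made up; a real implementation would write through your ORM or SQL layer):

```javascript
// Dual-write: every write updates BOTH the old and the new representation.
// The old column stays the source of truth until the read path switches.
function writeUser(row, fullName) {
  row.full_name = fullName;                 // old column: still authoritative
  const [first, ...rest] = fullName.split(" ");
  row.first_name = first;                   // new columns: kept in sync
  row.last_name = rest.join(" ");
  return row;
}

const row = writeUser({}, "Ada Lovelace");
console.log(row); // both the old and the new columns are populated
```

Because writes keep both shapes consistent, the backfill only has to cover rows written before this code was deployed.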
2. Lock-aware ALTER TABLE
In Postgres some ALTERs lock the whole table. Rules of thumb:
- ADD COLUMN ... NOT NULL with DEFAULT on Postgres < 11: heavy lock. On Postgres ≥ 11: light lock.
- ALTER COLUMN TYPE: often requires a full table rewrite. Avoid in production; prefer expand-contract.
- CREATE INDEX: always use CONCURRENTLY.
- ALTER TABLE with timeouts: set statement_timeout and lock_timeout so the system doesn't stall waiting for a lock.
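Put together, a lock-aware migration might look like this (table, column, and index names are invented for illustration):

```sql
-- Fail fast instead of queueing behind long-running transactions:
-- abandon the migration rather than stall the whole system.
SET lock_timeout = '2s';
SET statement_timeout = '15s';

-- Safe on Postgres >= 11: the default is stored as metadata, no table rewrite.
ALTER TABLE orders ADD COLUMN status text NOT NULL DEFAULT 'new';

-- Builds the index without blocking writes.
-- Note: CONCURRENTLY cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY idx_orders_status ON orders (status);
```

If the lock_timeout fires, the migration fails cleanly and can simply be retried at a quieter moment.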
3. Batch backfill
Never run a single UPDATE ... WHERE ... on a big table. Use a Node script that processes 1k-10k rows per batch, with pauses between batches:
let rows;
while ((rows = await fetchBatch()).length > 0) {
  await updateBatch(rows);
  await sleep(200); // breathe between batches
}
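For reference, a self-contained version of that loop, with sleep defined and the queries stubbed out against an in-memory array (fetchBatch, updateBatch, and the batch size are hypothetical stand-ins for real SELECT ... LIMIT / UPDATE queries):

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Stand-ins for the real queries against Postgres.
const pending = Array.from({ length: 25 }, (_, i) => ({ id: i + 1 }));
const BATCH_SIZE = 10;

async function fetchBatch() {
  return pending.splice(0, BATCH_SIZE); // next slice of unprocessed rows
}

async function updateBatch(rows) {
  for (const row of rows) row.backfilled = true;
}

async function backfill() {
  let processed = 0;
  let rows;
  while ((rows = await fetchBatch()).length > 0) {
    await updateBatch(rows);
    processed += rows.length;
    await sleep(10); // breathe: let replication and other queries catch up
  }
  return processed; // total rows backfilled
}
```

The key detail is checking .length on the result: an empty array is truthy in JavaScript, so a bare while (rows = await fetchBatch()) would loop forever.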
4. Feature flag
The new code ships behind a feature flag. If the backfill is incomplete or a bug shows up, you flip the flag off, with no Git rollback or redeploy.
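A minimal sketch of the flagged read path, reusing the hypothetical full_name split from earlier. The flag store here is a plain object; in production it would be a flag service or a DB table that can be flipped at runtime:

```javascript
// The flag guards only the READ path; writes already go to both schemas.
const flags = { "read-from-new-columns": false };

function getDisplayName(row) {
  if (flags["read-from-new-columns"]) {
    return `${row.first_name} ${row.last_name}`; // new schema
  }
  return row.full_name; // old schema, still the fallback
}

const row = { full_name: "A. Lovelace", first_name: "Ada", last_name: "Lovelace" };
console.log(getDisplayName(row)); // old path
flags["read-from-new-columns"] = true; // flip at runtime, no redeploy
console.log(getDisplayName(row)); // new path
```

Only once the flag has been on for a while with no incidents do you run the contract step and drop the old column.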
When NOT to do zero-downtime
For internal projects whose users work on a known schedule, a 5-minute nightly maintenance window costs far less than a full zero-downtime setup. Know when it is worth it.