Database Migrations | Zero-Downtime Strategies
Manage database migrations safely with versioning, testing, and rollback strategies. GitScrum tracks migrations and keeps data safe during schema evolution.
6 min read
Database migrations are the riskiest part of deployment. They touch data, can't always be undone, and affect running applications. Good migration management keeps data safe while enabling evolution. Poor management leads to outages and data loss.
Migration Safety
| Risk | Mitigation |
|---|---|
| Data loss | Backups, tested rollback |
| Downtime | Zero-downtime patterns |
| Corruption | Staging testing |
| Performance | Index analysis |
Migration Basics
Version Control
MIGRATION MANAGEMENT
ββββββββββββββββββββ
VERSION CONTROLLED:
βββββββββββββββββββββββββββββββββββββ
migrations/
βββ 001_create_users.sql
βββ 002_add_email_to_users.sql
βββ 003_create_orders.sql
βββ 004_add_index_on_orders.sql
βββ Sequential, tracked
MIGRATION FILE:
βββββββββββββββββββββββββββββββββββββ
-- Migration: 002_add_email_to_users
-- Created: 2024-01-15
-- Author: developer@example.com
-- Up
ALTER TABLE users ADD COLUMN email VARCHAR(255);
-- Down
ALTER TABLE users DROP COLUMN email;
TOOLS:
βββββββββββββββββββββββββββββββββββββ
βββ Flyway (SQL migrations)
βββ Liquibase (XML/SQL)
βββ Laravel Migrations (PHP)
βββ Alembic (Python)
βββ Knex (Node.js)
βββ Rails Migrations (Ruby)
βββ Framework-specific tools
EXECUTION:
βββββββββββββββββββββββββββββββββββββ
βββ Run pending migrations
βββ Track executed migrations
βββ Never modify executed migrations
βββ New changes = new migration
βββ Immutable history
Zero-Downtime Patterns
Safe Changes
ZERO-DOWNTIME MIGRATIONS
ββββββββββββββββββββββββ
SAFE OPERATIONS:
βββββββββββββββββββββββββββββββββββββ
Always safe:
βββ Add new table
βββ Add nullable column
βββ Add index (concurrent)
βββ Add column with default
βββ Additive changes
DANGEROUS OPERATIONS:
βββββββββββββββββββββββββββββββββββββ
Need careful handling:
βββ Remove column
βββ Rename column
βββ Change column type
βββ Add NOT NULL constraint
βββ Require multi-step approach
βββ Never in single migration
COLUMN RENAME PATTERN:
βββββββββββββββββββββββββββββββββββββ
Goal: Rename 'name' to 'full_name'
Step 1: Add new column
ALTER TABLE users ADD COLUMN full_name VARCHAR(255);
Step 2: Deploy code that writes to both
old_column and new_column
Step 3: Backfill data
UPDATE users SET full_name = name WHERE full_name IS NULL;
Step 4: Deploy code that reads from new_column
Step 5: Stop writing to old column
Step 6: Remove old column (later)
ALTER TABLE users DROP COLUMN name;
Timeline:
βββ Migration 1 (add column)
βββ Deploy 1 (write both)
βββ Migration 2 (backfill)
βββ Deploy 2 (read new)
βββ Deploy 3 (write new only)
βββ Migration 3 (drop old)
βββ Multiple deploys, safe change
ADDING NOT NULL:
βββββββββββββββββββββββββββββββββββββ
Goal: Make email required
Step 1: Add column nullable
ALTER TABLE users ADD COLUMN email VARCHAR(255);
Step 2: Deploy code requiring email
Step 3: Backfill existing rows
UPDATE users SET email = 'unknown@example.com'
WHERE email IS NULL;
Step 4: Add constraint
ALTER TABLE users
ALTER COLUMN email SET NOT NULL;
Testing Migrations
Before Production
MIGRATION TESTING
βββββββββββββββββ
STAGING TESTING:
βββββββββββββββββββββββββββββββββββββ
Requirements:
βββ Production-like data volume
βββ Realistic data patterns
βββ Same database version
βββ Test migration time
βββ Test rollback
βββ Production simulation
CHECKLIST:
βββββββββββββββββββββββββββββββββββββ
Before production:
βββ β Tested in staging
βββ β Migration time acceptable
βββ β Rollback tested
βββ β Backup taken
βββ β Maintenance window if needed
βββ β Team on standby
βββ Ready to execute
PERFORMANCE TESTING:
βββββββββββββββββββββββββββββββββββββ
For large tables:
βββ Estimate migration time
βββ Test on production-size data
βββ Check lock duration
βββ Plan for table size
βββ Consider batching
βββ No surprises
EXAMPLE:
βββββββββββββββββββββββββββββββββββββ
Table: orders (50M rows)
Test results:
βββ Add column: 45 seconds
βββ Backfill: 15 minutes (batched)
βββ Add index: 3 minutes (concurrent)
βββ Total: 20 minutes
βββ Plan: Low-traffic window
βββ Prepared
Rollback Strategy
Recovery Plan
ROLLBACK STRATEGY
βββββββββββββββββ
REVERSIBLE MIGRATIONS:
βββββββββββββββββββββββββββββββββββββ
Write down migration:
-- Up
ALTER TABLE users ADD COLUMN email VARCHAR(255);
-- Down
ALTER TABLE users DROP COLUMN email;
βββ Every migration has reverse
βββ Tested in staging
βββ One command to undo
βββ Easy rollback
NON-REVERSIBLE:
βββββββββββββββββββββββββββββββββββββ
Some migrations can't reverse:
βββ DROP COLUMN (data gone)
βββ Truncate data
βββ Type changes with data loss
βββ Need backup restore
βββ Special handling
FOR NON-REVERSIBLE:
βββββββββββββββββββββββββββββββββββββ
βββ Document explicitly
βββ Backup before running
βββ Test restore procedure
βββ Manual rollback script
βββ Extra caution
βββ Plan B ready
BACKUP STRATEGY:
βββββββββββββββββββββββββββββββββββββ
Before migration:
βββ Full backup or snapshot
βββ Verify backup works
βββ Know restore time
βββ Restore procedure documented
βββ Point-in-time recovery enabled
βββ Safety net
Large Table Migrations
Handling Scale
LARGE TABLE MIGRATIONS
ββββββββββββββββββββββ
PROBLEMS:
βββββββββββββββββββββββββββββββββββββ
Large tables:
βββ Long lock times
βββ Slow operations
βββ Replication lag
βββ Connection timeouts
βββ Need special handling
βββ Can't do simple ALTER
BATCHING:
βββββββββββββββββββββββββββββββββββββ
Backfill in batches:
DO $
DECLARE
batch_size INT := 10000;
last_id INT := 0;
BEGIN
LOOP
UPDATE users
SET full_name = name
WHERE id > last_id
AND id <= last_id + batch_size
AND full_name IS NULL;
IF NOT FOUND THEN EXIT; END IF;
last_id := last_id + batch_size;
COMMIT;
PERFORM pg_sleep(0.1); -- Throttle
END LOOP;
END $;
βββ Process in chunks
βββ Commit between batches
βββ Allow replication to catch up
βββ Reduce lock duration
βββ Controlled execution
ONLINE SCHEMA CHANGES:
βββββββββββββββββββββββββββββββββββββ
Tools:
βββ pt-online-schema-change (MySQL)
βββ gh-ost (MySQL)
βββ pg_repack (PostgreSQL)
βββ No locking changes
βββ Production-safe
CONCURRENT INDEX:
βββββββββββββββββββββββββββββββββββββ
PostgreSQL:
CREATE INDEX CONCURRENTLY idx_users_email
ON users(email);
βββ Doesn't lock table
βββ Takes longer
βββ Can fail (retry)
βββ Production-safe
βββ Use for large tables
GitScrum Integration
Tracking Migrations
GITSCRUM FOR MIGRATIONS
βββββββββββββββββββββββ
MIGRATION TASKS:
βββββββββββββββββββββββββββββββββββββ
βββ Label: database, migration
βββ Linked to feature
βββ Checklist for steps
βββ Deployment coordination
βββ Visible tracking
TASK EXAMPLE:
βββββββββββββββββββββββββββββββββββββ
Task: "Add email column to users"
Checklist:
βββ β Write migration
βββ β Write rollback
βββ β Test in staging
βββ β Time migration on prod-size data
βββ β Backup production
βββ β Execute migration
βββ β Verify success
βββ Complete process
RUNBOOK:
βββββββββββββββββββββββββββββββββββββ
NoteVault:
βββ Migration runbook
βββ Common procedures
βββ Rollback procedures
βββ Emergency contacts
βββ Documented process
Best Practices
For Migrations
Anti-Patterns
MIGRATION MISTAKES:
β Untested migrations
β No rollback plan
β Rename in single step
β Remove column with DROP
β Long locks on production
β No backups
β Editing executed migrations
β Assuming it will work