8 min read • Guide 797 of 877
Feature Flags and Progressive Rollout
Feature flags separate deployment from release. GitScrum helps teams track flagged features and coordinate progressive rollouts.
Feature Flag Basics
Why Feature Flags
FEATURE FLAG BENEFITS:
┌─────────────────────────────────────────────────────────────┐
│ │
│ WITHOUT FLAGS: │
│ ────────────── │
│ Deploy = Release │
│ All users get feature immediately │
│ Rollback requires new deployment │
│ Big bang risk │
│ │
│ WITH FLAGS: │
│ ─────────── │
│ Deploy ≠ Release │
│ Enable when ready │
│ Disable instantly if issues │
│ Gradual rollout possible │
│ │
│ ─────────────────────────────────────────────────────────── │
│ │
│ FLAG USE CASES: │
│ │
│ RELEASE TOGGLES: │
│ Deploy incomplete feature, enable when done │
│ "Ship dark" - code in prod but off │
│ │
│ EXPERIMENT TOGGLES: │
│ A/B test features │
│ Compare metrics between variants │
│ │
│ OPS TOGGLES: │
│ Kill switch for problematic features │
│ Graceful degradation │
│ │
│ PERMISSION TOGGLES: │
│ Premium features for paying users │
│ Beta features for early adopters │
│ │
│ ─────────────────────────────────────────────────────────── │
│ │
│ CODE EXAMPLE: │
│ ┌─────────────────────────────────────────────────────────┐│
│ │ if (featureFlags.isEnabled('new-search')) { ││
│ │ return <NewSearchComponent />; ││
│ │ } ││
│ │ return <OldSearchComponent />; ││
│ └─────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────┘
Progressive Rollout
Rollout Strategy
PROGRESSIVE ROLLOUT STAGES:
┌─────────────────────────────────────────────────────────────┐
│ │
│ STAGE 1: INTERNAL (0.1%) │
│ ───────────────────────── │
│ • Development team only │
│ • Catch obvious issues │
│ • Duration: 1-2 days │
│ │
│ STAGE 2: EMPLOYEES (1%) │
│ ──────────────────────── │
│ • All company employees │
│ • Real-world testing │
│ • Duration: 1-3 days │
│ │
│ STAGE 3: BETA USERS (5%) │
│ ──────────────────────── │
│ • Opted-in early adopters │
│ • Collect feedback │
│ • Duration: 3-5 days │
│ │
│ STAGE 4: CANARY (10%) │
│ ───────────────────── │
│ • Random user sample │
│ • Monitor metrics │
│ • Duration: 2-3 days │
│ │
│ STAGE 5: PARTIAL (25%, 50%) │
│ ─────────────────────────── │
│ • Larger population │
│ • Confirm at scale │
│ • Duration: 1-2 days each │
│ │
│ STAGE 6: FULL ROLLOUT (100%) │
│ ──────────────────────────── │
│ • All users │
│ • Continue monitoring │
│ • Remove flag after stable │
│ │
│ ─────────────────────────────────────────────────────────── │
│ │
│ AT EACH STAGE: │
│ • Monitor error rates │
│ • Watch performance metrics │
│ • Collect user feedback │
│ • Ready to pause or rollback │
└─────────────────────────────────────────────────────────────┘
Rollout Tracking
TRACKING ROLLOUTS:
┌─────────────────────────────────────────────────────────────┐
│ │
│ FEATURE FLAG TASK: │
│ ┌─────────────────────────────────────────────────────────┐│
│ │ FLAG-012: New Search Experience Rollout ││
│ │ ││
│ │ FLAG NAME: new-search-experience ││
│ │ CREATED: Jan 10, 2025 ││
│ │ OWNER: @product-lead ││
│ │ ││
│ │ ROLLOUT PLAN: ││
│ │ ☑ Jan 12: Internal (0.1%) ││
│ │ ☑ Jan 14: Employees (1%) ││
│ │ ☑ Jan 17: Beta users (5%) ││
│ │ ☑ Jan 20: Canary (10%) ││
│ │ ☐ Jan 22: 25% (pending approval) ││
│ │ ☐ Jan 24: 50% ││
│ │ ☐ Jan 27: 100% ││
│ │ ☐ Feb 10: Remove flag ││
│ │ ││
│ │ METRICS TO WATCH: ││
│ │ • Search latency (p95 < 200ms) ││
│ │ • Click-through rate (≥ baseline) ││
│ │ • Error rate (< 0.1%) ││
│ │ ││
│ │ CURRENT STATUS: 10% rollout ││
│ │ Metrics: All green ✅ ││
│ │ ││
│ │ ROLLBACK TRIGGER: ││
│ │ • Error rate > 1% ││
│ │ • Latency p95 > 500ms ││
│ │ • CTR drops > 20% ││
│ └─────────────────────────────────────────────────────────┘│
│ │
│ DAILY CHECK: │
│ • Are metrics within bounds? │
│ • Any user complaints? │
│ • Ready for next stage? │
└─────────────────────────────────────────────────────────────┘
Flag Management
Flag Lifecycle
FEATURE FLAG LIFECYCLE:
┌─────────────────────────────────────────────────────────────┐
│ │
│ LIFECYCLE STAGES: │
│ │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ CREATE │──→│ ROLLOUT │──→│ STABLE │──→│ REMOVE │ │
│ │ │ │ │ │ │ │ │ │
│ │ Flag │ │ 0%→100% │ │ 100% │ │ Delete │ │
│ │ defined │ │ monitor │ │ confirm │ │ code │ │
│ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │
│ │
│ ─────────────────────────────────────────────────────────── │
│ │
│ FLAG DOCUMENTATION: │
│ ┌─────────────────────────────────────────────────────────┐│
│ │ FLAG REGISTRY ││
│ │ ││
│ │ FLAG PURPOSE STATUS OWNER ││
│ │ ──── ─────── ────── ───── ││
│ │ new-search Release toggle Rollout @alex ││
│ │ dark-mode Release toggle 100% @jordan││
│ │ premium-export Permission Active @sam ││
│ │ experimental-ai Experiment 5% @pat ││
│ │ kill-notifications Ops toggle Ready @ops ││
│ │ ││
│ │ CLEANUP NEEDED: ││
│ │ • old-checkout: 100% for 30 days → remove ⚠️ ││
│ │ • beta-dashboard: 100% for 60 days → remove 🔴 ││
│ └─────────────────────────────────────────────────────────┘│
│ │
│ CLEANUP RULES: │
│ ───────────── │
│ • Flag at 100% for 2+ weeks → Schedule removal │
│ • Flag at 0% for 4+ weeks → Consider deletion │
│ • Create cleanup task when flag reaches 100% │
│ • Review flag inventory monthly │
└─────────────────────────────────────────────────────────────┘
Avoiding Tech Debt
FLAG HYGIENE:
┌─────────────────────────────────────────────────────────────┐
│ │
│ THE PROBLEM: │
│ ──────────── │
│ Flags accumulate if not cleaned up │
│ Code becomes confusing with many branches │
│ "Which code path is actually running?" │
│ │
│ PREVENTION: │
│ ─────────── │
│ │
│ EXPIRATION DATES: │
│ Every flag has a planned removal date │
│ Set when creating flag │
│ Task created for cleanup │
│ │
│ FLAG LIMITS: │
│ Max 20 active flags per service │
│ Must remove one to add one (at limit) │
│ │
│ REGULAR REVIEW: │
│ Monthly flag inventory review │
│ "Which flags can we remove?" │
│ │
│ ─────────────────────────────────────────────────────────── │
│ │
│ CLEANUP TASK: │
│ ┌─────────────────────────────────────────────────────────┐│
│ │ TECH-045: Remove new-search flag ││
│ │ ││
│ │ FLAG: new-search-experience ││
│ │ STATUS: 100% for 3 weeks ││
│ │ STABLE: Yes, all metrics normal ││
│ │ ││
│ │ TASKS: ││
│ │ ☐ Remove feature flag checks from code ││
│ │ ☐ Remove old code path ││
│ │ ☐ Remove flag from configuration ││
│ │ ☐ Update tests ││
│ │ ☐ Deploy and verify ││
│ │ ││
│ │ ESTIMATE: 2 points ││
│ │ PRIORITY: Medium (tech debt) ││
│ └─────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────┘
Rollback Procedures
Handling Issues
ROLLBACK PROCESS:
┌─────────────────────────────────────────────────────────────┐
│ │
│ WHEN TO ROLLBACK: │
│ ───────────────── │
│ • Error rate exceeds threshold │
│ • Performance degradation │
│ • Critical user complaints │
│ • Security issue discovered │
│ │
│ HOW TO ROLLBACK: │
│ ───────────────── │
│ 1. Set flag to 0% │
│ 2. Verify old behavior restored │
│ 3. Notify team │
│ 4. Investigate issue │
│ │
│ ─────────────────────────────────────────────────────────── │
│ │
│ ROLLBACK TASK: │
│ ┌─────────────────────────────────────────────────────────┐│
│ │ INCIDENT: New search causing timeout errors ││
│ │ ││
│ │ DETECTED: Jan 20, 2:15 PM ││
│ │ Error rate: 5% (threshold: 1%) ││
│ │ ││
│ │ ACTION TAKEN: ││
│ │ 2:17 PM - Flag set to 0% ││
│ │ 2:18 PM - Error rate returning to normal ││
│ │ 2:20 PM - Confirmed all traffic on old path ││
│ │ ││
│ │ FOLLOW-UP: ││
│ │ ☐ Root cause analysis ││
│ │ ☐ Fix identified issue ││
│ │ ☐ Add test to prevent regression ││
│ │ ☐ Re-plan rollout ││
│ │ ││
│ │ ROOT CAUSE: ││
│ │ Database query not optimized for 10% load ││
│ │ Need index before resuming rollout ││
│ └─────────────────────────────────────────────────────────┘│
│ │
│ KEY: Rollback is NOT failure │
│ It's the system working as designed │
│ Better to catch issues early │
└─────────────────────────────────────────────────────────────┘