Performance Optimization Workflow
Performance optimization without measurement is guessing. Good performance work starts with data, targets specific bottlenecks, and verifies the result of every change. This guide covers a systematic approach: measure, analyze, optimize, verify.
Optimization Cycle
| Step | Action | Output |
|---|---|---|
| Measure | Baseline the current state | Baseline data |
| Analyze | Find the bottleneck | Optimization target |
| Optimize | Fix the issue | A focused change |
| Verify | Measure again | Quantified result |
Setting Goals
Performance Targets
PERFORMANCE GOALS
═════════════════
DEFINE TARGETS:
─────────────────────────────────────
Be specific:
├── "Page load < 2 seconds"
├── "API response p95 < 200ms"
├── "Throughput > 1000 req/sec"
├── "Error rate < 0.1%"
├── Measurable targets
└── User-focused
PERCENTILES:
─────────────────────────────────────
Use percentiles, not averages:
├── p50 (median): 50% of requests are at or below this value
├── p95: 95% of requests are at or below this value
├── p99: 99% of requests are at or below this value
├── p99.9: for critical paths
└── Tail latency matters
Example:
├── Average: 100ms (hides problems)
├── p50: 50ms (half are fast)
├── p95: 150ms (most are fine)
├── p99: 2000ms (some very slow!)
└── p99 reveals real issues
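A quick way to see this is to compute percentiles directly from raw latency samples. A minimal JavaScript sketch (the sample values are made up for illustration):
// Nearest-rank percentile over an array of latency samples in ms.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const index = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, index)];
}
// Hypothetical samples: mostly fast, one very slow request.
const latencies = [40, 45, 50, 55, 60, 70, 80, 90, 120, 1800];
console.log('p50:', percentile(latencies, 50));  // 60   (median)
console.log('p95:', percentile(latencies, 95));  // 1800 (tail)
console.log('p99:', percentile(latencies, 99));  // 1800 (tail)
// The average of the same data is 241ms, which hides the 1800ms outlier.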
SLA TARGETS:
─────────────────────────────────────
Service Level Agreements:
├── "99.9% of requests < 500ms"
├── "p99 < 1 second"
├── "Error rate < 0.01%"
├── Contractual obligations
└── Must meet
USER EXPERIENCE:
─────────────────────────────────────
Web vitals:
├── LCP (Largest Contentful Paint): < 2.5s
├── FID (First Input Delay): < 100ms (superseded by INP as a Core Web Vital)
├── CLS (Cumulative Layout Shift): < 0.1
├── INP (Interaction to Next Paint): < 200ms
└── User-facing metrics
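These metrics can be observed directly in the browser with the standard PerformanceObserver API (production setups usually rely on a helper library such as web-vitals instead). A minimal sketch:
// Largest Contentful Paint: the last candidate entry before user input is the LCP.
new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    console.log('LCP candidate (ms):', entry.startTime);
  }
}).observe({ type: 'largest-contentful-paint', buffered: true });
// Cumulative Layout Shift: sum of shift scores not caused by recent user input.
let cls = 0;
new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    if (!entry.hadRecentInput) cls += entry.value;
  }
  console.log('CLS so far:', cls);
}).observe({ type: 'layout-shift', buffered: true });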
Baseline Measurement
Current State
BASELINE MEASUREMENT
════════════════════
BEFORE OPTIMIZATION:
─────────────────────────────────────
Measure current state:
├── Run load tests
├── Collect production metrics
├── Profile application
├── Document baseline
├── Compare later
└── Know starting point
LOAD TESTING:
─────────────────────────────────────
Tools:
├── k6, Locust, JMeter
├── Artillery, Gatling
├── Simulate real traffic
├── Measure under load
└── Find limits
Example k6 test:
import http from 'k6/http';
export const options = {
  stages: [
    { duration: '1m', target: 100 },  // ramp up to 100 virtual users over 1 minute
    { duration: '3m', target: 100 },  // hold 100 virtual users for 3 minutes
    { duration: '1m', target: 0 },    // ramp back down
  ],
};
export default function () {
  http.get('https://api.example.com/users');  // each virtual user repeats this request
}
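The targets from the goals section can be encoded as k6 thresholds, so the load test itself fails when a budget is exceeded. The metric names below are k6 built-ins; the limits are example values:
export const options = {
  stages: [ /* same stages as above */ ],
  thresholds: {
    http_req_duration: ['p(95)<200', 'p(99)<1000'],  // latency budget in ms
    http_req_failed: ['rate<0.001'],                 // error rate below 0.1%
  },
};
Run the script with the k6 CLI, e.g. k6 run load-test.js (use whatever file name you saved the script under).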
METRICS TO CAPTURE:
─────────────────────────────────────
├── Response times (p50, p95, p99)
├── Throughput (req/sec)
├── Error rate
├── CPU utilization
├── Memory usage
├── Database query times
├── External service latency
└── Complete picture
Finding Bottlenecks
Profiling
FINDING BOTTLENECKS
═══════════════════
PROFILING TOOLS:
─────────────────────────────────────
Application:
├── APM tools (Datadog, New Relic)
├── Language profilers
├── Flame graphs
├── Trace analysis
└── Where is time spent?
Database:
├── Slow query logs
├── Query explain plans
├── Index analysis
├── Connection pool stats
└── Database-specific
Infrastructure:
├── CPU/memory monitoring
├── Network latency
├── Disk I/O
├── Container metrics
└── Resource constraints
COMMON BOTTLENECKS:
─────────────────────────────────────
Database:
├── Missing indexes
├── N+1 queries
├── Slow queries
├── Connection exhaustion
└── Often the bottleneck
Network:
├── External API calls
├── Large payloads
├── Too many requests
├── No caching
└── Latency adds up
Application:
├── Inefficient algorithms
├── Memory leaks
├── Blocking operations
├── CPU-bound processing
└── Code problems
FLAME GRAPH ANALYSIS:
─────────────────────────────────────
(call tree of one request; bar width ∝ time spent)
request handler       ████████████████████████████
├── db.query          ████████████████████
├── http.get          ██████
│   └── parse json    ████
└── process result    ███
Reading:
├── Width = time spent
├── Nesting shows the call hierarchy
├── Wide = slow (optimize this)
├── Find the widest frames
└── Visual bottleneck identification
Optimization Techniques
Common Fixes
OPTIMIZATION TECHNIQUES
═══════════════════════
DATABASE:
─────────────────────────────────────
Add missing indexes:
CREATE INDEX idx_users_email ON users(email);
Fix N+1 queries:
├── Use eager loading
├── Batch queries
├── Join instead of loop
└── Reduce query count
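A sketch of the batching fix, using a generic db.query(sql, params) client with Postgres-style placeholders (the client, tables, and columns are placeholders; most ORMs expose the same idea as eager loading or includes):
// N+1: one query for the users, then one more query per user.
const users = await db.query('SELECT id, name FROM users WHERE active = true');
for (const user of users) {
  user.orders = await db.query('SELECT * FROM orders WHERE user_id = $1', [user.id]);
}
// Batched: one query for the users, one query for all of their orders.
const ids = users.map((u) => u.id);
const orders = await db.query('SELECT * FROM orders WHERE user_id = ANY($1)', [ids]);
const byUser = new Map();
for (const order of orders) {
  if (!byUser.has(order.user_id)) byUser.set(order.user_id, []);
  byUser.get(order.user_id).push(order);
}
for (const user of users) {
  user.orders = byUser.get(user.id) ?? [];
}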
Query optimization:
├── EXPLAIN ANALYZE
├── Rewrite slow queries
├── Add WHERE clauses
├── Limit result sets
└── Efficient queries
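EXPLAIN ANALYZE runs the query and reports the actual plan and timings. A sketch using the node-postgres (pg) package against a hypothetical slow lookup (the connection string and query are placeholders):
import pg from 'pg';
const client = new pg.Client({ connectionString: process.env.DATABASE_URL });
await client.connect();
const { rows } = await client.query(
  "EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'a@example.com'"
);
rows.forEach((row) => console.log(row['QUERY PLAN']));
// A sequential scan on a large table here usually means a missing index.
await client.end();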
CACHING:
─────────────────────────────────────
Layers:
├── Application cache (Redis)
├── Database query cache
├── CDN for static content
├── Browser caching
└── Appropriate cache levels
Cache patterns:
├── Cache-aside (most common)
├── Write-through
├── Read-through
├── TTL-based expiration
└── Invalidation strategy
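A cache-aside read path with TTL-based expiration, sketched with an ioredis-style client (the key format, TTL, and loadUserFromDb are placeholders):
import Redis from 'ioredis';
const redis = new Redis();  // defaults to localhost:6379

async function getUser(id) {
  const key = `user:${id}`;
  const cached = await redis.get(key);                     // 1. check the cache first
  if (cached) return JSON.parse(cached);
  const user = await loadUserFromDb(id);                   // 2. miss: load from the database
  await redis.set(key, JSON.stringify(user), 'EX', 300);   // 3. cache with a 5-minute TTL
  return user;
}
On writes, delete or overwrite the key (for example redis.del(key)) so subsequent reads do not serve stale data; the TTL is only a backstop.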
API OPTIMIZATION:
─────────────────────────────────────
├── Pagination
├── Field selection
├── Compression (gzip)
├── Connection pooling
├── Async processing
├── Background jobs
└── Reduce work per request
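Pagination and gzip compression in an Express handler, as a sketch (express and the compression middleware are real packages; the route, query parameters, and db.query client are placeholders):
import express from 'express';
import compression from 'compression';

const app = express();
app.use(compression());  // gzip responses above the middleware's size threshold

app.get('/users', async (req, res) => {
  // Cap the page size so a single request cannot pull the whole table.
  const limit = Math.min(parseInt(req.query.limit, 10) || 20, 100);
  const page = parseInt(req.query.page, 10) || 0;
  const users = await db.query(
    'SELECT id, name FROM users ORDER BY id LIMIT $1 OFFSET $2',
    [limit, page * limit]
  );
  res.json({ page, limit, users });
});
app.listen(3000);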
FRONTEND:
─────────────────────────────────────
├── Code splitting
├── Lazy loading
├── Image optimization
├── Bundle size reduction
├── CDN for assets
└── Faster initial load
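Code splitting and lazy loading mostly come down to dynamic import(): the bundler emits a separate chunk that is fetched only when the code path actually runs. A minimal sketch (module path, element IDs, and function names are placeholders):
// The reports chunk is downloaded on first click, not with the initial bundle.
async function openReports() {
  const { renderReports } = await import('./reports.js');
  renderReports(document.getElementById('app'));
}
document.getElementById('open-reports').addEventListener('click', openReports);
Images below the fold can likewise be deferred with the native loading="lazy" attribute.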
Verification
Measuring Results
VERIFY IMPROVEMENTS
═══════════════════
A/B COMPARISON:
─────────────────────────────────────
Before:
├── p50: 150ms
├── p95: 500ms
├── p99: 2000ms
└── Throughput: 500 req/s
After:
├── p50: 50ms (66% faster)
├── p95: 120ms (76% faster)
├── p99: 300ms (85% faster)
└── Throughput: 1500 req/s (3x)
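The percentages are the relative reduction from the baseline, floored to a whole percent; a one-liner keeps the arithmetic explicit (values repeat the example above):
const pctFaster = (before, after) => Math.floor(((before - after) / before) * 100);
console.log(pctFaster(150, 50), pctFaster(500, 120), pctFaster(2000, 300));  // 66 76 85
console.log(`${1500 / 500}x throughput`);  // 3x throughput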
LOAD TEST AGAIN:
─────────────────────────────────────
├── Same test as baseline
├── Same conditions
├── Compare results
├── Quantify improvement
└── Data-driven validation
PRODUCTION MONITORING:
─────────────────────────────────────
After deploy:
├── Watch metrics
├── Compare to before
├── User experience improved?
├── Error rate unchanged?
├── Real-world validation
└── Actual impact
DOCUMENT RESULTS:
─────────────────────────────────────
Performance improvement:
├── What was the problem
├── What was changed
├── Before metrics
├── After metrics
├── Percentage improvement
└── Future reference
GitScrum Integration
Tracking Performance Work
GITSCRUM FOR PERFORMANCE
════════════════════════
PERFORMANCE TASKS:
─────────────────────────────────────
├── Label: performance
├── Priority based on impact
├── Linked to metrics
├── Before/after documented
└── Tracked work
TASK STRUCTURE:
─────────────────────────────────────
Task: "Optimize user search API"
Description:
├── Current p95: 800ms
├── Target p95: 200ms
├── Cause: Missing index + N+1 query
└── Approach: Add index, eager load
Acceptance:
├── p95 < 200ms
├── Throughput > 500 req/s
├── Verified in production
└── Clear criteria
DOCUMENTATION:
─────────────────────────────────────
NoteVault:
├── Performance baselines
├── Optimization history
├── Common issues
├── Runbooks
└── Knowledge base
Best Practices
For Performance Optimization
- Measure first — No guessing
- One change at a time — Know what helped
- Target bottlenecks — Biggest impact first
- Verify in production — Real-world results
- Document everything — Learn for next time
Anti-Patterns
PERFORMANCE MISTAKES:
✗ Optimizing without measuring
✗ Premature optimization
✗ Multiple changes at once
✗ Ignoring percentiles
✗ Lab only, no production
✗ No baseline
✗ Not monitoring after
✗ Micro-optimizations