Project Estimation | Story Points & Planning Poker
Estimate projects with story points, Planning Poker, and T-shirt sizing. GitScrum tracks velocity and helps teams calibrate estimates over time.
Accurate estimation is the foundation of realistic planning: underestimate and you miss deadlines, overestimate and you lose trust. GitScrum's effort points and historical velocity data help teams calibrate estimates against actual performance, improving accuracy over time. The key is treating estimation as a skill to develop, not a precise science.
Estimation Approaches
| Technique | Best For | Accuracy | Effort |
|---|---|---|---|
| Story Points | Sprint planning | Medium-High | Medium |
| Planning Poker | Team consensus | Medium-High | Medium |
| T-Shirt Sizing | Roadmap planning | Low-Medium | Low |
| Historical Data | Forecasting | High | Low |
| Three-Point | Risk-aware estimates | Medium-High | High |
Story Point Estimation
STORY POINT FUNDAMENTALS
WHAT POINTS REPRESENT:
┌─────────────────────────────────────────────────┐
│ Story points measure relative effort:           │
│                                                 │
│ Effort = Complexity + Uncertainty + Volume      │
│                                                 │
│ NOT measured in hours because:                  │
│ ├── Different people work at different speeds   │
│ ├── Same person varies day to day               │
│ └── Points are more stable over time            │
└─────────────────────────────────────────────────┘
POINT SCALE (Fibonacci):
┌─────────────────────────────────────────────────┐
│ 1 point:   Trivial, well-understood             │
│            Example: Text change, config update  │
│                                                 │
│ 2 points:  Small, straightforward               │
│            Example: Add field to form           │
│                                                 │
│ 3 points:  Medium, some complexity              │
│            Example: New API endpoint            │
│                                                 │
│ 5 points:  Large, multiple components           │
│            Example: Feature with UI + backend   │
│                                                 │
│ 8 points:  Very large, significant unknowns     │
│            Example: New integration             │
│            Consider: Should this be split?      │
│                                                 │
│ 13 points: Epic-sized, too large                │
│            Must be broken down                  │
└─────────────────────────────────────────────────┘
REFERENCE STORIES:
┌─────────────────────────────────────────────────┐
│ Establish team calibration stories:             │
│                                                 │
│ "Adding a new dashboard widget" = 3 points      │
│ "Simple bug fix" = 1 point                      │
│ "New feature end-to-end" = 5-8 points           │
│                                                 │
│ Compare new stories to references:              │
│ "Is this harder or easier than the widget?"     │
└─────────────────────────────────────────────────┘
Planning Poker
PLANNING POKER PROCESS
SETUP:
┌─────────────────────────────────────────────────┐
│ Participants: Development team (everyone who    │
│               will work on the items)           │
│                                                 │
│ Cards: 0, 1, 2, 3, 5, 8, 13, ?, ☕              │
│ ├── 0: Already done or trivial                  │
│ ├── 1-13: Story point estimates                 │
│ ├── ?: Need more information                    │
│ └── ☕: Need a break                            │
└─────────────────────────────────────────────────┘
PROCESS:
┌─────────────────────────────────────────────────┐
│ 1. Product Owner reads user story               │
│                                                 │
│ 2. Team asks clarifying questions               │
│    (Time-box: 2-3 minutes)                      │
│                                                 │
│ 3. Everyone privately selects a card            │
│                                                 │
│ 4. All reveal simultaneously                    │
│                                                 │
│ 5. Discuss differences:                         │
│    ├── Highest: "Why do you think 8?"           │
│    ├── Lowest: "Why do you think 2?"            │
│    └── Discussion surfaces hidden complexity    │
│                                                 │
│ 6. Re-vote if needed                            │
│    (Usually converges in 2 rounds)              │
│                                                 │
│ 7. Record consensus estimate                    │
└─────────────────────────────────────────────────┘
DISCUSSION EXAMPLES:
┌─────────────────────────────────────────────────┐
│ Scenario: Votes are 2, 3, 3, 5, 8               │
│                                                 │
│ Facilitator: "@high, why 8?"                    │
│ @high: "We need to update the legacy auth       │
│ system, which is always problematic."           │
│                                                 │
│ Facilitator: "@low, why 2?"                     │
│ @low: "I thought we were just adding a          │
│ button. I didn't realize auth was involved."    │
│                                                 │
│ Result: Team now has same understanding         │
│ Re-vote: 5, 5, 5, 5, 8 → Consensus at 5         │
└─────────────────────────────────────────────────┘
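The reveal-discuss-re-vote loop can be sketched in a few lines of Python. This is a minimal illustration, not a GitScrum feature: the function name `summarize_round`, the participant names, and the 80% agreement threshold are all assumptions chosen so that the re-vote above (four 5s and one 8) counts as consensus.

```python
from collections import Counter


def summarize_round(votes, agreement=0.8):
    """Summarize one Planning Poker round.

    votes: dict mapping participant name -> revealed card value.
    agreement: assumed share of voters who must pick the same card
               for the round to count as consensus (not from the article).
    """
    counts = Counter(votes.values())
    card, supporters = counts.most_common(1)[0]
    if supporters / len(votes) >= agreement:
        return {"consensus": card, "explain": []}

    # No consensus: the lowest and highest voters explain their reasoning first.
    low, high = min(votes.values()), max(votes.values())
    explain = [name for name, value in votes.items() if value in (low, high)]
    return {"consensus": None, "explain": explain}


# Hypothetical participants reproducing the scenario above.
round_1 = {"ana": 2, "ben": 3, "cho": 3, "dev": 5, "eli": 8}
round_2 = {"ana": 5, "ben": 5, "cho": 5, "dev": 5, "eli": 8}

print(summarize_round(round_1))  # no consensus; the 2 and the 8 explain
print(summarize_round(round_2))  # consensus at 5
```

A stricter team could pass `agreement=1.0` to require unanimity; the threshold is a team convention rather than a rule of the technique.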
T-Shirt Sizing
T-SHIRT SIZING FOR ROADMAP
T-SHIRT SCALE:
┌─────────────────────────────────────────────────┐
│ XS:  < 1 week of work                           │
│ S:   1-2 weeks                                  │
│ M:   2-4 weeks                                  │
│ L:   1-2 months                                 │
│ XL:  2-3 months                                 │
│ XXL: 3+ months (needs breakdown)                │
└─────────────────────────────────────────────────┘
ROADMAP EXAMPLE:
┌─────────────────────────────────────────────────┐
│ Q2 2025 Initiatives:                            │
│                                                 │
│ ├── Mobile app redesign       [L]               │
│ ├── API v3 migration          [XL]              │
│ ├── Performance improvements  [M]               │
│ ├── User dashboard refresh    [M]               │
│ └── Security audit fixes      [S]               │
│                                                 │
│ Capacity: 1 L + 2 M + 2 S per quarter           │
│ Current load: 1 XL + 1 L + 2 M + 1 S            │
│ Decision: Defer API v3 or get more capacity     │
└─────────────────────────────────────────────────┘
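One way to make the capacity comparison concrete is to convert each size to a rough number of weeks and add them up. The sketch below is illustrative only: the `SIZE_WEEKS` values are assumed midpoints of the T-shirt ranges (taking one month as four weeks), and the helper names are hypothetical.

```python
# Assumed midpoints, in weeks, of the T-shirt ranges (1 month taken as 4 weeks).
# XXL is omitted because it must be broken down before planning.
SIZE_WEEKS = {"XS": 0.5, "S": 1.5, "M": 3.0, "L": 6.0, "XL": 10.0}


def total_weeks(sizes):
    """Sum the assumed week values for a collection of T-shirt sizes."""
    return sum(SIZE_WEEKS[size] for size in sizes)


# Quarterly capacity from the example: 1 L + 2 M + 2 S.
capacity = total_weeks(["L", "M", "M", "S", "S"])

initiatives = {
    "Mobile app redesign": "L",
    "API v3 migration": "XL",
    "Performance improvements": "M",
    "User dashboard refresh": "M",
    "Security audit fixes": "S",
}

load = total_weeks(initiatives.values())
print(f"capacity={capacity} weeks, load={load} weeks, fits={load <= capacity}")

# Deferring the XL item brings the load back under capacity.
deferred = {name: size for name, size in initiatives.items()
            if name != "API v3 migration"}
print(f"after deferral: {total_weeks(deferred.values())} weeks")
```

Under these assumed midpoints the full list needs 23.5 weeks against 15 weeks of capacity, which matches the decision to defer API v3 or add capacity.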
CONVERTING T-SHIRT TO SPRINT POINTS:
┌─────────────────────────────────────────────────┐
│ After roadmap approval, break down:             │
│                                                 │
│ Mobile app redesign [L]:                        │
│ ├── UX research & design: 20 pts                │
│ ├── Core navigation:      25 pts                │
│ ├── Feature parity:       35 pts                │
│ └── Polish & testing:     15 pts                │
│ Total: 95 pts ≈ 3.2 sprints at 30 pts/sprint    │
│ Plan for 4 sprints (partial sprints round up)   │
└─────────────────────────────────────────────────┘
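The sprint count follows from dividing total points by velocity. Because 95 points at 30 points per sprint is a little more than three sprints, and a partly used sprint still occupies the calendar, the sketch below rounds up. The function name `sprints_needed` is hypothetical.

```python
import math


def sprints_needed(total_points, velocity):
    """Return the number of whole sprints needed at a given velocity."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return math.ceil(total_points / velocity)


# Breakdown of the [L] initiative from the example.
breakdown = {
    "UX research & design": 20,
    "Core navigation": 25,
    "Feature parity": 35,
    "Polish & testing": 15,
}

total = sum(breakdown.values())
print(total, "points ->", sprints_needed(total, velocity=30), "sprints")
```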
Three-Point Estimation
THREE-POINT ESTIMATION
FORMULA:
┌─────────────────────────────────────────────────┐
│ Expected = (O + 4×M + P) / 6                    │
│ Standard Deviation = (P - O) / 6                │
│                                                 │
│ Where: O = Optimistic                           │
│        M = Most Likely                          │
│        P = Pessimistic                          │
└─────────────────────────────────────────────────┘
EXAMPLE:
┌─────────────────────────────────────────────────┐
│ Feature: Payment integration                    │
│                                                 │
│ Optimistic:  2 weeks (everything goes right)    │
│ Most Likely: 4 weeks (normal bumps)             │
│ Pessimistic: 8 weeks (major issues)             │
│                                                 │
│ Expected = (2 + 4×4 + 8) / 6 = 4.3 weeks        │
│ Std Dev = (8 - 2) / 6 = 1 week                  │
│                                                 │
│ Estimate: 4-5 weeks (with confidence interval)  │
│ Tell stakeholders: "Plan for 5 weeks"           │
└─────────────────────────────────────────────────┘
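The same arithmetic can be written as a small function. This is a sketch of the weighted-average formula above; the function name `three_point` is an assumption, and the one-standard-deviation range it prints is one common way to express the interval, not the only one.

```python
def three_point(optimistic, most_likely, pessimistic):
    """Return (expected, std_dev) for a three-point estimate."""
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev


# Payment integration example, in weeks.
expected, std_dev = three_point(2, 4, 8)
low, high = expected - std_dev, expected + std_dev

print(f"expected: {expected:.1f} weeks, std dev: {std_dev:.1f} week")
print(f"range (one std dev): {low:.1f} to {high:.1f} weeks")
```

Quoting a figure near the upper end of the range, as the "Plan for 5 weeks" advice does, builds the uncertainty into the commitment instead of hiding it.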
Improving Estimates
ESTIMATION CALIBRATION
TRACK ACTUAL VS ESTIMATED:
┌─────────────────────────────────────────────────┐
│ Sprint 22 Analysis:                             │
│                                                 │
│ Story          Estimated  Actual  Variance      │
│ ──────────────────────────────────────────      │
│ User profile   3          3       0%            │
│ Search filter  5          8       +60%          │
│ Dashboard      5          4       -20%          │
│ API endpoint   3          3       0%            │
│ Report export  5          7       +40%          │
│                                                 │
│ Average variance: +16%                          │
│ Largest miss: Search filter                     │
│                                                 │
│ Discuss: Why was search filter underestimated?  │
│ Learning: Filtering with multiple data sources  │
│ is more complex than expected.                  │
└─────────────────────────────────────────────────┘
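The variance column is the difference between actual and estimated, expressed as a percentage of the estimate. A minimal sketch of that calculation follows, using the Sprint 22 figures; the function and variable names are hypothetical, and the data is typed in by hand, not pulled from any GitScrum API.

```python
def variance_pct(estimated, actual):
    """Percentage by which actual effort differed from the estimate."""
    return (actual - estimated) * 100 / estimated


# (story, estimated, actual) from the Sprint 22 analysis.
sprint_22 = [
    ("User profile", 3, 3),
    ("Search filter", 5, 8),
    ("Dashboard", 5, 4),
    ("API endpoint", 3, 3),
    ("Report export", 5, 7),
]

variances = {name: variance_pct(est, act) for name, est, act in sprint_22}
average = sum(variances.values()) / len(variances)
largest_miss = max(variances, key=lambda name: abs(variances[name]))

for name, pct in variances.items():
    print(f"{name:<15}{pct:+.0f}%")
print(f"Average variance: {average:+.0f}%")
print(f"Largest miss: {largest_miss}")
```

Averaging signed percentages lets over- and underestimates cancel out; a team that wants to see total error regardless of direction could average the absolute values instead.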
ESTIMATION IMPROVEMENT LOOP:
┌─────────────────────────────────────────────────┐
│ 1. Estimate before sprint                       │
│ 2. Track actual time during sprint              │
│ 3. Compare at sprint end                        │
│ 4. Discuss large variances in retro             │
│ 5. Update reference stories if needed           │
│ 6. Apply learnings to next estimates            │
│                                                 │
│ Over time: Team estimates become more accurate  │
└─────────────────────────────────────────────────┘
COMMON ESTIMATION MISTAKES:
┌─────────────────────────────────────────────────┐
│ ❌ Forgetting testing time                      │
│ ❌ Ignoring code review cycles                  │
│ ❌ Not accounting for meetings                  │
│ ❌ Assuming best-case scenario                  │
│ ❌ Comparing to senior dev's speed              │
│ ❌ Missing integration complexity               │
│ ❌ Underestimating unknowns                     │
│                                                 │
│ Fix: Include all work in estimate, add buffer   │
└─────────────────────────────────────────────────┘
Best Practices
Anti-Patterns
❌ One person estimates for the team
❌ Converting points directly to hours
❌ Never reviewing estimate accuracy
❌ Estimating items that aren't understood
❌ Punishing missed estimates
❌ Sandbagging all estimates