Capacity Planning

Problem

Forecast future demand for resources (staff headcount, server compute, warehouse space, call centre seats) and plan capacity accordingly. Capacity planning operates on longer time horizons (weeks to months) than real-time demand forecasting, and both failure directions are costly: under-capacity causes service degradation and SLA breaches, while over-capacity incurs idle cost.

Users / Stakeholders

Role               | Decision
Operations manager | Staffing levels; shift planning
Finance            | Cost forecasts; budget allocation
Technology leader  | Infrastructure provisioning; cloud spend
HR/Workforce       | Hiring plans; contractor engagement
Customer           | Service quality; SLA compliance

Domain Context

  • Long lead times: Hiring, training, and infrastructure provisioning have lead times of weeks to months. The forecast must be accurate far enough ahead to enable action.
  • Hierarchical: Total demand breaks down by region, product, channel, time-of-day. Each level has different planning implications.
  • Demand drivers: Business volume (policy renewals, order volumes, support tickets) drives resource demand. External events (weather, regulation changes, product launches) create step changes.
  • Workforce constraints: Minimum staffing levels, shift patterns, and skills-based routing are operational constraints that pure forecasting models don’t capture.
  • Cloud auto-scaling: For compute capacity, auto-scaling handles short-term (minutes to hours) adjustments. Capacity planning focuses on baseline provisioning and cost optimisation.

Inputs and Outputs

Features:

Historical demand: volume_last_52w, volume_same_period_prior_year
Calendar: day_of_week, holiday_flags, product_season, renewal_cycle
Business drivers: new_business_rate, retention_rate, product_mix
External: economic_index, weather (call centre: bad weather = more calls)
Pipeline: sales_pipeline, marketing_spend, campaign_schedule
Operational: handling_time_per_unit, automation_rate
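
As a rough illustration of how the historical and calendar features above might be derived for one forecast week, here is a minimal sketch using only the Python standard library. The function name `build_features`, the dict-of-dates input format, and the reading of `volume_last_52w` as a trailing 52-week total are assumptions made for this example, not part of the original design.

```python
from datetime import date, timedelta


def build_features(weekly_volume, week_start, holidays=frozenset()):
    """Build a few history/calendar features for the week starting at week_start.

    weekly_volume: dict mapping week-start date -> observed volume (assumed format).
    holidays: set of holiday dates (assumed format).
    """
    # Trailing 52 weeks, excluding the target week itself.
    trailing_weeks = (week_start - timedelta(weeks=k) for k in range(1, 53))
    trailing = [weekly_volume[d] for d in trailing_weeks if d in weekly_volume]

    # Same week one year (52 weeks) earlier; None if history is too short.
    prior_year = weekly_volume.get(week_start - timedelta(weeks=52))

    # Flag the week if any of its seven days is a holiday.
    days_in_week = {week_start + timedelta(days=i) for i in range(7)}

    return {
        "volume_last_52w": sum(trailing),
        "volume_same_period_prior_year": prior_year,
        "week_of_year": week_start.isocalendar()[1],
        "holiday_flag": int(bool(days_in_week & set(holidays))),
    }
```

Returning `None` for a missing prior-year value keeps short histories visible to the downstream model instead of silently filling them with zero.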

Output:

demand_forecast:     Weekly volume by resource type for next 26 weeks
capacity_requirement: FTE required, seats required, compute_units required
hiring_recommendation: Headcount delta with lead time factored in
budget_forecast:     Cost at recommended capacity
risk_scenario:       P90 demand scenario → capacity buffer recommendation
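
The step from demand_forecast to capacity_requirement and risk_scenario can be sketched as simple workload arithmetic: manual workload hours divided by the productive hours one FTE delivers at a target utilisation. The function names, the 30 productive hours per FTE-week, and the 0.85 utilisation default are illustrative assumptions; real values would come from the operation being planned.

```python
import math


def fte_required(weekly_volume, handling_minutes_per_unit, automation_rate=0.0,
                 productive_hours_per_fte_week=30.0, target_utilisation=0.85):
    """Convert a weekly volume into fractional FTE (all defaults are assumptions)."""
    manual_units = weekly_volume * (1.0 - automation_rate)  # units not automated
    workload_hours = manual_units * handling_minutes_per_unit / 60.0
    return workload_hours / (productive_hours_per_fte_week * target_utilisation)


def capacity_buffer(p50_volume, p90_volume, handling_minutes_per_unit, **assumptions):
    """Compare FTE needed at the base (P50) forecast and at the P90 scenario."""
    base = math.ceil(fte_required(p50_volume, handling_minutes_per_unit, **assumptions))
    stressed = math.ceil(fte_required(p90_volume, handling_minutes_per_unit, **assumptions))
    return {"base_fte": base, "p90_fte": stressed, "buffer_fte": stressed - base}
```

For example, `capacity_buffer(5000, 6000, 6.0)` compares the headcount needed for 5,000 and 6,000 units per week at six minutes per unit; the difference is the buffer that could be covered by contractors or overtime rather than permanent hires.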

Decision or Workflow Role

Quarterly capacity review (strategic) + monthly rolling update
  ↓
Demand forecast: 26-week forward view by resource type
  ↓
Capacity gap analysis: forecast vs current approved headcount
  ↓
Scenario analysis: base / upside / downside scenarios
  ↓
Decisions: hire / train / redeploy / contract / defer
  ↓
Operational: weekly workforce scheduling (WFM system)
  ↓
Actuals vs forecast → model retraining
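
The capacity gap analysis and decision steps in the flow above could be sketched as follows. This is a simplified illustration, assuming no attrition and a single hiring lead time; the function name `plan_hiring` and the action labels are hypothetical. Gaps that fall inside the lead time cannot be closed by hiring, so they are flagged for contractor or redeployment cover instead.

```python
import math


def plan_hiring(required_fte, approved_fte, lead_time_weeks):
    """Turn a weekly FTE requirement into lead-time-adjusted actions.

    required_fte: FTE required per forecast week (first entry = week 1).
    approved_fte: current approved headcount.
    lead_time_weeks: weeks from hiring decision to a productive hire.
    """
    actions = []
    headcount = approved_fte
    for week, required in enumerate(required_fte, start=1):
        gap = math.ceil(required - headcount)
        if gap <= 0:
            continue  # enough capacity this week
        if week > lead_time_weeks:
            # Hiring can still land in time if the decision is made early enough.
            actions.append({"needed_in_week": week,
                            "decide_by_week": week - lead_time_weeks,
                            "action": "hire", "fte": gap})
            headcount += gap  # hires stay on for later weeks
        else:
            # Too late to hire: temporary cover for this week only.
            actions.append({"needed_in_week": week, "decide_by_week": 0,
                            "action": "contract_or_redeploy", "fte": gap})
    return actions
```

The `decide_by_week` field makes the decision deadline explicit, which is the quantity the quarterly and monthly reviews need to act on.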

Modeling / System Options

Approach | Strength | Weakness | When to use
ARIMA + external regressors | Handles seasonality; interpretable | Limited feature set | Simple, stable demand
LightGBM with calendar + business features | Captures complex relationships | Requires feature engineering | Rich business driver data available
Prophet (Facebook) | Automatic seasonality + holidays; interpretable | Slow at scale | Quarterly planning; business stakeholder reporting
Erlang C (call centre) | Industry standard for staffing; analytical | Assumes stationary Poisson process | Call centre / service desk capacity

Recommended: LightGBM for the demand forecast, Erlang C for translating call volume into agent headcount, and Prophet for stakeholder-facing reports.
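
To make the Erlang C step concrete, the sketch below computes the probability that a call waits and the smallest agent count that meets a service-level target. It uses the standard Erlang C formula, evaluated through the Erlang B recursion for numerical stability. The function names and the 80%-in-20-seconds default target are assumptions for illustration, and the model inherits the usual Erlang C assumptions (stationary Poisson arrivals, exponential handling times, no abandonment).

```python
import math


def erlang_c_wait_probability(agents, offered_load):
    """Probability an arriving call has to wait, for offered_load in erlangs."""
    if agents <= offered_load:
        return 1.0  # unstable queue: every call waits
    blocking = 1.0  # Erlang B with zero agents
    for n in range(1, agents + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return agents * blocking / (agents - offered_load * (1.0 - blocking))


def service_level(agents, calls_per_hour, aht_seconds, target_seconds):
    """Fraction of calls answered within target_seconds."""
    load = calls_per_hour * aht_seconds / 3600.0  # offered load in erlangs
    if agents <= load:
        return 0.0
    p_wait = erlang_c_wait_probability(agents, load)
    return 1.0 - p_wait * math.exp(-(agents - load) * target_seconds / aht_seconds)


def agents_required(calls_per_hour, aht_seconds,
                    target_seconds=20.0, target_service_level=0.8):
    """Smallest agent count meeting the service-level target (defaults assumed)."""
    load = calls_per_hour * aht_seconds / 3600.0
    agents = max(1, math.floor(load) + 1)  # need strictly more agents than load
    while service_level(agents, calls_per_hour, aht_seconds,
                        target_seconds) < target_service_level:
        agents += 1
    return agents
```

The result is the number of agents on the phones in a given interval; shrinkage (breaks, training, absence) would still need to be applied on top to reach rostered headcount.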

Deployment Constraints

  • Planning horizon: 26-week forward view. Weekly refresh. Scenario outputs needed.
  • Stakeholder reporting: Finance and HR teams are the consumers, and they are not data scientists. Provide visual dashboards with uncertainty ranges rather than raw model outputs.
  • Decision latency: Hiring decisions need 8–12 weeks of lead time, so the forecast must be reliable at least that far ahead to be actionable.

Risks and Failure Modes

Risk | Description | Mitigation
Demand step change | New product launch or regulation → demand with no historical precedent | Scenario planning; manual uplift
Forecast optimism | Systematic under-forecast of demand → chronic understaffing | Bias monitoring; post-season review
Long lead time risk | Hiring too late → SLA breach | Lead-time-adjusted planning; contractor buffer
Automation rate change | Automation improves; model uses old handling time | Handling time as input parameter; quarterly review
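
The bias monitoring mitigation could look something like the following sketch, which tracks the signed mean percentage error so that a persistent under-forecast shows up as a negative number rather than being hidden by an absolute-error metric. The function names and the 5% alert threshold are assumptions for illustration.

```python
def forecast_bias(actuals, forecasts):
    """Signed mean percentage error; negative means demand was under-forecast."""
    errors = [(f - a) / a for a, f in zip(actuals, forecasts) if a != 0]
    if not errors:
        raise ValueError("no non-zero actuals to compare against")
    return sum(errors) / len(errors)


def under_forecast_share(actuals, forecasts):
    """Fraction of periods in which the forecast fell below the actual."""
    pairs = list(zip(actuals, forecasts))
    return sum(1 for a, f in pairs if f < a) / len(pairs)


def bias_alert(actuals, forecasts, threshold=0.05):
    """True when the average under-forecast exceeds the (assumed) threshold."""
    return forecast_bias(actuals, forecasts) < -threshold
```

Reviewing both numbers together helps separate a few large misses from a consistent lean in one direction.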

Success Metrics

Metric | Target | Notes
Forecast accuracy (MAPE) | < 10% at 4-week horizon | Planning accuracy
SLA compliance | > 99% | Downstream service quality
Cost per unit | Stable or decreasing | Operational efficiency
Capacity utilisation | 85–90% | Asset efficiency without overloading
Hiring lead time met | > 95% | Operational outcome
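
As a sketch of how the accuracy and utilisation targets might be checked, the snippet below computes MAPE separately per forecast horizon and tests the two numeric targets from the table. The record format `(horizon_weeks, actual, forecast)` and the function names are assumptions for this example.

```python
from collections import defaultdict


def mape_by_horizon(records):
    """Mean absolute percentage error grouped by forecast horizon.

    records: iterable of (horizon_weeks, actual, forecast) tuples (assumed format).
    Periods with a zero actual are skipped because MAPE is undefined there.
    """
    errors = defaultdict(list)
    for horizon, actual, forecast in records:
        if actual != 0:
            errors[horizon].append(abs(forecast - actual) / abs(actual))
    return {h: sum(errs) / len(errs) for h, errs in sorted(errors.items())}


def meets_targets(mape_4w, utilisation):
    """Check the table's MAPE (< 10% at 4 weeks) and utilisation (85-90%) targets."""
    return {
        "forecast_accuracy": mape_4w < 0.10,
        "capacity_utilisation": 0.85 <= utilisation <= 0.90,
    }
```

Reporting MAPE at the hiring lead time (8 to 12 weeks) alongside the 4-week figure would show whether the forecast is accurate at the horizon where decisions are actually made.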

Modeling

Reference Implementations

Adjacent Applications