Causal Inference

Problem Context

Observational data encodes correlation, not causation. A model predicting churn from support ticket volume cannot tell us whether calling support causes churn or whether both are caused by a bad product experience. Causal inference provides formal tools for answering the question that drives decisions: what would happen if we intervene?

When to use causal methods vs standard predictive ML:

  • Predictive ML: best when all you need is the most accurate forecast of Y given X (no intervention contemplated)
  • Causal inference: required when you want to estimate the effect of an action (treatment), run policy simulations, or avoid acting on spurious correlations

Core Concepts

Potential outcomes (Rubin framework): For each unit i, define Y_i(1) = outcome if treated, Y_i(0) = outcome if untreated. The Individual Treatment Effect (ITE) is Y_i(1) - Y_i(0). We can only observe one potential outcome per unit — the fundamental problem of causal inference.

Average Treatment Effect (ATE): E[Y(1) - Y(0)]
Average Treatment Effect on the Treated (ATT): E[Y(1) - Y(0) | T=1]
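In a simulation we can generate both potential outcomes and compute the ATE and ATT directly — exactly what real observational data never allows. A sketch with illustrative variable names, where a covariate drives both treatment uptake and effect size so that the ATT exceeds the ATE:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Covariate that drives both treatment uptake and effect size
x = rng.normal(0.0, 1.0, n)
y0 = x + rng.normal(0.0, 1.0, n)            # Y(0): outcome if untreated
tau = 1.0 + x                               # heterogeneous treatment effect
y1 = y0 + tau                               # Y(1): outcome if treated
t = rng.binomial(1, 1 / (1 + np.exp(-x)))   # high-x units treat more often

ate = np.mean(y1 - y0)                      # E[Y(1) - Y(0)], close to 1.0
att = np.mean((y1 - y0)[t == 1])            # E[Y(1) - Y(0) | T=1], larger here

# Observed data reveals only one potential outcome per unit
y_obs = np.where(t == 1, y1, y0)
```

Because treated units have higher x and the effect grows with x, the ATT is noticeably larger than the ATE in this setup.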

Confounders: variables that affect both treatment assignment and outcome. Failing to control for confounders yields biased effect estimates.

DAGs (Pearl framework): Directed Acyclic Graphs encoding causal assumptions. Identification analysis (d-separation, backdoor criterion) determines whether a causal effect is estimable from observational data and which variables to condition on.
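A minimal numeric illustration of the backdoor idea on the graph X → T, X → Y, T → Y: the naive treated-vs-control difference is confounded by X, while conditioning on X (here via a linear regression that includes it) recovers the causal effect. A sketch on simulated data with an assumed true effect of 2 — all names and coefficients are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

x = rng.normal(0.0, 1.0, n)                      # confounder: X -> T and X -> Y
t = rng.binomial(1, 1 / (1 + np.exp(-2 * x)))    # treatment depends on X
y = 2.0 * t + 3.0 * x + rng.normal(0.0, 1.0, n)  # true causal effect of T is 2

# Naive comparison is biased upward: treated units have higher X
naive = y[t == 1].mean() - y[t == 0].mean()

# Backdoor adjustment: regress Y on T and X jointly (OLS via lstsq)
design = np.column_stack([np.ones(n), t, x])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
adjusted = coef[1]                               # coefficient on T, close to 2
```

The naive estimate mixes the treatment effect with the confounder's effect; conditioning on X blocks the backdoor path.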

Identification Strategies

| Strategy | Assumption | Use when |
| --- | --- | --- |
| Randomized experiment (RCT) | Randomization eliminates confounding | You can run a controlled experiment |
| Propensity score matching / IPW | Unconfoundedness given observables | Treatment assignment depends only on observed covariates |
| Instrumental variables (IV) | Valid instrument Z: affects T but affects Y only through T | You have a natural instrument (e.g., lottery, policy cutoff) |
| Difference-in-differences (DiD) | Parallel trends in absence of treatment | Panel data with staggered rollout or pre/post comparison |
| Regression discontinuity (RDD) | Units just above/below the cutoff are comparable | Continuous treatment assignment rule with a threshold |
| Synthetic control | Donor pool can construct a valid counterfactual | One treated unit, multiple controls, aggregate data |
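As one concrete example from the strategies above, a two-period difference-in-differences estimate is just a difference of pre/post changes between groups. A sketch with made-up group means (all numbers illustrative):

```python
import numpy as np

# Mean outcome by (group, period): rows = control/treated, cols = pre/post
means = np.array([
    [10.0, 12.0],   # control: drifts up by 2 for non-treatment reasons
    [11.0, 16.0],   # treated: drifts up by 2 AND gains the treatment effect
])

change_treated = means[1, 1] - means[1, 0]   # 5.0
change_control = means[0, 1] - means[0, 0]   # 2.0

# DiD nets out the shared trend (valid only under parallel trends)
did = change_treated - change_control        # 3.0
```

If the parallel-trends assumption fails (the treated group would have drifted differently anyway), this estimate is biased.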

Practical Estimation

Propensity score methods:

from sklearn.linear_model import LogisticRegression
import numpy as np
 
# Estimate propensity scores e(x) = P(T=1 | X)
ps_model = LogisticRegression()
ps_model.fit(X, T)
ps = ps_model.predict_proba(X)[:, 1]
 
# Clip extreme scores so the inverse weights cannot explode
ps = np.clip(ps, 0.01, 0.99)
 
# Inverse Probability Weighting (IPW) ATE estimator
ipw_ate = np.mean(T * Y / ps - (1 - T) * Y / (1 - ps))
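The plain IPW mean has high variance when estimated propensities sit near 0 or 1; a common refinement is the normalized (Hajek) estimator, which divides by the weight sums instead of n. A self-contained sketch on simulated data with a true ATE of 2 (variable names illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 20_000
X = rng.normal(0.0, 1.0, (n, 1))
T = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
Y = 2.0 * T + X[:, 0] + rng.normal(0.0, 1.0, n)   # true ATE = 2

ps = LogisticRegression().fit(X, T).predict_proba(X)[:, 1]
ps = np.clip(ps, 0.01, 0.99)          # avoid exploding weights

w1 = T / ps                            # weights for treated units
w0 = (1 - T) / (1 - ps)                # weights for control units

# Hajek (normalized) IPW: weighted means instead of raw averages
hajek_ate = np.sum(w1 * Y) / np.sum(w1) - np.sum(w0 * Y) / np.sum(w0)
```

Normalizing makes the estimator invariant to constant shifts in Y and usually reduces variance relative to the unnormalized version.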

DoWhy + EconML (recommended for production causal analysis):

from dowhy import CausalModel
 
model = CausalModel(data=df, treatment="T", outcome="Y",
                    common_causes=["X1", "X2"])
identified = model.identify_effect()
estimate = model.estimate_effect(identified,
    method_name="backdoor.propensity_score_weighting")

Heterogeneous treatment effects (HTE) with causal forests (EconML):

from econml.dml import CausalForestDML
cf = CausalForestDML(n_estimators=200)
cf.fit(Y, T, X=X, W=W)   # W = controls, X = effect modifiers
te = cf.effect(X)          # ITE estimates per unit
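The idea behind HTE estimation can also be illustrated without EconML using a simple T-learner: fit a separate outcome model per treatment arm and take the difference of their predictions. A sketch with sklearn on simulated data where the true effect is tau(x) = 1 + 2x (illustrative, and much simpler than the causal-forest algorithm):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
n = 20_000
x = rng.normal(0.0, 1.0, (n, 1))
t = rng.binomial(1, 0.5, n)                # randomized treatment
tau = 1.0 + 2.0 * x[:, 0]                  # true heterogeneous effect
y = x[:, 0] + t * tau + rng.normal(0.0, 0.1, n)

# T-learner: one outcome model per treatment arm
m1 = LinearRegression().fit(x[t == 1], y[t == 1])
m0 = LinearRegression().fit(x[t == 0], y[t == 0])

te = m1.predict(x) - m0.predict(x)         # per-unit effect estimates
```

Because treatment is randomized here, the two fitted surfaces are unconfounded and their difference tracks tau(x) closely; with observational data the same split-model trick needs the unconfoundedness assumption.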

Causal vs Predictive Models

A standard XGBoost classifier trained on observational data will learn the conditional probability P(Y=1 | X, T), not the causal effect E[Y(1) - Y(0) | X]. Naively reading feature importances as causal drivers is a common mistake. Use causal tools when:

  • You want to estimate ROI of an intervention
  • You need to compare two policies
  • Treatment assignment in training data is non-random

References