
Understanding S, T, and X Learners: Meta-Learners for Causal Inference
When estimating causal effects, we often want to go beyond average treatment effects and understand how treatments impact different individuals or subgroups. This is where meta-learners such as S-Learner, T-Learner, and X-Learner come in. Let’s explore these tools for estimating heterogeneous treatment effects, with a focus on intuition and practical implementation using the DoWhy library, which calls EconML’s meta-learner implementations under the hood.
S-Learner: The Simple Starter
Intuition
Imagine you’re a chef trying to predict how tasty a dish will be based on its ingredients. S-Learner is like considering all ingredients, including a special spice (our treatment), in one big recipe. It doesn’t treat the special spice any differently from other ingredients.
How it Works
- S-Learner trains one model on all data, treating the treatment as just another feature.
- To estimate the treatment effect for an individual, it:
  - Predicts the outcome with the treatment
  - Predicts the outcome without the treatment
  - Subtracts the second prediction from the first
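The steps above can be sketched by hand on simulated data. This is a minimal illustration, not how DoWhy or EconML implement the learner: the data-generating process, the variable names, and the use of a plain least-squares model with a treatment-by-covariate interaction (in place of a gradient-boosted model) are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)            # one covariate
t = rng.integers(0, 2, size=n)    # randomized binary treatment
# Simulated outcome (assumption): the true effect of treatment is 1 + 0.5 * x
y = 2.0 * x + t * (1.0 + 0.5 * x) + rng.normal(scale=0.1, size=n)

def design(x, t):
    """Single design matrix: intercept, covariate, treatment, interaction."""
    return np.column_stack([np.ones_like(x), x, t, t * x])

# One model fit on all rows, with the treatment as just another feature
coef, *_ = np.linalg.lstsq(design(x, t), y, rcond=None)

# Predict every unit twice: once as treated, once as untreated
y1_hat = design(x, np.ones(n)) @ coef
y0_hat = design(x, np.zeros(n)) @ coef

s_cate = y1_hat - y0_hat   # per-unit effect estimates
s_ate = s_cate.mean()      # average effect
print(f"S-Learner ATE estimate: {s_ate:.3f}")
```

Because the simulated effect is 1 + 0.5 * x, `s_cate` should track that line closely and `s_ate` should land near 1.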
Key Insight
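S-Learner can capture interactions between the treatment and other features, provided the underlying model is flexible enough to represent them. However, if the treatment effect is small relative to the influence of other features, a regularized model may shrink it toward zero or ignore the treatment feature altogether, which biases the estimate toward no effect.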
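To see how a modest effect can be underweighted, the sketch below fits one regularized model and compares the treatment coefficient with and without the penalty. It uses closed-form ridge regression as a stand-in for any regularized learner; the simulated data, the penalty value, and the helper `ridge_fit` are illustrative assumptions, and tree-based models can lose the effect through a different mechanism (for example, by rarely splitting on the treatment).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
t = rng.integers(0, 2, size=n).astype(float)
# Simulated outcome (assumption): the covariate dominates, the treatment adds 0.5
y = 3.0 * x + 0.5 * t + rng.normal(scale=0.5, size=n)

def ridge_fit(X, y, lam):
    """Closed-form ridge on centered data, so the intercept is not penalized."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)

X = np.column_stack([x, t])   # the treatment is just another column

# In this additive model, the S-Learner effect (prediction with t=1 minus
# prediction with t=0) equals the coefficient on the treatment column.
effect_ols = ridge_fit(X, y, lam=0.0)[1]         # no penalty
effect_ridge = ridge_fit(X, y, lam=5000.0)[1]    # heavy penalty

print(f"unpenalized effect estimate: {effect_ols:.3f}")
print(f"penalized effect estimate:   {effect_ridge:.3f}")
```

With the heavy penalty, the estimated effect falls well below the simulated 0.5, even though the same data recover it when no penalty is applied.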
When to Use
- When you have a large dataset
- When you suspect strong interactions between treatment and other features
- As a baseline model to compare against other approaches
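from lightgbm import LGBMRegressor

# Assumes `model` is a dowhy.CausalModel and `estimand` was returned by model.identify_effect()
estimate = model.estimate_effect(
    identified_estimand=estimand,
    method_name='backdoor.econml.metalearners.SLearner',
    target_units='ate',  # report the average effect over all units
    method_params={
        'init_params': {
            # One model fit on the covariates plus the treatment indicator
            'overall_model': LGBMRegressor(n_estimators=500, max_depth=10)
        },
        'fit_params': {}
    }
)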
T-Learner: The Separate Models Approach
Intuition
T-Learner is like having two separate chefs: one who always uses the special spice, and one who never does. Each chef perfects their own recipe independently.
How it Works
- Split the data into treated and untreated groups
- Train one model on the treated group
- Train another model on the untreated group
- To estimate the treatment effect, predict with both models and subtract
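The four steps above can be sketched by hand as follows. This is an illustration under stated assumptions, not the EconML implementation: the simulated data, the helper names `fit_linear` and `predict_linear`, and the choice of simple least-squares models are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)            # one covariate
t = rng.integers(0, 2, size=n)    # randomized binary treatment
# Simulated outcome (assumption): the true effect of treatment is 1 + 0.5 * x
y = 2.0 * x + t * (1.0 + 0.5 * x) + rng.normal(scale=0.1, size=n)

def fit_linear(x, y):
    """Least-squares fit of y on an intercept and x; returns (intercept, slope)."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_linear(coef, x):
    return coef[0] + coef[1] * x

# Split the data and fit one outcome model per group
treated = t == 1
mu1 = fit_linear(x[treated], y[treated])      # treated-group model
mu0 = fit_linear(x[~treated], y[~treated])    # untreated-group model

# Predict every unit with both models and subtract
t_cate = predict_linear(mu1, x) - predict_linear(mu0, x)
t_ate = t_cate.mean()
print(f"T-Learner ATE estimate: {t_ate:.3f}")
```

No interaction term is needed here: because each group has its own model, the slope on `x` is free to differ between treated and untreated units.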
Key Insight
By using separate models, T-Learner ensures the treatment effect isn’t ignored. It allows for completely different relationships between features and outcomes in treated vs untreated groups.
When to Use
- When you suspect the treatment fundamentally changes how other features relate to the outcome
- When you have enough data to train two separate models effectively
- When you’re concerned S-Learner might underestimate the treatment effect
estimate = model.estimate_effect(
    identified_estimand=estimand,
    method_name='backdoor.econml.metalearners.TLearner',
    target_units='ate',  # report the average effect over all units
    method_params={
        'init_params': {
            # One outcome model per treatment arm
            'models': [
                LGBMRegressor(n_estimators=200, max_depth=10),
                LGBMRegressor(n_estimators=200, max_depth=10)
            ]
        },
        'fit_params': {}
    }
)
X-Learner: The Cross-Learning Approach
Intuition
X-Learner is like having the two chefs from T-Learner, but then bringing in a food critic who tastes both versions of each dish and provides detailed feedback on the differences.
How it Works
- Start like T-Learner, training separate models for treated and untreated groups
- Use these models to impute “missing” outcomes:
  - For treated units, estimate what would have happened without treatment
  - For untreated units, estimate what would have happened with treatment
- Calculate imputed treatment effects by comparing actual to imputed outcomes
- Train two more models to predict these imputed treatment effects, one on the treated units and one on the untreated units
- Combine the predictions from these two models, using propensity scores as weights
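A hand-rolled sketch of these stages on an imbalanced simulated dataset is shown below. It is illustrative only: the data, the helper names, and the least-squares models are assumptions, and the propensity score is taken to be the constant treated share (reasonable here only because treatment is randomized in the simulation), whereas a real analysis would usually fit a propensity model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000
x = rng.normal(size=n)
t = (rng.random(n) < 0.1).astype(int)   # imbalanced: roughly 10% treated
# Simulated outcome (assumption): the true effect of treatment is 1 + 0.5 * x
y = 2.0 * x + t * (1.0 + 0.5 * x) + rng.normal(scale=0.1, size=n)

def fit_linear(x, y):
    """Least-squares fit of y on an intercept and x; returns (intercept, slope)."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_linear(coef, x):
    return coef[0] + coef[1] * x

treated = t == 1

# Stage 1: separate outcome models, as in T-Learner
mu1 = fit_linear(x[treated], y[treated])
mu0 = fit_linear(x[~treated], y[~treated])

# Stage 2: imputed effects (observed outcome vs. imputed counterfactual)
d1 = y[treated] - predict_linear(mu0, x[treated])       # treated units
d0 = predict_linear(mu1, x[~treated]) - y[~treated]     # untreated units

# Stage 3: models that predict the imputed effects directly
tau1 = fit_linear(x[treated], d1)
tau0 = fit_linear(x[~treated], d0)

# Stage 4: propensity-weighted combination (constant propensity is an assumption)
e = np.full(n, treated.mean())
x_cate = e * predict_linear(tau0, x) + (1 - e) * predict_linear(tau1, x)
x_ate = x_cate.mean()
print(f"X-Learner ATE estimate: {x_ate:.3f}")
```

With about 10% of units treated, most of the weight (`1 - e`, roughly 0.9) goes to `tau1`, whose imputed effects were built from `mu0`, the outcome model fit on the much larger untreated group.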
Key Insight
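X-Learner tries to learn the treatment effect directly, rather than just the outcomes. It’s particularly good at handling imbalanced datasets where one group (treated or untreated) is much larger than the other. This comes from the final weighting step: the propensity scores shift weight toward the effect model that relies on the outcome model trained on the larger group.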
When to Use
- When you have imbalanced treatment groups
- When you suspect heterogeneous treatment effects (effects vary significantly across individuals)
- When you have a large enough dataset to support the more complex modeling process
estimate = model.estimate_effect(
    identified_estimand=estimand,
    method_name='backdoor.econml.metalearners.XLearner',
    target_units='ate',  # report the average effect over all units
    method_params={
        'init_params': {
            # First-stage outcome models, one per treatment arm
            'models': [
                LGBMRegressor(n_estimators=50, max_depth=10),
                LGBMRegressor(n_estimators=50, max_depth=10)
            ],
            # Second-stage models fit on the imputed treatment effects
            'cate_models': [
                LGBMRegressor(n_estimators=50, max_depth=10),
                LGBMRegressor(n_estimators=50, max_depth=10)
            ]
        },
        'fit_params': {},
    }
)
Comparative Intuition
Imagine you’re trying to understand how a new fertilizer affects different plants:
- S-Learner is like planting all your seeds in one big field, some with fertilizer and some without, and then trying to figure out the fertilizer’s effect by looking at the whole field.
- T-Learner is like having two separate fields – one with fertilizer and one without – and comparing how plants grow in each.
- X-Learner is like having those two fields, but then also trying to imagine how each plant from the fertilized field would have grown without fertilizer, and vice versa. It then uses this “imagined” data to get a more nuanced understanding of the fertilizer’s effects.
Each approach has its strengths, and the choice often depends on your specific dataset and research question. The beauty of using a framework like DoWhy is that you can easily experiment with different learners and compare their results, gaining deeper insights into your causal effects.