Towards representation learning for treatment effect estimation with high-dimensional covariates

Clivio, O

Thesis

Abstract:: Treatment effect estimation is typically difficult in the presence of high-dimensional covariates. In this thesis, we propose to use a representation, that is, the mapping of covariates to a lower-dimensional manifold, as an adjustment set in estimators of treatment effects.

In an introductory chapter, we review treatment effect estimation as well as challenges with high-dimensional features, both in statistics and machine learning generally and in treatment effect estimation in particular. The thesis’s contributions are then outlined.

As a first contribution, we extend the popular propensity score matching method to the use of multivariate representations in matching. This is possible because neural networks naturally satisfy the compositional nature of balancing scores from Rosenbaum and Rubin [1983], which are valid adjustment sets. As a result, intermediate hidden layers of a neural network modelling the propensity score can be used as adjustment sets. We also bound the original covariate imbalance using the representation imbalance in different settings. This method requires that the neural network model is correctly specified.
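
To illustrate the idea of matching on a representation, here is a minimal sketch, not the thesis's implementation. It relies on the Rosenbaum and Rubin [1983] result that any function of the covariates through which the propensity score can be computed is a balancing score, so a hidden layer h(x) of a network whose output is the propensity score qualifies. Everything below is an assumption made for illustration: the function name `att_by_representation_matching`, the simulated data, the weights `W`, and the fact that the hidden layer is fixed by construction rather than learned.

```python
import numpy as np


def att_by_representation_matching(rep, treatment, outcome):
    """Estimate the average treatment effect on the treated (ATT) by
    1-nearest-neighbour matching, with replacement, on a representation.

    rep: (n, d) array of representations, treatment: (n,) 0/1 indicators,
    outcome: (n,) observed outcomes. Hypothetical helper for illustration.
    """
    rep = np.asarray(rep, dtype=float)
    treatment = np.asarray(treatment).astype(bool)
    outcome = np.asarray(outcome, dtype=float)

    treated, control = rep[treatment], rep[~treatment]
    # Squared Euclidean distance between every treated and every control unit.
    dists = ((treated[:, None, :] - control[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)  # index of the closest control unit
    matched_control_outcomes = outcome[~treatment][nearest]
    return float((outcome[treatment] - matched_control_outcomes).mean())


# Simulated data (assumption): 50 covariates, treatment assigned only
# through a 2-dimensional hidden layer H, constant treatment effect of 2.
rng = np.random.default_rng(0)
n, p = 2000, 50
X = rng.normal(size=(n, p))
W = rng.normal(size=(p, 2)) / np.sqrt(p)  # hypothetical first-layer weights
H = np.tanh(X @ W)                        # hidden layer h(x), the representation
propensity = 1.0 / (1.0 + np.exp(-(H @ np.array([1.0, -1.0]))))
T = rng.binomial(1, propensity)
Y = 2.0 * T + 3.0 * H[:, 0] + X[:, 0] + rng.normal(scale=0.1, size=n)

naive = Y[T == 1].mean() - Y[T == 0].mean()        # ignores confounding
matched = att_by_representation_matching(H, T, Y)  # adjusts for h(x) only
print(f"naive difference in means: {naive:.3f}")
print(f"matching on representation: {matched:.3f}")
```

In this simulation the propensity score depends on the 50 covariates only through the two hidden units, so matching in two dimensions stands in for matching on all covariates. With a learned network, that property would hold only if the propensity model is correctly specified, which is the requirement stated above.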

We address this limitation in the second contribution, where we upper-bound the bias induced by adjusting for a representation instead of the original covariates using a differentiable loss that builds on prior work in density ratio regression. As a result, a measure of misspecification of the representation, which is added to the representation imbalance in the earlier bound, can be targeted directly. We also extend the analysis from matching to a more general weighting framework.

In a third contribution, we address the problem of poor overlap, which is typical of high-dimensional covariates. We quantify the degree of overlap of a representation using a specific “overlap divergence” measure, and justify doing so by connecting it to the variance of estimators adjusting for the representation. We establish that outcome information is required to find a representation improving overlap, which stands in contrast to the two previous contributions. In a simplified setting with Gaussian covariates and generalised linear models for the outcome model and the propensity score model, we further strengthen the result: we describe a class of representations without confounding bias where the more predictive of the outcome the representation is, the better its overlap.

We conclude with a summary of the contributions, key findings and limitations of all methods.

Cite this record

APA Style

Clivio, O. (2025). Towards representation learning for treatment effect estimation with high-dimensional covariates [PhD thesis]. University of Oxford.

MLA Style

Clivio, O. Towards Representation Learning for Treatment Effect Estimation with High-Dimensional Covariates. 2025. University of Oxford, PhD thesis.

Chicago Style

Clivio, O. 2025. “Towards Representation Learning for Treatment Effect Estimation with High-Dimensional Covariates.” PhD thesis, University of Oxford.

Access Document

Files:: Clivio_2025_Towards_representation_learning.pdf

(Preview, Dissemination version, pdf, 1.9MB, Terms of use)

Authors

+ Clivio, O

Institution:: University of Oxford
Division:: MPLS
Department:: Statistics
Oxford college:: St Peter's College
Role:: Author
ORCID:: 0000-0001-8668-4535

Contributors

+ Holmes, C

Institution:: University of Oxford
Division:: MPLS
Department:: Statistics
Role:: Supervisor

+ Engineering and Physical Sciences Research Council

Funder identifier:: https://ror.org/0439y7842
Grant:: EP/S023151/1
Programme:: EPSRC Centre for Doctoral Training in Modern Statistics and Statistical Machine Learning

+ Novo Nordisk (Denmark)

Funder identifier:: https://ror.org/0435rc536
Programme:: Joint Initiative for Causal Inference

DOI:: 10.5287/ora-vjeeyo7pd
Type of award:: DPhil
Level of award:: Doctoral
Awarding institution:: University of Oxford

Language:: English
Keywords:: treatment effect estimation, causal inference, representation learning, high dimensions
Subjects:: Statistics, Machine Learning
Deposit date:: 2026-06-08
ARK identifier:: ark:/29072/ora_be0afbb611b34faab0ab9878ec6def09

Terms of use

Copyright holder:: Oscar Clivio

Licence:: Terms and Conditions of Use for Oxford University Research Archive
