Extensions of the Synthetic Control Method to Multiple Dimensions
May 13, 2026
PAER-2026-22
Yuansen Li, PhD Student
Abstract
The synthetic control method (SCM) is a widely used causal inference approach for panel data settings in which only a single unit is exposed to treatment. It reframes the problem of causal inference as one of causal prediction. The canonical SCM was originally designed for outcomes that lie in Euclidean space, where the outcome in each time period is a scalar observed over a consistent time series. This dissertation relaxes these restrictions and applies the synthetic control framework to a broader class of data structures. The first essay applies the SCM to DMSP nighttime lights data. These data are visible-light measurements collected at night by multiple overlapping satellites with relatively short lifespans. As a consequence, the outcome is not observed as a consistent time series, which places it outside the setting of the canonical SCM. Once incorporated into the SCM optimization problem, however, the time series are treated as cross-sectional observations. We leverage this feature to extend the canonical SCM framework, enabling the prediction of nighttime lights measurements for satellites that were not operational in the pre-treatment period. The second essay extends the synthetic control framework to settings beyond Euclidean space. I develop a novel, data-driven method, synthetic mapping, to estimate individual treatment effects in the classical synthetic control framework with disaggregated data. Synthetic mapping draws inspiration from digital image processing and views the dynamics of spatiotemporal data as a sequence of images. Wasserstein geometry is then used to construct counterfactual images for predicting spatiotemporal data. Intuitively, the second essay considers settings in which the outcome in each time period is a two-dimensional image. The third essay studies distributional outcomes using a recently developed distributional synthetic control method. It analyzes the long-run and heterogeneous effects of GE maize adoption on time-conditional maize yield distributions in five major maize-producing provinces in China. Put differently, the third essay considers settings in which the outcome in each time period is a univariate random variable. The first two essays focus primarily on methodological developments, whereas the third emphasizes empirical application. Overall, the dissertation extends the synthetic control framework to settings involving more complex data structures and multiple dimensions.
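For readers unfamiliar with the canonical setting that the essays extend, the following is a minimal sketch of the standard SCM weight problem: choose nonnegative donor weights that sum to one so that the weighted donor outcomes match the treated unit's pre-treatment outcomes as closely as possible. It is an illustration under stated assumptions, not the estimator used in the dissertation: it fits pre-treatment outcomes only, with equal weight on every period and no predictor-weighting matrix, and it solves the problem by projected gradient descent. The names `project_to_simplex` and `scm_weights` and the toy data are hypothetical. Each pre-treatment period enters the objective simply as one more matching target, which is the sense in which the time series act as cross-sectional observations.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        # The condition holds for a prefix of k; keep the last k satisfying it.
        if u_k - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def scm_weights(treated_pre, donors_pre, steps=5000):
    """Minimize sum_t (treated_pre[t] - sum_j w[j] * donors_pre[j][t])**2
    over weights w on the unit simplex, by projected gradient descent.

    treated_pre: list of T pre-treatment outcomes for the treated unit.
    donors_pre:  list of J lists, each holding T pre-treatment outcomes.
    """
    n_donors = len(donors_pre)
    n_periods = len(treated_pre)
    w = [1.0 / n_donors] * n_donors  # start from equal weights

    # Upper bound on the gradient's Lipschitz constant (2 * squared Frobenius norm).
    lipschitz = 2.0 * sum(x * x for donor in donors_pre for x in donor)
    if lipschitz == 0.0:
        return w  # all donor outcomes are zero; any simplex point is optimal
    step = 1.0 / lipschitz

    for _ in range(steps):
        fitted = [sum(w[j] * donors_pre[j][t] for j in range(n_donors))
                  for t in range(n_periods)]
        resid = [fitted[t] - treated_pre[t] for t in range(n_periods)]
        grad = [2.0 * sum(resid[t] * donors_pre[j][t] for t in range(n_periods))
                for j in range(n_donors)]
        w = project_to_simplex([w[j] - step * grad[j] for j in range(n_donors)])
    return w


# Toy data (made up for illustration): the treated unit is constructed as
# 0.3 * donor 1 + 0.7 * donor 2, so those weights fit it exactly.
donors = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 1.0, 4.0, 3.0],
    [5.0, 5.0, 5.0, 5.0],
]
treated = [0.3 * a + 0.7 * b for a, b in zip(donors[0], donors[1])]

weights = scm_weights(treated, donors)
print([round(x, 4) for x in weights])
```

In applied work the same weights are then applied to the donors' post-treatment outcomes to form the counterfactual path for the treated unit. A constrained quadratic programming solver would typically replace the hand-written loop here; the loop is used only to keep the sketch free of external dependencies.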