Room 0.04 EEG & Online
Online via E-mail
Abstract:
Estimating the causal impact of a sudden shock or policy change on affected areas requires comparing the post-treatment evolution of the outcome of interest with a counterfactual estimate of its trend in a no-treatment scenario. This is rarely an easy task, and it becomes even harder when no control group is available, i.e., when there are no “untreated” areas but only “treated” ones, and the treatment is simultaneous rather than staggered. We propose the machine learning control method (MLCM), which provides a solution to this task for time-series cross-sectional (TSCS) data. Using flexible machine learning routines, we forecast what would have happened to each area in the absence of the sudden shock. Accurate algorithmic forecasting is made possible by exploiting many pre-treatment variables collected over several periods for all areas. These forecasts act as area-specific counterfactuals in the no-treatment, “business-as-usual” scenario. One can then readily retrieve individual, time-varying treatment effects as the difference, for each area, between the observed outcome and the ML-generated forecast. Importantly, we show that our approach outperforms simpler statistical approaches in terms of predictive accuracy and is robust to endogeneity concerns and contamination bias. The generalized COVID-19 shock provides an ideal setting in which to put the MLCM into practice: we present three applications on the disaggregated local impacts of COVID-19 in Italy, concerning mortality, economic, and relocation outcomes. Lastly, we describe future challenges in boosting the predictive performance and broadening the applicability of the MLCM.
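The core step described in the abstract (forecast each area's no-treatment outcome from pre-treatment data, then subtract the forecast from the observed outcome) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the function names `knn_predict` and `mlcm_effects` are hypothetical, a k-nearest-neighbours regressor stands in for the unspecified "flexible machine learning routines", and lagged outcomes stand in for the many pre-treatment variables the abstract mentions.

```python
from statistics import mean


def knn_predict(train_X, train_y, x, k=3):
    """Stand-in learner (assumption): average the targets of the k
    training rows closest to x in squared Euclidean distance."""
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    ranked = sorted(zip(train_X, train_y), key=lambda row: sq_dist(row[0], x))
    return mean(y for _, y in ranked[:k])


def mlcm_effects(outcomes, t0, n_lags=2, k=3):
    """Hypothetical helper: area-level effects at treatment period t0.

    outcomes maps each area to its outcome series; periods before index
    t0 are pre-treatment, and index t0 is the first treated period.
    """
    # Build the training set from pre-treatment periods only, pooling all areas.
    train_X, train_y = [], []
    for series in outcomes.values():
        for t in range(n_lags, t0):
            train_X.append(series[t - n_lags:t])  # lagged outcomes as predictors
            train_y.append(series[t])

    effects = {}
    for area, series in outcomes.items():
        # Forecast the "business-as-usual" outcome from the last pre-treatment lags.
        counterfactual = knn_predict(train_X, train_y, series[t0 - n_lags:t0], k)
        # Treatment effect = observed outcome minus forecast counterfactual.
        effects[area] = series[t0] - counterfactual
    return effects


# Toy panel (made-up numbers): area A jumps at t0 = 4, area B does not.
toy = {"A": [1.0, 1.0, 1.0, 1.0, 5.0], "B": [2.0, 2.0, 2.0, 2.0, 2.0]}
effects = mlcm_effects(toy, t0=4, n_lags=2, k=2)
print(effects)
```

Because the learner is trained only on pre-treatment periods, the forecast does not use any post-shock information, which is what allows it to serve as a counterfactual even when every area is treated at the same time.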