Speaker Bio
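Yuhao Wang (王禹皓) is an Assistant Professor at the Institute for Interdisciplinary Information Sciences, Tsinghua University. Professor Wang received a bachelor's degree from the Department of Automation at Tsinghua University, then pursued a Ph.D. in the Department of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology, working in the Laboratory for Information and Decision Systems (LIDS). Before joining Tsinghua University, Professor Wang was a postdoctoral researcher at the Statistical Laboratory, University of Cambridge. Professor Wang's current research interests center on causal inference, experimental design, high-dimensional statistics, and distribution-free testing. Professor Wang has published multiple papers in leading statistics journals, including The Annals of Statistics, JRSSB, Biometrika, and Bernoulli, as well as at leading machine learning and artificial intelligence conferences such as NeurIPS. Professor Wang was also named to the Forbes China 30 Under 30 list for 2021 in the Science and Healthcare category.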
Abstract
We study the identification and estimation of long-term treatment effects when both experimental and observational data are available. Since the long-term outcome is observed only after a long delay, it is not measured in the experimental data, but only recorded in the observational data. However, both types of data include observations of some short-term outcomes. In this paper, we uniquely tackle the challenge of persistent unmeasured confounders, i.e., some unmeasured confounders that can simultaneously affect the treatment, short-term outcomes and the long-term outcome, noting that they invalidate identification strategies in previous literature. To address this challenge, we exploit the sequential structure of multiple short-term outcomes, and develop three novel identification strategies for the average long-term treatment effect. We further propose three corresponding estimators and prove their asymptotic consistency and asymptotic normality. We finally apply our methods to estimate the effect of a job training program on long-term employment using semi-synthetic data. We numerically show that our proposals outperform existing methods that fail to handle persistent confounders.