Paper Title
Using Experiments to Correct for Selection in Observational Studies
Paper Authors
Paper Abstract
Researchers increasingly have access to two types of data: (i) large observational datasets where treatment (e.g., class size) is not randomized but several primary outcomes (e.g., graduation rates) and secondary outcomes (e.g., test scores) are observed and (ii) experimental data in which treatment is randomized but only secondary outcomes are observed. We develop a new method to estimate treatment effects on primary outcomes in such settings. We use the difference between the secondary outcome and its predicted value based on the experimental treatment effect to measure selection bias in the observational data. Controlling for this estimate of selection bias yields an unbiased estimate of the treatment effect on the primary outcome under a new assumption that we term latent unconfoundedness, which requires that the same confounders affect the primary and secondary outcomes. Latent unconfoundedness weakens the assumptions underlying commonly used surrogate estimators. We apply our estimator to identify the effect of third-grade class size on student outcomes. Estimated impacts on test scores using OLS regressions in observational school district data have the opposite sign of estimates from the Tennessee STAR experiment. In contrast, selection-corrected estimates in the observational data replicate the experimental estimates. Our estimator reveals that reducing class sizes by 25% increases high school graduation rates by 0.7 percentage points. Controlling for observables does not change the OLS estimates, demonstrating that experimental selection correction can remove biases that cannot be addressed with standard controls.
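The selection-correction idea in the abstract can be illustrated with a small simulation. This is only a minimal sketch under strong simplifying assumptions that are not taken from the paper: all relationships are linear, a single latent confounder `u` drives both outcomes, and the secondary outcome contains no noise beyond `u`. The effect sizes (`TAU_S`, `TAU_Y`, `GAMMA`), the helper `ols`, and all variable names are hypothetical.

```python
import numpy as np

def ols(y, *regressors):
    """Return OLS coefficients (intercept first) of y on the given regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 50_000
# Hypothetical true effects, chosen only for illustration.
TAU_S, TAU_Y, GAMMA = -0.5, -0.2, 1.0

# Experimental sample: treatment is randomized, only the secondary outcome is seen.
w_exp = rng.normal(size=n)
s_exp = TAU_S * w_exp + rng.normal(size=n)
tau_s_hat = ols(s_exp, w_exp)[1]  # experimental effect on the secondary outcome

# Observational sample: treatment depends on the latent confounder u,
# which affects both the secondary (s) and the primary (y) outcome.
u = rng.normal(size=n)
w_obs = 0.8 * u + rng.normal(size=n)
s_obs = TAU_S * w_obs + u
y_obs = TAU_Y * w_obs + GAMMA * u + rng.normal(size=n)

# Naive OLS of the primary outcome on treatment is confounded by u.
naive = ols(y_obs, w_obs)[1]

# Selection term: secondary outcome minus its prediction from the
# experimental treatment effect.
selection = s_obs - tau_s_hat * w_obs

# Controlling for the selection term recovers the effect on the primary outcome.
corrected = ols(y_obs, w_obs, selection)[1]

print(f"experimental effect on secondary outcome: {tau_s_hat:.3f}")
print(f"naive OLS effect on primary outcome:      {naive:.3f}")
print(f"selection-corrected effect:               {corrected:.3f}")
```

In this setup the naive estimate has the opposite sign of the true effect because treatment is positively correlated with the confounder, while the corrected estimate lands close to `TAU_Y`. If the secondary outcome carried additional noise unrelated to the primary outcome, the selection term would only be a noisy proxy for the confounder and this simple regression would no longer fully remove the bias.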