Paper title
Inference in experiments conditional on observed imbalances in covariates
Paper authors
Paper abstract
Double-blind randomized controlled trials are traditionally seen as the gold standard for causal inference because the difference-in-means estimator is an unbiased estimator of the average treatment effect in the experiment. That this estimator is unbiased over all possible randomizations does not, however, mean that any given estimate is close to the true treatment effect. Similarly, while pre-determined covariates will be balanced between treatment and control groups on average, large imbalances may be observed in a given experiment, and the researcher may therefore want to condition on such covariates using linear regression. This paper studies the theoretical properties of both the difference-in-means and OLS estimators conditional on observed differences in covariates. By deriving the statistical properties of the conditional estimators, we establish guidance for how to deal with covariate imbalances. We study inference both with OLS and with a new version of Fisher's exact test, in which the randomization distribution comes from a small subset of all possible assignment vectors.
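The abstract does not spell out how the restricted randomization distribution is constructed, so the following is only a minimal sketch of one possible reading, not the authors' procedure. It assumes a single covariate, a two-sided difference-in-means test statistic, and a subset defined as "re-randomizations whose covariate imbalance lies within a tolerance of the observed imbalance". The function names, `tol`, `n_draws`, and the Monte Carlo sampling of assignment vectors are all hypothetical choices made for illustration.

```python
import numpy as np


def diff_in_means(values, w):
    """Mean of `values` among treated (w == 1) minus mean among controls (w == 0)."""
    return values[w == 1].mean() - values[w == 0].mean()


def conditional_randomization_test(y, w, x, tol=0.1, n_draws=5000, seed=0):
    """Randomization p-value under the sharp null of no treatment effect.

    Only re-randomized assignment vectors whose covariate imbalance is
    within `tol` of the observed imbalance enter the reference distribution.
    `tol` and `n_draws` are illustrative assumptions, not values from the paper.
    Returns (p_value, number_of_assignment_vectors_kept).
    """
    rng = np.random.default_rng(seed)
    obs_stat = diff_in_means(y, w)
    obs_imbalance = diff_in_means(x, w)

    kept_stats = []
    for _ in range(n_draws):
        # Permuting w keeps the number of treated units fixed.
        w_star = rng.permutation(w)
        if abs(diff_in_means(x, w_star) - obs_imbalance) <= tol:
            # Under the sharp null, outcomes do not change with assignment.
            kept_stats.append(diff_in_means(y, w_star))

    kept_stats = np.asarray(kept_stats)
    if kept_stats.size == 0:
        raise ValueError("no sampled assignment matched the observed imbalance; increase tol or n_draws")

    # Count the observed assignment itself so the Monte Carlo p-value is never zero.
    n_extreme = np.sum(np.abs(kept_stats) >= abs(obs_stat))
    p_value = (1 + n_extreme) / (1 + kept_stats.size)
    return p_value, kept_stats.size
```

A call such as `conditional_randomization_test(y, w, x)` takes NumPy arrays of outcomes, 0/1 treatment indicators, and covariate values of equal length. Comparing its p-value with that of an unrestricted permutation test (a very large `tol`) gives a rough sense of how much conditioning on the observed imbalance changes the reference distribution.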