Lectures on Inference from Experiments

Speaker: Per Johansson
Time: Fridays, 9:50-11:25, on 2019-12-20, 2019-12-27, and 2020-1-3
Venue: Lecture Hall, 3rd Floor, Jin Chun Yuan West Building, Tsinghua University

Description

1. Inference from randomized experiments: The relationship between Fisher and Neyman-Pearson

Abstract: The difference between the Fisher and Neyman approaches to inference for two-sample experiments has been discussed and studied by several scholars since the 1930s. Lehmann (1959) showed that, under certain conditions, the two tests are asymptotically equivalent under random sampling from a super population. Recently, Ding (2017) discussed the asymptotic properties for inference to the units within a sample, and showed that using the normal approximation of the Fisher test under the sharp null hypothesis gives poor power under the alternative compared with the Neyman test. In this paper we compare Fisher and Neyman using an exact version of the Fisher test in Monte Carlo simulations. We conclude that Lehmann's results also apply to inference within samples, with a slight indication that Fisher has better power properties than Neyman for small samples, even under normality.
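
As a rough illustration of the two procedures compared in this abstract, the following Python sketch computes an exact Fisher randomization p-value by enumerating every possible assignment, alongside a Neyman-style p-value based on the normal approximation with the usual conservative variance estimate. The function names and the toy data are assumptions made for illustration; they are not taken from the paper.

```python
import itertools
import math
import statistics


def diff_in_means(y, treated):
    """Difference in mean outcome between treated and control units."""
    t = [y[i] for i in range(len(y)) if i in treated]
    c = [y[i] for i in range(len(y)) if i not in treated]
    return statistics.fmean(t) - statistics.fmean(c)


def fisher_exact_p(y, treated):
    """Two-sided exact randomization p-value under the sharp null.

    Under the sharp null of no effect for any unit, the observed outcomes
    are fixed, so the statistic can be recomputed for every assignment.
    Enumeration is only feasible for small samples.
    """
    n, n1 = len(y), len(treated)
    observed = abs(diff_in_means(y, set(treated)))
    extreme = total = 0
    for assignment in itertools.combinations(range(n), n1):
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(diff_in_means(y, set(assignment))) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def neyman_p(y, treated):
    """Two-sided p-value from the normal approximation (Neyman-style test).

    Uses the conservative variance estimate s1^2/n1 + s0^2/n0.
    """
    treated = set(treated)
    t = [y[i] for i in range(len(y)) if i in treated]
    c = [y[i] for i in range(len(y)) if i not in treated]
    se = math.sqrt(statistics.variance(t) / len(t)
                   + statistics.variance(c) / len(c))
    z = diff_in_means(y, treated) / se
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))


# Hypothetical toy data: 8 units, the last 4 treated.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
treated = [4, 5, 6, 7]
print(fisher_exact_p(y, treated), neyman_p(y, treated))
```

With only 8 units there are 70 possible assignments, so the exact p-value cannot fall below 2/70, whereas the normal approximation has no such floor. This is one reason the two tests can behave differently in small samples.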

2. Rerandomization strategies for balancing covariates using pre-experimental longitudinal data

Abstract: This paper considers experimental design based on the strategy of rerandomization to increase the efficiency of experiments. Two aspects of rerandomization are addressed. First, we propose a two-stage allocation sample scheme for randomization inference to the units in balanced experiments that guarantees that the difference-in-means estimator is an unbiased estimator of the sample average treatment effect for any experiment, preserves the exactness of randomization inference, and halves the time consumption of the rerandomization design. Second, we propose a rank-based covariate-balance measure that can take into account the estimated relative weight of each covariate. Several strategies for estimating these weights using pre-experimental data are proposed. Using Monte Carlo simulations, the proposed strategies are compared with complete randomization and Mahalanobis-based rerandomization. An empirical example is given in which the power of a mean-difference test of electricity consumption for 54 households is increased by 99%, in comparison with complete randomization, using one of the proposed designs based on high-frequency longitudinal electricity consumption data.
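
For readers unfamiliar with rerandomization, the sketch below shows the basic accept/reject loop using the Mahalanobis criterion that the abstract names as a comparison baseline. It is not the two-stage scheme or the rank-based measure proposed in the paper. To stay within the Python standard library it assumes a single covariate, in which case the Mahalanobis distance reduces to a scaled squared standardized mean difference. The function names, the threshold, and the toy covariate are hypothetical.

```python
import random
import statistics


def mahalanobis_1d(x, treated):
    """Mahalanobis balance measure for one covariate.

    M = (n1 * n0 / n) * d^2 / s^2, where d is the treated-minus-control
    difference in covariate means and s^2 is the sample variance of x.
    """
    n, n1 = len(x), len(treated)
    n0 = n - n1
    t = [x[i] for i in range(n) if i in treated]
    c = [x[i] for i in range(n) if i not in treated]
    d = statistics.fmean(t) - statistics.fmean(c)
    return (n1 * n0 / n) * d * d / statistics.variance(x)


def rerandomize(x, n1, threshold, rng, max_draws=100_000):
    """Draw complete randomizations until the balance criterion is met.

    Returns the accepted set of treated indices and the number of draws.
    """
    units = range(len(x))
    for draw in range(1, max_draws + 1):
        treated = set(rng.sample(units, n1))
        if mahalanobis_1d(x, treated) <= threshold:
            return treated, draw
    raise RuntimeError("no acceptable assignment found; loosen the threshold")


# Hypothetical pre-experimental covariate for 10 units, 5 to be treated.
x = [float(i) for i in range(10)]
treated, draws = rerandomize(x, n1=5, threshold=0.1, rng=random.Random(0))
print(sorted(treated), draws)
```

A stricter threshold gives better covariate balance but requires more draws on average, which is the time cost that motivates faster allocation schemes.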

3. On the relation between stratified randomization and rerandomization

Abstract: Rerandomization is a strategy for improving balance on observed covariates in experiments. It was proposed as a complement to traditional blocked designs, and several scholars recommend that researchers "block on what you can and rerandomize on what you cannot". However, the relationship and differences between blocking, rerandomization, and the combination of the two have not previously been investigated. In this paper, we show that block designs can be recreated by rerandomization, and explain why, in most situations, blocking on binary covariates followed by rerandomization on continuous covariates is more efficient than rerandomization on all covariates at once.
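
The combined design described in this abstract can be sketched as follows: randomize within the blocks defined by a binary covariate, then accept the assignment only if a continuous covariate is sufficiently balanced. This is a minimal sketch under stated assumptions, not the paper's procedure. The balance criterion (absolute difference in means), the tolerance, the function names, and the data are all hypothetical.

```python
import random
import statistics


def blocked_assignment(binary, rng):
    """Treat half of the units (rounded down) within each level of a binary covariate."""
    treated = set()
    for level in (0, 1):
        block = [i for i, b in enumerate(binary) if b == level]
        treated.update(rng.sample(block, len(block) // 2))
    return treated


def mean_diff(x, treated):
    """Treated-minus-control difference in means of a continuous covariate."""
    t = [x[i] for i in range(len(x)) if i in treated]
    c = [x[i] for i in range(len(x)) if i not in treated]
    return statistics.fmean(t) - statistics.fmean(c)


def block_then_rerandomize(binary, x, tolerance, rng, max_draws=100_000):
    """Blocked randomization, redrawn until |mean difference in x| <= tolerance."""
    for _ in range(max_draws):
        treated = blocked_assignment(binary, rng)
        if abs(mean_diff(x, treated)) <= tolerance:
            return treated
    raise RuntimeError("no acceptable assignment found; loosen the tolerance")


# Hypothetical data: a binary covariate (blocks of size 4 and 6) and a
# continuous covariate for 10 units.
binary = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
treated = block_then_rerandomize(binary, x, tolerance=0.5, rng=random.Random(1))
print(sorted(treated))
```

Because every candidate assignment is already exactly balanced on the binary covariate, the rejection step only has to search for balance on the continuous one.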