- TL;DR: An inference procedure for high-dimensional linear models with missing response data
- Abstract: In this paper, we consider how to conduct statistical inference in a high-dimensional linear model where the response variable has missing values. The missingness mechanism, although usually regarded as a nuisance, is largely unknown and difficult to specify. Motivated by this, we adopt a conditional likelihood approach so that the nuisance can be ignored entirely in our procedure. We establish the asymptotic theory of the proposed estimator and develop an easy-to-implement algorithm via a data manipulation strategy. Furthermore, we propose a data perturbation method for variance estimation. The proposed methodology has broad potential for application to patient-reported outcomes and electronic health records. Although space does not permit us to present numerical results in this four-page extended abstract, we will present them at the workshop if the paper is selected.
- Keywords: High-Dimensionality, Missing Data, Missingness Mechanism, Regularization, Variable Selection, Post-Selection Inference