Abstract: Linear mixed models (LMMs) model dependencies among observations in
linear regression and are used extensively in many application areas. Parameter estimation for
LMMs can be computationally prohibitive on big data: state-of-the-art learning algorithms
have computational complexity that depends at least linearly on the dimension p of the
covariates, and they often rely on heuristics that lack theoretical guarantees. We present scalable
algorithms for learning high-dimensional LMMs whose computational complexity depends only
sublinearly on p. Key to our approach are novel dual estimators that use only kernel functions
of the data, together with fast computational techniques based on the subsampled randomized Hadamard
transform. We provide theoretical guarantees for our learning algorithms, demonstrating the
robustness of parameter estimation. Finally, we complement the theory with experiments on
large synthetic and real data.
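
To make the SRHT primitive concrete, below is a minimal NumPy sketch of a subsampled randomized Hadamard transform applied to the rows of a data matrix. The function names, the zero-padding to a power of two, and the sketch size are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def fwht(a):
    """In-place fast Walsh-Hadamard transform along axis 0.
    The length along axis 0 must be a power of two."""
    n = a.shape[0]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            x = a[i:i + h].copy()
            y = a[i + h:i + 2 * h].copy()
            a[i:i + h] = x + y
            a[i + h:i + 2 * h] = x - y
        h *= 2
    return a

def srht_sketch(X, s, rng):
    """Return an s-by-d row sketch S @ X with S = sqrt(m/s) * R H D:
    D a random +/-1 diagonal, H the normalized Hadamard matrix, and
    R a uniform row subsampler. Rows are zero-padded to m, the next
    power of two (an illustrative choice, not the paper's)."""
    n, d = X.shape
    m = 1 << (n - 1).bit_length()               # next power of two >= n
    Xp = np.zeros((m, d))
    Xp[:n] = X
    signs = rng.choice([-1.0, 1.0], size=m)     # diagonal D
    Z = fwht(signs[:, None] * Xp) / np.sqrt(m)  # normalized H D X
    rows = rng.choice(m, size=s, replace=False) # subsampling R
    return np.sqrt(m / s) * Z[rows]

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 50))
SX = srht_sketch(X, s=128, rng=rng)
# The sketch approximately preserves the Gram matrix: SX.T @ SX ≈ X.T @ X
```

Because H and D are orthogonal and applied via the fast transform, the sketch costs O(m d log m) rather than the O(m d s) of a dense Gaussian projection, which is what makes SRHT attractive for fast kernel computations.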