Keywords: Nearest neighbor algorithms, matrix completion, Python package, recommendation systems, personalized health, causal inference, panel data, LLM evaluation, general ML
TL;DR: We introduce a unified Python package and testbed that consolidates a broad class of nearest neighbor-based matrix completion methods.
Abstract: Nearest neighbor (NN) methods have re-emerged as competitive tools for matrix completion, offering strong empirical performance and recent theoretical guarantees, including entry-wise error bounds, confidence intervals, and minimax optimality. Despite their simplicity, NN approaches have been shown in recent work to be robust to a range of missingness patterns and effective across diverse applications.
This paper introduces **N**$^2$, a unified Python package and testbed that consolidates a broad class of NN-based methods through a modular, extensible interface. Built for both researchers and practitioners, **N**$^2$ supports rapid experimentation and benchmarking. Using this framework, we introduce a new NN variant that achieves state-of-the-art results in several settings. We also release a benchmark suite of real-world datasets, spanning healthcare, recommender systems, causal inference, and LLM evaluation, designed to stress-test matrix completion methods beyond synthetic scenarios. Our experiments demonstrate that while classical methods excel on idealized data, NN-based techniques consistently outperform them in real-world settings.
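For readers new to the idea, the following is a minimal sketch of a plain row-wise NN estimator for a single missing entry. It is an illustration of the general technique only and is not the **N**$^2$ package's interface: the function name `row_nn_impute`, the boolean `mask` convention, and the distance threshold `eta` are assumptions made for this example.

```python
import numpy as np


def row_nn_impute(X, mask, i, j, eta):
    """Estimate entry (i, j) by averaging column j over rows close to row i.

    X    : 2-D float array; values at unobserved positions are ignored.
    mask : 2-D bool array, True where X is observed.
    eta  : hypothetical distance threshold that defines a "neighbor".
    """
    neighbor_values = []
    for u in range(X.shape[0]):
        # A candidate row must differ from i and must have column j observed.
        if u == i or not mask[u, j]:
            continue
        # Compare rows only on columns observed in both, excluding column j.
        shared = mask[i] & mask[u]
        shared[j] = False
        if not shared.any():
            continue
        # Mean squared difference over the shared columns.
        dist = np.mean((X[i, shared] - X[u, shared]) ** 2)
        if dist <= eta:
            neighbor_values.append(X[u, j])
    # No neighbor within eta: return NaN rather than guess.
    return float(np.mean(neighbor_values)) if neighbor_values else float("nan")


# Toy example: entry (0, 2) is missing; row 1 matches row 0 on the shared columns.
X = np.array([[1.0, 2.0, 0.0],
              [1.0, 2.0, 5.0],
              [9.0, 9.0, 9.0]])
mask = np.ones_like(X, dtype=bool)
mask[0, 2] = False
estimate = row_nn_impute(X, mask, i=0, j=2, eta=1.0)
```

In this sketch a small `eta` keeps only very similar rows (lower bias, fewer neighbors to average), while a large `eta` averages over more rows. In practice the threshold would typically be tuned, for example by cross-validation on observed entries.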
Submission Number: 19