Learning from Uncertain Data: From Possible Worlds to Possible Models

Jiongli Zhu; Su Feng; Boris Glavic; Babak Salimi

Published: 25 Sept 2024, Last Modified: 06 Nov 2024

Venue: NeurIPS 2024 poster

License: CC BY 4.0

Keywords: data uncertainty, robustness verification, predictive multiplicity, abstract interpretation, linear regression

Abstract: We introduce an efficient method for learning linear models from uncertain data, where uncertainty is represented as a set of possible variations in the data, leading to predictive multiplicity. Our approach leverages abstract interpretation and zonotopes, a type of convex polytope, to compactly represent these dataset variations, enabling the symbolic execution of gradient descent on all possible worlds simultaneously. We develop techniques to ensure that this process converges to a fixed point and derive closed-form solutions for this fixed point. Our method provides sound over-approximations of all possible optimal models and viable prediction ranges. We demonstrate the effectiveness of our approach through theoretical and empirical analysis, highlighting its potential to reason about model and prediction uncertainty due to data quality issues in training data.
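
To make the idea of prediction ranges over possible worlds concrete, here is a minimal sketch of a much simpler special case than the one the abstract describes. It assumes that only the labels are uncertain, that each label lies in an interval (a box, which is the simplest zonotope), and that the model is one-dimensional ordinary least squares solved in closed form. Under those assumptions the prediction is a linear function of the labels, so the exact prediction range follows from the interval centers and radii. This is not the authors' method, which symbolically executes gradient descent; all function names and data values are hypothetical.

```python
from itertools import product


def prediction_weights(xs, x0):
    """Return weights h such that the OLS prediction at x0 equals sum(h[i] * y[i])."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return [1.0 / n + (x0 - x_bar) * (x - x_bar) / sxx for x in xs]


def prediction_range(xs, y_center, y_radius, x0):
    """Exact range of OLS predictions at x0 when y[i] lies in center[i] +/- radius[i]."""
    h = prediction_weights(xs, x0)
    center = sum(hi * ci for hi, ci in zip(h, y_center))
    # A linear map sends the box of labels to an interval whose radius is
    # the sum of |weight| * radius over all uncertain labels.
    radius = sum(abs(hi) * ri for hi, ri in zip(h, y_radius))
    return center - radius, center + radius


def ols_predict(xs, ys, x0):
    """Fit slope and intercept on one concrete world and predict at x0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar + slope * (x0 - x_bar)


# Hypothetical data: two of the four labels are uncertain.
xs = [0.0, 1.0, 2.0, 3.0]
y_center = [0.1, 0.9, 2.1, 2.9]
y_radius = [0.0, 0.5, 0.0, 0.2]
x0 = 4.0

lo, hi = prediction_range(xs, y_center, y_radius, x0)

# Brute-force check: train one model per corner of the label box.
corner_preds = [
    ols_predict(xs, [c + s * r for c, s, r in zip(y_center, signs, y_radius)], x0)
    for signs in product((-1.0, 1.0), repeat=len(xs))
]
print(f"symbolic range: [{lo:.4f}, {hi:.4f}]")
print(f"corner worlds:  [{min(corner_preds):.4f}, {max(corner_preds):.4f}]")
```

In this special case the symbolic range and the brute-force range agree, because a linear function over a box attains its extremes at corners. Once features are also uncertain the optimal model is no longer linear in the data, which is the kind of setting where a sound over-approximation, as described in the abstract, is needed instead of an exact range.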

Primary Area: Machine learning for other sciences and fields

Submission Number: 13744
