Abstract: Active learning methods aim to improve sample complexity in machine learning. In this work, we investigate an active learning scheme via a novel gradient-free cutting-plane training method for ReLU networks of arbitrary depth and develop a convergence theory.
We demonstrate, for the first time, that cutting-plane algorithms, traditionally used in linear models, can be extended to deep neural networks despite their nonconvexity and nonlinear decision boundaries. Moreover, this training method induces the first deep active learning scheme known to achieve convergence guarantees, revealing a geometric contraction rate of the feasible set. We validate the effectiveness of our proposed active learning method against popular deep active learning baselines on both synthetic data experiments and a sentiment classification task on real datasets.
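To make the cutting-plane idea concrete, below is a minimal, illustrative sketch of cutting-plane active learning for a simple *linear* separator, not the paper's deep-network method. It assumes linearly separable data labeled by a hidden direction, maintains a polytope of weight vectors consistent with the queried labels, queries the pool point about which the current center is least certain, and adds one halfspace cut per label. The function names (`chebyshev_center`, `query_and_cut`) are hypothetical and chosen for this sketch only.

```python
# Illustrative sketch only (assumptions: linear model, separable data through
# the origin); the polytope center is computed as a Chebyshev center via an LP.
import numpy as np
from scipy.optimize import linprog

def chebyshev_center(A, b):
    """Center of the largest ball inside {w : A w <= b}, via a linear program."""
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    c = np.zeros(A.shape[1] + 1); c[-1] = -1.0       # variables (w, r); maximize r
    A_ub = np.hstack([A, norms])
    res = linprog(c, A_ub=A_ub, b_ub=b,
                  bounds=[(None, None)] * A.shape[1] + [(0, None)])
    return res.x[:-1]

def query_and_cut(pool, labels, n_queries=10, dim=2):
    # Initial feasible set: the box [-1, 1]^dim.
    A = np.vstack([np.eye(dim), -np.eye(dim)])
    b = np.ones(2 * dim)
    queried = []
    for _ in range(n_queries):
        w = chebyshev_center(A, b)
        # Query the unlabeled point the current center is least certain about.
        margins = np.abs(pool @ w)
        margins[queried] = np.inf
        i = int(np.argmin(margins))
        queried.append(i)
        y = labels[i]                     # oracle label
        # Cut: keep only weights with y * <w, x_i> >= 0.
        A = np.vstack([A, -y * pool[i]])
        b = np.append(b, 0.0)
    return chebyshev_center(A, b), queried

# Toy usage: 2-D data labeled by a hidden direction.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.sign(X @ np.array([1.0, -0.5]))
w_hat, picks = query_and_cut(X, y, n_queries=15)
print("recovered direction:", w_hat / np.linalg.norm(w_hat))
```

Each cut removes the part of the feasible set inconsistent with the new label, which is the mechanism behind the geometric contraction the abstract refers to; extending this beyond linear models to deep ReLU networks is the paper's contribution.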
Lay Summary: When training AI systems, it's often expensive and time-consuming to collect labeled data. Active learning helps by letting the model choose which examples it wants to learn from, reducing the number of labels needed. In this work, we propose a new way to train deep neural networks that doesn't rely on gradients—the usual way most models learn—but instead uses a method inspired by cutting away infeasible answers (a strategy known as cutting-plane optimization) until only the best ones remain. While this approach has long been used for simpler models, we are the first to show it can work for deep networks too. Even more exciting, our method is the first of its kind to offer theoretical guarantees: we can prove that it will steadily get closer to the correct decision. This is something current deep active learning algorithms cannot do. We test our method on both simple and real-world tasks, showing it performs well even compared to popular active learning techniques used in deep learning today. This opens the door to new ways of understanding and improving how deep learning models learn from limited data.
Link To Code: https://github.com/pilancilab/cpal
Primary Area: Optimization->Convex
Keywords: convex optimization, cutting plane method, active learning
Submission Number: 13191