Convex Optimization for Shallow Neural Networks

Tolga Ergen, Mert Pilanci

2019 (modified: 05 Nov 2022), Allerton 2019, Readers: Everyone

Abstract: We consider non-convex training of shallow neural networks and introduce a convex relaxation approach with theoretical guarantees. For the single-neuron case, we prove that the relaxation preserves the location of the global minimum under a planted model assumption. Therefore, a globally optimal solution can be found efficiently via a gradient method. We show that gradient descent applied to the relaxation always outperforms gradient descent on the original non-convex loss at no additional computational cost. We then characterize this relaxation as a regularizer and introduce extensions to multi-neuron, single-hidden-layer networks.
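
To make the single-neuron setting concrete, here is a minimal sketch of the non-convex baseline that the abstract compares against: gradient descent on the loss of one neuron, with labels generated by a planted weight vector. The abstract names neither the activation nor the loss, nor does it give the form of the relaxation, so the ReLU activation, the squared loss, the Gaussian data, and every function name below are assumptions made for illustration. The sketch does not implement the authors' convex relaxation.

```python
import numpy as np


def relu(z):
    # Assumed activation; the abstract does not name one.
    return np.maximum(z, 0.0)


def make_planted_data(n=200, d=5, seed=0):
    # Planted model (assumed form): labels come from a hidden neuron w_star.
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    w_star = rng.standard_normal(d)
    y = relu(X @ w_star)
    return X, y, w_star


def loss(w, X, y):
    # Non-convex squared loss of a single neuron (assumed loss).
    r = relu(X @ w) - y
    return 0.5 * np.mean(r ** 2)


def grad(w, X, y):
    # (Sub)gradient of the loss; the ReLU derivative is taken as 1 where z > 0.
    z = X @ w
    r = relu(z) - y
    return X.T @ (r * (z > 0)) / len(y)


def gradient_descent(X, y, w0, step=0.1, iters=500):
    # Plain gradient descent with a fixed step size (hypothetical settings).
    w = w0.copy()
    for _ in range(iters):
        w = w - step * grad(w, X, y)
    return w


X, y, w_star = make_planted_data()
w0 = np.random.default_rng(1).standard_normal(X.shape[1])
w_hat = gradient_descent(X, y, w0)

print("loss at initialization:", loss(w0, X, y))
print("loss after descent:    ", loss(w_hat, X, y))
print("loss at planted w_star:", loss(w_star, X, y))
```

By construction, the planted vector attains zero loss, so it is a global minimizer of this objective. According to the abstract, the relaxation preserves the location of that global minimum in the single-neuron case.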
