Keywords: deep learning, neural collapse, unconstrained features, error bound, linear convergence
Abstract: In this work, we study the recently discovered neural collapse (NC) phenomenon, which is prevalent in training over-parameterized deep neural networks for classification tasks. Existing work has shown that, under the unconstrained feature model, every optimal solution of the training problem for classification tasks is an NC solution and the problem has a benign landscape. However, these results do not answer the question of how quickly gradient descent can find an NC solution. To answer this question, we prove that the training problem satisfies an error bound property under the unconstrained feature model, that is, an inequality that bounds the distance from a point to the optimal solution set by the norm of the gradient at that point. Using this error bound, we establish linear convergence of gradient descent to an NC solution.
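The error bound property described in the abstract can be sketched in the following schematic form; the constant $\kappa$, the neighborhood $\mathcal{U}$, and the symbols $f$ and $\mathcal{X}^*$ are illustrative notation, not the paper's exact statement:

```latex
% Let f denote the training loss under the unconstrained feature model
% and X* its optimal solution set. An error bound property asserts the
% existence of kappa > 0 and a neighborhood U of X* such that
\operatorname{dist}(x, \mathcal{X}^*) \;\le\; \kappa \,\bigl\| \nabla f(x) \bigr\|
\qquad \text{for all } x \in \mathcal{U}.
```

Such an inequality is a standard route to linear convergence: combined with smoothness of $f$, it lets one show that the distance to $\mathcal{X}^*$ (or the optimality gap) contracts by a fixed factor at every gradient descent step.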