Complex-valued Neurons Can Learn More but Slower than Real-valued Neurons via Gradient Descent

Jin-Hui Wu; Shao-Qun Zhang; Yuan Jiang; Zhi-Hua Zhou

Published: 21 Sept 2023, Last Modified: 01 Jan 2024

Venue: NeurIPS 2023 poster

Keywords: Complex-valued Neural Networks; Learning Neurons; Real-valued Neural Networks; Convergence Rate

Abstract: Complex-valued neural networks potentially possess better representations and performance than their real-valued counterparts on some complicated tasks, such as acoustic analysis and radar image classification. Despite these empirical successes, it remains theoretically unknown when and to what extent complex-valued neural networks outperform real-valued ones. We take one step in this direction by comparing the learnability of real-valued neurons and complex-valued neurons via gradient descent. We show that a complex-valued neuron can efficiently learn functions expressed by any single real-valued neuron and by any single complex-valued neuron, with convergence rates $O(t^{-3})$ and $O(t^{-1})$, respectively, where $t$ is the iteration index of gradient descent. In contrast, a two-layer real-valued neural network with finite width cannot learn a single non-degenerate complex-valued neuron. We also prove that a complex-valued neuron learns a real-valued neuron with rate $\Omega (t^{-3})$, which is exponentially slower than the $O(\mathrm{e}^{- c t})$ rate, for a constant $c$, at which a real-valued neuron learns a real-valued neuron. We further verify and extend these results via simulation experiments in more general settings.
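To make the gap between the three rates concrete, the sketch below computes how many gradient descent iterations each rate would need to fall below a target error $\epsilon$. It is only an arithmetic illustration, not the paper's experiment: the hidden constants in the $O(\cdot)$ and $\Omega(\cdot)$ bounds are assumed to equal 1, the value `c = 1.0` is a hypothetical choice, and the function name `iterations_needed` is introduced here purely for illustration.

```python
import math


def iterations_needed(eps: float, c: float = 1.0) -> dict:
    """Smallest t at which each rate drops to eps or below.

    Assumptions (not from the paper): every hidden constant in the
    asymptotic bounds is 1, and c is a hypothetical constant.
    """
    return {
        # t^{-1} <= eps  <=>  t >= 1 / eps
        "t^-1": math.ceil(eps ** -1.0),
        # t^{-3} <= eps  <=>  t >= eps^{-1/3}
        "t^-3": math.ceil(eps ** (-1.0 / 3.0)),
        # exp(-c t) <= eps  <=>  t >= ln(1 / eps) / c
        "exp(-c t)": math.ceil(math.log(1.0 / eps) / c),
    }


if __name__ == "__main__":
    for eps in (1e-3, 1e-6, 1e-12):
        print(eps, iterations_needed(eps))
```

Under these assumptions the polynomial rates need a number of iterations that grows as a power of $1/\epsilon$, while the exponential rate needs only about $\ln(1/\epsilon)/c$ iterations, which is the sense in which $\Omega(t^{-3})$ is exponentially slower than $O(\mathrm{e}^{-ct})$.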

Supplementary Material: pdf

Submission Number: 2826
