Keywords: Complex-valued neural networks, Complex-valued representations, Complex-valued CNN, Classification, Complex numbers
TL;DR: We address the contradictory answers in the literature to the following question: do CV-CNNs perform better or worse than RV-CNNs for classification?
Abstract: This paper concerns complex-valued CNNs (CV-CNNs) for computer vision, which use complex-valued rather than real-valued representations. We divide input data into three categories: inherently real-valued, inherently complex-valued, and complex-valued data obtained by transforming real-valued data. We study whether the complex-valued representations of CV-CNNs offer any advantages over the commonly used real-valued CNNs (RV-CNNs). For concreteness, we focus on the classification task. The existing literature offers contradictory answers to this question. We find that this is mainly because (a) existing studies seldom employ a common performance measure (e.g., CV-CNN compared against RV-CNN with a similar network structure vs. a similar number of parameters); (b) the diversity of the evaluation datasets is limited (e.g., datasets in which magnitude information is more, less, or as important as phase information); and (c) little effort has been devoted to reducing the randomness in training between CV-CNNs and RV-CNNs. To address this, we propose performance measures based on similar network structure, number of parameters, and number of MAC operations. We also consider diverse datasets with varying magnitude/phase information, and control for randomness in training. As a result, we expect any observed performance differences to be independent of the above disparities and to arise from the use of real vs. complex representations. Theoretically, we show that, unlike RV-CNNs, CV-CNNs can preserve magnitude and phase through intermediate stages of processing. Our main experimental findings are as follows. (1) As network depth decreases, the performance of CV-CNNs improves with respect to similar network structure; the performances of CV-CNNs and RV-CNNs with a similar number of parameters become more comparable; and the performance of RV-CNNs improves with respect to a similar number of MAC operations. (2) These performance differences diminish as network depth increases.
(3) With respect to data diversity, performance depends on whether the dataset has dominant magnitude or dominant phase, i.e., whether the reconstruction error is lower using only the magnitude or only the phase. If complex-valued data has dominant magnitude, providing the magnitude as input instead of the real and imaginary parts yields a significant performance gain, whereas if the data has dominant phase, providing both real and imaginary parts is important. This holds across different network depths.
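The preservation claim in the abstract rests on a basic algebraic property of complex multiplication: magnitudes multiply and phases add, so both quantities remain recoverable after a complex-weighted operation. A minimal illustration (the weight and activation values below are hypothetical, not taken from the paper):

```python
import numpy as np

# Property underlying the preservation claim (illustrative only):
# complex multiplication scales magnitudes and shifts phases separably,
# so both stay recoverable, unlike a real-valued network that freely
# mixes stacked [Re, Im] channels.
w = 0.5 * np.exp(1j * 0.3)   # hypothetical complex weight
z = 2.0 * np.exp(1j * 1.1)   # hypothetical complex activation
out = w * z
assert np.isclose(np.abs(out), np.abs(w) * np.abs(z))     # magnitudes multiply
assert np.isclose(np.angle(out), np.angle(w) + np.angle(z))  # phases add
```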
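The magnitude/phase dominance criterion can be made concrete. Below is a minimal sketch of one way to operationalize it, assuming the complex-valued data is the Fourier transform of a real image; the function name and exact recipe are illustrative, not necessarily the paper's procedure:

```python
import numpy as np

def spectral_dominance(x):
    """Label real-valued data x as magnitude- or phase-dominant by comparing
    reconstruction errors from magnitude-only vs. phase-only spectra
    (hypothetical operationalization of the criterion in the abstract)."""
    X = np.fft.fft2(x)
    # Magnitude-only reconstruction: keep |X|, discard the phase.
    x_mag = np.real(np.fft.ifft2(np.abs(X)))
    # Phase-only reconstruction: keep the phase, set the magnitude to one.
    x_phase = np.real(np.fft.ifft2(np.exp(1j * np.angle(X))))
    err_mag = np.mean((x - x_mag) ** 2)
    err_phase = np.mean((x - x_phase) ** 2)
    return ("magnitude" if err_mag < err_phase else "phase", err_mag, err_phase)
```

Under this criterion, finding (3) suggests feeding only the magnitude to the network for magnitude-dominant data, and both real and imaginary parts for phase-dominant data.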
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning