Exploring the Combined Power of Covariance and Hessian Matrices Eigenanalysis for Binary Classification

Agus Hartoyo; Jan Kazimierz Argasiński; Aleksandra Trenk; Kinga Przybylska; Anna Blasiak; Alessandro Crimi

23 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.

Primary Area: general machine learning (i.e., none of the above)

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Keywords: covariance matrix, Hessian matrix, eigenanalysis, binary classification, class separability

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.

TL;DR: This paper presents a novel approach that combines eigenanalysis of the Hessian matrix and covariance matrix to achieve optimal class separability in binary classification tasks.

Abstract: Covariance and Hessian matrices have been analyzed separately in the literature for classification problems, but integrating them has the potential to improve classification performance. We present a novel approach that combines the eigenanalysis of a covariance matrix evaluated on a training set with that of a Hessian matrix evaluated on a deep learning model to achieve optimal class separability in binary classification tasks. Our approach is substantiated by formal proofs that establish its capability to maximize the between-class mean distance and minimize the within-class variances. By projecting data into the combined space of the most relevant eigendirections from both matrices, we achieve optimal class separability as per the linear discriminant analysis (LDA) criteria. Empirical validation across neural and health datasets consistently supports our theoretical framework and demonstrates that our method outperforms traditional methods. Our method stands out by addressing both LDA criteria, unlike principal component analysis (PCA) and the Hessian method, which predominantly emphasize one criterion each. This comprehensive approach captures intricate patterns and relationships, enhancing classification performance. Furthermore, by using both LDA criteria, our method outperforms LDA itself by leveraging higher-dimensional feature spaces, in accordance with Cover's theorem, which favors linear separability in higher dimensions. Our approach also sheds light on the decision-making of complex deep neural networks (DNNs), rendering it comprehensible within a 2D space.
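The projection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the data are synthetic Gaussians, a logistic regression model stands in for the deep learning model, and the Hessian is taken as that of the mean binary cross-entropy loss with respect to the model weights (so that it has the same dimension as the feature space). All function names (`make_two_class_data`, `fit_logistic`, `leading_eigvec`, `combined_projection`) are hypothetical.

```python
import numpy as np


def make_two_class_data(n_per_class=200, dim=5, seed=0):
    """Synthetic data (an assumption): two unit-variance Gaussians
    whose means differ only along the first feature."""
    rng = np.random.default_rng(seed)
    shift = np.zeros(dim)
    shift[0] = 3.0
    x0 = rng.normal(size=(n_per_class, dim))
    x1 = rng.normal(size=(n_per_class, dim)) + shift
    X = np.vstack([x0, x1])
    y = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])
    return X, y


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, lr=0.1, steps=500):
    """Plain gradient descent on the mean binary cross-entropy.
    Logistic regression is a stand-in for the DNN (an assumption)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b


def leading_eigvec(sym_matrix):
    """Unit eigenvector of the largest eigenvalue of a symmetric matrix.
    np.linalg.eigh returns eigenvalues in ascending order."""
    _, vecs = np.linalg.eigh(sym_matrix)
    return vecs[:, -1]


def combined_projection(X, y):
    """Project centered data onto the leading covariance eigenvector
    and the leading Hessian eigenvector, giving a 2D representation."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)

    # Hessian of the mean binary cross-entropy w.r.t. the weights of the
    # logistic stand-in model: X^T diag(p * (1 - p)) X / n.
    w, b = fit_logistic(Xc, y)
    p = sigmoid(Xc @ w + b)
    hess = (Xc * (p * (1.0 - p))[:, None]).T @ Xc / len(y)

    basis = np.stack([leading_eigvec(cov), leading_eigvec(hess)], axis=1)
    return Xc @ basis, basis


X, y = make_two_class_data()
Z, basis = combined_projection(X, y)
print(Z.shape, basis.shape)
```

The two columns of `basis` are not orthogonalized here, so the 2D coordinates in `Z` may be correlated; whether and how the paper handles this is not stated in the abstract.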

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 7389
