UoI-NMF Cluster: A Robust Nonnegative Matrix Factorization Algorithm for Improved Parts-Based Decomposition and Reconstruction of Noisy DataDownload PDFOpen Website

2017 (modified: 07 Nov 2022)ICMLA 2017Readers: Everyone
Abstract: With the ever growing collection of large volumes of scientific data, development of interpretable machine learning tools to analyze such data is becoming more important. However, robust, interpretable machine learning tools are lacking, threatening extraction of scientific insight and discovery. Nonnegative Matrix Factorization (NMF) algorithms decompose an m × n nonnegative data matrix A into a k × n basis matrix H and an m × k weight matrix W, such that A ≈ WH, where k is the desired rank. In this paper, we present a novel two stage algorithm, UoI-NMF <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">cluster</sub> for NMF, which is based on three innovations: (i) completely separate bases learning from weight estimation, (ii) learn bases by clustering NMF results across bootstrap resamples of the data, and (iii) use the recently introduced Union of Intersections (UoI) framework to estimate ultra-sparse weights that maximize data reconstruction accuracy. We deploy our algorithm on various synthetic and scientific data to illustrate its performance, with a focus on neuroscience data. Compared to other NMF algorithms, UoI-NMF <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">cluster</sub> yields: a) more accurate parts-based decompositions of noisy data, b) a sparse and accurate weight matrix, and c) high accuracy reconstructions of the de-noised data. Together, these improvements enhance the performance and interpretability of NMF application to noisy data, and suggest similar approaches may benefit other matrix decomposition algorithms.
0 Replies

Loading