Fast dimension reduction for document classification based on imprecise spectrum analysisOpen Website

Published: 2010, Last Modified: 06 Nov 2023CIKM 2010Readers: Everyone
Abstract: This paper proposes an algorithm called Imprecise Spectrum Analysis (ISA) to carry out fast dimension reduction for document classification. ISA is designed based on the one-sided Jacobi method for Singular Value Decomposition (SVD). To speedup dimension reduction, it simplifies the orthogonalization process of Jacobi computation and introduces a new mapping formula for transforming original document-term vectors. To improve classification accuracy using ISA, a feature selection method is further developed to make inter-class feature vectors more orthogonal in building the initial weighted term-document matrix. Our experimental results show that ISA is extremely fast in handling large term-document matrices and delivers better or competitive classification accuracy compared to SVD-based LSI.
0 Replies

Loading