Robust Latent Factor Analysis for Precise Representation of High-Dimensional and Sparse Data

Published: 01 Jan 2021 · Last Modified: 30 Sept 2024 · IEEE CAA J. Autom. Sinica, 2021 · License: CC BY-SA 4.0
Abstract: High-dimensional and sparse (HiDS) matrices commonly arise in industrial applications, e.g., recommender systems (RSs), social networks, and wireless sensor networks. Since they contain rich information, representing them accurately is of great significance. A latent factor (LF) model is one of the most popular and successful ways to address this issue. Current LF models mostly adopt an L2-norm-oriented loss to represent an HiDS matrix, i.e., they sum the errors between observed data and predicted values under the L2-norm. Yet the L2-norm is sensitive to outlier data, and outliers are common in such matrices; for example, an HiDS matrix from an RS often contains many outlier ratings produced by heedless or malicious users. To address this issue, this work proposes a smooth-L1-norm-oriented latent factor (SL-LF) model. Its main idea is to adopt the smooth L1-norm rather than the L2-norm to form its loss, giving it both strong robustness and high accuracy in predicting the missing data of an HiDS matrix. Experimental results on eight HiDS matrices generated by industrial applications verify that the proposed SL-LF model is not only robust to outlier data but also achieves significantly higher prediction accuracy than state-of-the-art models when predicting the missing data of HiDS matrices.
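The abstract does not give the paper's exact loss definition, but a commonly used smooth L1 (Huber-style) form is quadratic near zero and linear for large residuals, which bounds the gradient contribution of outliers. The sketch below illustrates this idea in a single SGD update of a latent factor model; the function names, the threshold `delta`, and the update rule are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def smooth_l1(e, delta=1.0):
    # Huber-style smooth L1 loss (assumed form, not the paper's exact one):
    # quadratic for |e| < delta (smooth, accurate near zero),
    # linear for |e| >= delta (robust to outlier errors).
    a = np.abs(e)
    return np.where(a < delta, 0.5 * e**2 / delta, a - 0.5 * delta)

def smooth_l1_grad(e, delta=1.0):
    # Gradient w.r.t. the error e; bounded by +/-1, unlike the L2-norm
    # gradient, which grows linearly with the outlier magnitude.
    return np.clip(e / delta, -1.0, 1.0)

def sgd_step(P, Q, i, j, r, lr=0.01, lam=0.02, delta=1.0):
    # One SGD update on latent factor matrices P (users) and Q (items)
    # for a single observed entry (i, j) with rating r.
    e = P[i] @ Q[j] - r               # prediction error on this entry
    g = smooth_l1_grad(e, delta)      # robust, bounded loss gradient
    p_old = P[i].copy()               # keep old value for Q's update
    P[i] -= lr * (g * Q[j] + lam * P[i])   # L2 regularization term lam
    Q[j] -= lr * (g * p_old + lam * Q[j])
    return smooth_l1(e, delta)
```

Because `smooth_l1_grad` saturates at ±1, a single wildly wrong rating cannot dominate an update step, whereas the L2-norm gradient (simply `e`) would scale with the outlier's magnitude.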