Abstract: Online news articles encompass a variety of modalities, such as text and images. How can we learn a representation that incorporates information from all of these modalities in a compact and interpretable manner? In this paper, we propose CITEM (Compact Interpretable Tensor graph multi-modal news EMbedding), a tensor-based framework for compact and interpretable multi-modal news representations. CITEM builds a tensor graph consisting of one news similarity graph per modality and applies a tensor decomposition to produce compact and interpretable embeddings, each dimension of which is a heterogeneous co-cluster of news articles and their corresponding modalities. We extensively validate CITEM against baselines on two news classification tasks: misinformation detection and news categorization. The experimental results show that CITEM achieves AUC in the same range as state-of-the-art baselines while producing embeddings that are 7x to 10.5x more compact. In addition, each embedding dimension of CITEM is interpretable, representing a latent co-cluster of articles.
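The pipeline described in the abstract (one similarity graph per modality, stacked into a tensor, then decomposed) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the cosine k-nearest-neighbour graph construction, the choice of a plain CP decomposition fitted by alternating least squares, the random toy features, and the names `similarity_graph` and `cp_als`.

```python
import numpy as np

def similarity_graph(features, k=3):
    """Symmetric k-nearest-neighbour graph from cosine similarity (assumed construction)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)
    sim = np.clip(unit @ unit.T, 0.0, None)   # keep non-negative similarities only
    np.fill_diagonal(sim, 0.0)                # no self-loops
    nbrs = np.argsort(-sim, axis=1)[:, :k]    # k most similar articles per row
    rows = np.arange(sim.shape[0])[:, None]
    graph = np.zeros(sim.shape)
    graph[rows, nbrs] = sim[rows, nbrs]
    return np.maximum(graph, graph.T)         # symmetrise

def cp_als(tensor, rank, n_iter=50, seed=0):
    """Rank-`rank` CP decomposition of a 3-way tensor via alternating least squares."""
    rng = np.random.default_rng(seed)
    A, B, C = (rng.random((dim, rank)) for dim in tensor.shape)
    for _ in range(n_iter):
        # Each update solves a least-squares problem for one factor, others fixed.
        A = np.einsum("ijk,jr,kr->ir", tensor, B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = np.einsum("ijk,ir,kr->jr", tensor, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum("ijk,ir,jr->kr", tensor, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

# Toy data: random stand-ins for per-modality article features (hypothetical).
rng = np.random.default_rng(0)
n_articles = 12
text_features = rng.random((n_articles, 8))
image_features = rng.random((n_articles, 5))

# Tensor graph: articles x articles x modalities.
tensor_graph = np.stack(
    [similarity_graph(text_features), similarity_graph(image_features)], axis=2
)

A, B, C = cp_als(tensor_graph, rank=3)
embeddings = A  # one row per article, one column per latent component

reconstruction = np.einsum("ir,jr,kr->ijk", A, B, C)
rel_error = np.linalg.norm(tensor_graph - reconstruction) / np.linalg.norm(tensor_graph)
```

In this sketch, each column of `A` scores how strongly every article belongs to one latent component, and the matching column of `C` scores how much each modality contributes to it, which is one way to read an embedding dimension as a co-cluster of articles and modalities.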