Abstract: Podcasts are now a favored means of consuming audio content across many languages, yet the sheer volume of available podcasts makes it hard for users to discover content that matches their preferences. In this study, we propose unsupervised techniques for podcast recommendation in a multi-lingual context. Our approach integrates multi-view representations, covering both sentence-level and keyword-level perspectives, to capture the diverse facets of podcast content. Using autoencoders, we derive latent representations for the sentence and keyword views from the podcast dataset. These representations encode the semantic relationships needed for the subsequent clustering analysis. We then apply unsupervised learning algorithms to the learned representations to cluster similar podcasts, followed by an ensemble learning step. Our experimental evaluations on a multi-lingual podcast dataset (Hindi and English) show promising performance in terms of podcast recommendation accuracy and user satisfaction. By leveraging multiple views and unsupervised learning, we address the challenges posed by language diversity and content heterogeneity in podcast recommendation systems. The results reveal that the ensemble algorithm outperforms the other methods, achieving a silhouette score, diversity, and coverage of (a) 0.6640, 0.6150, and 0.7720 for the English dataset, and (b) 0.4860, 0.4620, and 0.8380 for the Hindi dataset. These results underscore the efficacy of multi-view ensemble clustering for personalized podcast recommendation in the multi-lingual domain.
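The abstract does not say how the base clusterings are combined or how the silhouette score is computed, so the following is only a minimal sketch under stated assumptions. It uses a co-association (evidence accumulation) consensus, a common ensemble clustering technique swapped in here for illustration, and the standard silhouette definition. The autoencoder step is skipped: the toy 2-D points stand in for learned latent vectors, and the three label lists are hypothetical outputs of base clusterers (for example one per view), not data from the paper. All function and variable names are illustrative.

```python
import math

def co_association(labelings):
    """Fraction of base clusterings that place items i and j in the same cluster."""
    n = len(labelings[0])
    co = [[0.0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    co[i][j] += 1.0 / len(labelings)
    return co

def consensus_labels(co, threshold=0.5):
    """Link items whose co-association exceeds the threshold (assumed value);
    each connected component becomes one consensus cluster."""
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

def silhouette_score(points, labels):
    """Mean silhouette over all items, using Euclidean distance.
    Items in singleton clusters, or in a one-cluster solution, contribute 0."""
    n = len(points)
    total = 0.0
    for i in range(n):
        dists = {}  # cluster label -> distances from item i to that cluster's members
        for j in range(n):
            if i != j:
                dists.setdefault(labels[j], []).append(math.dist(points[i], points[j]))
        own = dists.pop(labels[i], None)
        if not own or not dists:
            continue
        a = sum(own) / len(own)                             # mean intra-cluster distance
        b = min(sum(d) / len(d) for d in dists.values())    # nearest other cluster
        total += (b - a) / max(a, b)
    return total / n

# Toy stand-ins for latent vectors of six podcasts (hypothetical values).
points = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.2), (2.0, 2.1), (2.1, 2.0), (2.2, 2.2)]

# Hypothetical base clusterings; label ids need not agree across clusterers.
base_labelings = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [2, 2, 2, 0, 0, 1],
]

consensus = consensus_labels(co_association(base_labelings))
score = silhouette_score(points, consensus)
print(consensus, round(score, 3))
```

Because the consensus works on pairwise agreement rather than on label values, it tolerates base clusterers that number their clusters differently, which is why it is a convenient choice when the views are clustered independently.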
External IDs: dblp:conf/icpr/BangdeS24