Towards unified multi-view ensemble models for multi-label podcast genre prediction

Yashwant Pravinrao Bangde, Naveen Saini, Vikas Kumar Tiwari

Published: 2026, Last Modified: 18 Mar 2026. Venue: J. Supercomput. 2026. License: CC BY-SA 4.0

Abstract: With the rapid growth of podcast content and the diversity of topics covered, accurately classifying podcasts into multiple genres is crucial for improving content discovery and recommendation systems. This paper presents a novel multi-view ensemble framework for multi-label genre classification of English-language podcasts. Our approach integrates sentence-level and keyword-level representations to effectively capture both narrative structures and thematic content. Using transformer-based models such as MPNet and DistilRoBERTa, we generate robust embeddings that feed into multiple classification algorithms, including Support Vector Machines, Multinomial Naive Bayes, and Logistic Regression. The proposed framework requires high-performance computing (HPC) due to the large-scale dataset, multiple transformer-based embeddings, and an ensemble of diverse classifiers. We explore both single-view and multi-view settings, developing ensemble strategies to enhance classification performance. Experiments are conducted on a curated dataset of 10,000 English-language podcast descriptions spanning 67 genres, sourced from iTunes. This dataset’s breadth ensures our framework addresses the complex, multi-genre nature of podcast content. Performance is evaluated using macro and weighted F-measure scores, among other metrics. Our multi-view ensemble model demonstrates improvements over the best-performing single-view (sentence-based) ensemble, increasing the macro F-measure from 0.4529 to 0.5337 and the weighted F-measure from 0.5549 to 0.5595. These results highlight the effectiveness of combining multiple views and classifiers for robust genre prediction. We further conduct qualitative analysis to gain deeper insights into our model’s predictions and genre correlations. 
Overall, our framework not only enhances podcast classification performance but also contributes to building more effective content discovery tools, paving the way for personalized and engaging listening experiences in the dynamic podcasting ecosystem. The code for this work is publicly available in our GitHub repository (https://github.com/yashwantbangde/Towards-Unified-Multi-View-Ensemble-Models-for-Multi-Label-Podcast-Genre-Prediction).
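
The abstract does not spell out how the two views are combined or how the two F-measures differ, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that each view (sentence-level and keyword-level) yields per-genre probabilities, that the views are fused by weighted averaging (soft voting), and that a fixed 0.5 threshold turns probabilities into labels. The function names, the threshold, and the toy data (4 descriptions, 3 genres) are all hypothetical and do not reproduce any result from the paper.

```python
from typing import List, Optional, Sequence, Tuple

Matrix = List[List[float]]


def fuse_views(view_probs: Sequence[Matrix],
               weights: Optional[Sequence[float]] = None) -> Matrix:
    """Weighted average of per-label probabilities across views (soft voting).

    Assumption: soft voting stands in for whatever ensemble rule the paper uses.
    """
    n_views = len(view_probs)
    if weights is None:
        weights = [1.0 / n_views] * n_views
    n_samples = len(view_probs[0])
    n_labels = len(view_probs[0][0])
    fused = [[0.0] * n_labels for _ in range(n_samples)]
    for w, probs in zip(weights, view_probs):
        for i in range(n_samples):
            for j in range(n_labels):
                fused[i][j] += w * probs[i][j]
    return fused


def binarize(probs: Matrix, threshold: float = 0.5) -> List[List[int]]:
    """Assign a label whenever its probability reaches the (assumed) threshold."""
    return [[1 if p >= threshold else 0 for p in row] for row in probs]


def per_label_f1(y_true: List[List[int]],
                 y_pred: List[List[int]]) -> Tuple[List[float], List[int]]:
    """Return the F1 score and the support (number of true positives) per label."""
    scores, supports = [], []
    for j in range(len(y_true[0])):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
        supports.append(tp + fn)
    return scores, supports


def macro_f1(scores: Sequence[float]) -> float:
    """Unweighted mean over labels: rare genres count as much as common ones."""
    return sum(scores) / len(scores)


def weighted_f1(scores: Sequence[float], supports: Sequence[int]) -> float:
    """Support-weighted mean over labels: frequent genres dominate the score."""
    total = sum(supports)
    return sum(s * n for s, n in zip(scores, supports)) / total if total else 0.0


# Toy data, invented for illustration: rows are descriptions, columns are genres.
y_true = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [1, 0, 1]]
sentence_view = [[0.9, 0.2, 0.1], [0.8, 0.4, 0.2], [0.3, 0.7, 0.1], [0.7, 0.1, 0.3]]
keyword_view = [[0.7, 0.1, 0.2], [0.6, 0.8, 0.1], [0.1, 0.9, 0.2], [0.9, 0.3, 0.9]]

single_pred = binarize(sentence_view)
fused_pred = binarize(fuse_views([sentence_view, keyword_view]))

single_scores, single_supports = per_label_f1(y_true, single_pred)
fused_scores, fused_supports = per_label_f1(y_true, fused_pred)

single_macro = macro_f1(single_scores)
single_weighted = weighted_f1(single_scores, single_supports)
fused_macro = macro_f1(fused_scores)
fused_weighted = weighted_f1(fused_scores, fused_supports)

print(f"sentence view only: macro={single_macro:.4f} weighted={single_weighted:.4f}")
print(f"fused views:        macro={fused_macro:.4f} weighted={fused_weighted:.4f}")
```

In this toy setup the sentence view alone misses the rarest genre entirely, which lowers the macro score more than the weighted one because macro averaging gives every genre equal weight. That is one plausible reading of why a multi-view gain could appear larger in macro F-measure than in weighted F-measure, though the abstract itself does not state the cause.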

External IDs: dblp:journals/tjs/BangdeST26