A generic audio classification and segmentation approach for multimedia indexing and retrieval

Serkan Kiranyaz, Ahmad Farooq Qureshi, Moncef Gabbouj

Published: 2006, Last Modified: 06 Mar 2025. IEEE Trans. Speech Audio Process. 2006. CC BY-SA 4.0

Abstract: We address generic, automatic audio classification and segmentation for audio-based multimedia indexing and retrieval applications. In particular, we present a fuzzy approach to a hierarchical audio classification and global segmentation framework based on automatic audio analysis, providing robust, bi-modal, efficient and parameter-invariant classification over global audio segments. The input audio is split into segments, which are classified as speech, music, fuzzy or silent. The proposed method minimizes critical misclassification errors by fuzzy region modeling, thus increasing the efficiency of both pure and fuzzy classification. Experimental results show that critical errors are minimized and that the proposed framework significantly increases the efficiency and accuracy of audio-based retrieval, especially in large multimedia databases.
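
As a rough illustration of the four-way labeling described in the abstract, the sketch below assigns each segment one of the labels speech, music, fuzzy or silent. It is not the authors' method: the per-segment features (`energy`, `speech_score`), the thresholds, and the function name `classify_segment` are all assumptions invented for this example. The only idea taken from the abstract is that an ambiguous segment can be labeled fuzzy instead of being forced into speech or music.

```python
# Hypothetical thresholds, chosen only for illustration (not from the paper).
SILENCE_ENERGY = 0.01              # segments below this energy count as silent
FUZZY_LOW, FUZZY_HIGH = 0.35, 0.65  # scores strictly between these are "fuzzy"


def classify_segment(energy: float, speech_score: float) -> str:
    """Label one segment as 'silent', 'speech', 'music' or 'fuzzy'.

    `energy` is an assumed normalized energy of the segment and
    `speech_score` an assumed speech-likeness score in [0, 1].
    """
    if energy < SILENCE_ENERGY:
        return "silent"
    if speech_score >= FUZZY_HIGH:
        return "speech"
    if speech_score <= FUZZY_LOW:
        return "music"
    # Ambiguous evidence: defer the decision rather than risk a wrong pure label.
    return "fuzzy"


# Toy segments as (energy, speech_score) pairs.
segments = [(0.001, 0.90), (0.50, 0.92), (0.40, 0.10), (0.30, 0.50)]
labels = [classify_segment(e, s) for e, s in segments]
print(labels)
```

Widening the gap between `FUZZY_LOW` and `FUZZY_HIGH` in this toy version sends more segments to the fuzzy label and fewer to a pure speech or music label, which is one simple way to trade coverage for fewer pure-class mistakes.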