Team01-Comparative Analysis of Language Model Choices in a Video Search Pipeline with a Focus on Indic Language Content

Indian Institute of Science Summer 2025 DA225o Submission5 Authors

06 Jun 2025 (modified: 24 Jun 2025), Indian Institute of Science Summer 2025 DA225o Submission, CC BY 4.0
Keywords: Video Retrieval, Retrieval-Augmented Generation (RAG), Large Language Models (LLMs), Indic Models, Indic Languages, Multilingual Search, Speech-to-Text, Embedding Models
TL;DR: This project compares model choices for a Video RAG pipeline that supports multilingual user queries over Indic language video content in Tamil, Malayalam, and Hindi.
Abstract:

This project presents a comparative study of model performance in Retrieval-Augmented Generation (RAG) for Indic language video content. We evaluate models from providers such as Sarvam, Google, and OpenAI, along with relevant open models, across the components of multiple RAG strategies. We implement a RAG pipeline that processes Indic language videos, transcribes their speech to text, applies several embedding approaches, and supports multilingual querying via text or audio input. We also aim to implement advanced audio/video summarization and search techniques wherever possible. Through systematic evaluation of retrieval accuracy, response quality, and language preservation, this study aims to determine the optimal model and pipeline configuration for Indic video RAG applications, and to provide insight into the trade-offs between specialized Indic language models and established global alternatives in multilingual video search. Given the limited duration of this study, we restrict the languages to Tamil, Malayalam, and Hindi.
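
As a rough illustration of the retrieval step described above, the following minimal sketch ranks timestamped transcript segments against a text query. It is not the project's implementation: the `Segment` type, the `embed` and `search` functions, and the sample transcripts are all hypothetical, and the character n-gram counter is only a toy stand-in for whichever embedding model is being compared. In a real pipeline the segment text would come from a speech-to-text model, and an audio query would be transcribed before this step.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Segment:
    """A transcribed span of a video (hypothetical structure)."""
    video_id: str
    start_s: float
    end_s: float
    text: str


def embed(text: str, n: int = 3) -> Counter:
    """Toy stand-in for an embedding model: character n-gram counts.

    Works on Unicode code points, so it handles Hindi, Tamil, or
    Malayalam text without any language-specific tokenizer.
    """
    padded = f" {text.lower().strip()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, segments: list[Segment], top_k: int = 3):
    """Return the top_k (score, segment) pairs, best match first."""
    q = embed(query)
    scored = [(cosine(q, embed(s.text)), s) for s in segments]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical transcript segments, as a speech-to-text stage might emit them.
SEGMENTS = [
    Segment("v1", 12.0, 18.5, "आज हम चावल की खीर बनाना सीखेंगे"),
    Segment("v2", 40.0, 46.0, "क्रिकेट मैच का स्कोर"),
    Segment("v3", 5.0, 9.0, "இன்று மழை பெய்யும்"),
]

if __name__ == "__main__":
    for score, seg in search("खीर बनाना", SEGMENTS, top_k=1):
        print(seg.video_id, seg.start_s, seg.end_s, round(score, 3))
```

Swapping `embed` for a call to a candidate embedding model while keeping `search` fixed is one way such a comparison could hold the rest of the pipeline constant. The returned timestamps let a result point to a position inside the video rather than to the whole file.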

Submission Number: 5