Abstract: We present an on-device pipeline that summarizes lecture videos and answers questions about them directly on a smartphone. The pipeline uses widely available tools, OCR and Vosk speech-to-text, to extract text from the video, and large language models (LLMs) to identify key sentences and generate summaries. We fine-tune and quantize BERT and GPT-2 so that lecture video summarization and question answering run efficiently on consumer-grade smartphones such as the Pixel 8 Pro. Because the approach requires no cloud APIs, it improves user privacy and keeps mobile data usage minimal.

https://www.youtube.com/shorts/zwGdONlKays
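To make the quantization step concrete, the following is a minimal sketch of symmetric per-tensor int8 quantization, which stores each weight as an 8-bit integer plus one shared floating-point scale. The abstract does not state which quantization scheme was applied to BERT and GPT-2, so the scheme, the function names `quantize_int8` and `dequantize_int8`, and the sample weights are illustrative assumptions rather than the authors' method.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor int8 quantization (assumed scheme): w ~= q * scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric int8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from int8 values and the shared scale."""
    return [v * scale for v in q]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
```

Each restored weight differs from the original by at most half a quantization step (`scale / 2`), while storage drops from 32 bits to 8 bits per weight. That fourfold reduction is the general reason quantization helps models of this kind fit within smartphone memory.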