The First Step Towards Voice-Interactive Surgical LLMs

12 Apr 2025 (modified: 12 Apr 2025) | MIDL 2025 Short Papers Submission | Readers: Everyone | License: CC BY 4.0
Keywords: Surgical AI, Large Language Models, LLMs, Voice-interactive
TL;DR: Voice-interactive Surgical LLMs
Abstract: Large language models (LLMs) have rapidly advanced healthcare applications such as disease diagnosis; however, their integration into surgical practice remains largely unexplored. One key barrier is a physical constraint inherent in the surgical environment: surgeons’ hands are typically occupied during procedures, which makes traditional input modalities such as keyboards or touch interfaces impractical. In this study, we investigate methods for enabling LLMs to process spoken input and generate verbal responses, thereby facilitating hands-free interaction. We also develop a web-based, code-free prototype of a voice-interactive surgical LLM, accessible to any user with an internet connection. This work is a first step toward the broader goal of developing operating room–ready surgical AI systems.
Submission Number: 117