Abstract: Pan-tilt-zoom (PTZ) cameras give drones rapid focusing and perspective adjustment, which simplifies drone operation. However, manually controlling both the drone and the camera increases complexity and error rates. We therefore develop a voice-controlled PTZ camera system for drones that integrates YOLOv8 to assist with target detection, improving convenience and efficiency. To execute voice commands precisely, the system chains several AI models: Whisper converts speech to text with high accuracy, and GPT-3.5 Turbo with LangChain extracts the key commands for camera control. A Raspberry Pi then uses the extracted keywords to adjust the camera's pan, tilt, and zoom. Together, these components deliver a seamless and efficient user experience.
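The final stage of the pipeline, turning extracted keywords into pan, tilt, and zoom adjustments, might look roughly like the sketch below. It is only an illustration: the keyword vocabulary, step sizes, and angle and zoom limits are all hypothetical, since the abstract does not specify them, and the speech-to-text, language-model, and hardware calls are left out so that only the mapping logic is shown.

```python
from dataclasses import dataclass

# Hypothetical limits; the abstract does not state the camera's actual ranges.
LIMITS = {
    "pan": (-90.0, 90.0),    # degrees
    "tilt": (-45.0, 45.0),   # degrees
    "zoom": (1.0, 10.0),     # magnification factor
}

# Hypothetical keyword vocabulary: keyword -> (axis, signed step).
KEYWORD_ACTIONS = {
    "left": ("pan", -10.0),
    "right": ("pan", 10.0),
    "up": ("tilt", 5.0),
    "down": ("tilt", -5.0),
    "zoom in": ("zoom", 1.0),
    "zoom out": ("zoom", -1.0),
}


@dataclass
class PTZState:
    pan: float = 0.0
    tilt: float = 0.0
    zoom: float = 1.0


def clamp(value, low, high):
    """Keep value inside the closed interval [low, high]."""
    return max(low, min(high, value))


def apply_keywords(state, keywords):
    """Return a new PTZState after applying each recognized keyword in order.

    Unrecognized keywords are skipped, so a stray word in the extracted
    command list does not move the camera.
    """
    values = {"pan": state.pan, "tilt": state.tilt, "zoom": state.zoom}
    for keyword in keywords:
        action = KEYWORD_ACTIONS.get(keyword.strip().lower())
        if action is None:
            continue
        axis, step = action
        low, high = LIMITS[axis]
        values[axis] = clamp(values[axis] + step, low, high)
    return PTZState(**values)


if __name__ == "__main__":
    # Keywords as they might arrive from the command-extraction stage.
    print(apply_keywords(PTZState(), ["left", "Zoom In"]))
```

Clamping each axis to a fixed range keeps a repeated or misrecognized command from driving the camera past its mechanical limits, which is one plausible way to reduce the error rates the abstract mentions.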
External IDs: dblp:conf/icccn/ChenZRZ25