Missing Person Recognition Algorithms Based on Image Captioning and Visual Grounding

Ayeong Jeong, Yeongju Woo, Han-Young Kim, Gayun Suh, Chae-Yeon Heo, Yeong-Jun Cho, Hieyong Jeong

Published: 2024, Last Modified: 31 Jan 2025. Venue: ICPR (33) 2024. License: CC BY-SA 4.0

Abstract: Missing person searches are a critical societal challenge with significant implications for public safety and welfare. This study proposes two novel algorithms for efficient and rapid missing person detection in video data. The first algorithm, CaptionMP, uses image captioning to generate descriptions of individuals’ appearances in video footage and compares these descriptions with missing person information. The second algorithm, DINOMP, uses visual grounding to detect the characteristics of missing persons directly in video streams from text prompts. Both algorithms were fine-tuned on the MALS dataset and demonstrated performance across diverse environmental conditions. Notably, they exhibited robust detection capabilities in low-light environments and with complex clothing patterns. The results show that the proposed methods have considerable potential for missing person detection and offer a way around the limitations of traditional pedestrian attribute recognition (PAR) methods. This research is expected to contribute substantially to the practical applicability of intelligent CCTV systems in missing person searches.
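
The comparison step described for CaptionMP (matching generated captions against missing person information) can be illustrated with a minimal sketch. The abstract does not state how the comparison is performed, so the token-overlap (Jaccard) score below is an assumption chosen for simplicity, not the authors' method. The names `tokenize`, `jaccard`, `rank_captions`, and the track identifiers are hypothetical, and the captions are hand-written stand-ins for the output of a captioning model.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity in [0, 1]; defined as 0.0 when both texts are empty."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def rank_captions(description: str, captions: dict[str, str]) -> list[tuple[str, float]]:
    """Score each caption against the description and sort from best to worst match.

    `captions` maps a hypothetical person-track identifier to the caption
    generated for that person. Assumption: plain token overlap stands in for
    whatever text-matching step the paper actually uses.
    """
    scored = [(track_id, jaccard(description, caption)) for track_id, caption in captions.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hand-written example inputs (not taken from the paper or the MALS dataset).
description = "a man wearing a red jacket and black pants"
captions = {
    "track_1": "a woman in a white dress carrying a bag",
    "track_2": "a man wearing a red jacket and blue jeans",
}

for track_id, score in rank_captions(description, captions):
    print(f"{track_id}: {score:.2f}")
```

A learned text-embedding similarity would likely handle synonyms and paraphrases better than raw token overlap; the sketch only shows where such a scoring function would sit between the captioning output and the missing person description.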