Systematic Evaluation of Language Characteristic and Data Enrichment in Text-based Person Search

Published: 2024, Last Modified: 15 Nov 2024, MAPR 2024, CC BY-SA 4.0
Abstract: Text-based Person Search (TBPS) has emerged as a significant research topic in the information retrieval domain, attracting considerable attention thanks to its wide range of potential applications. While existing research on TBPS has achieved notable milestones for English, many proposed models exhibit low effectiveness when extended to low-resource languages such as Vietnamese, owing to differences in language characteristics. This paper presents, for the first time, a full assessment of the role of language characteristics and data enrichment in the effectiveness of TBPS. To this end, two state-of-the-art models for TBPS in English, ViTAA and IRRA, are chosen and adapted to TBPS in Vietnamese. Two benchmark datasets, CUHK-PEDES for English and 300VnPersonsearch for Vietnamese, are employed in our experiments. Experimental results show that evaluating systems within the same domain yields significantly higher results than evaluating across different domains, with R@1 increasing by 30.66% for the ViTAA model and 22.03% for the IRRA model. The obtained results provide valuable insights into the influence of language characteristics and data enrichment on the effectiveness of TBPS models.
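For readers unfamiliar with the R@1 figures quoted above, the following is a minimal sketch of how Recall@K is commonly computed in text-to-image person retrieval. It assumes the usual convention that a text query counts as a hit when at least one of its top-K ranked gallery images shares the query's person identity; the abstract does not spell out the exact protocol, so this is an illustration rather than the authors' evaluation code. The function name `recall_at_k` and the toy similarity scores are hypothetical.

```python
from typing import Sequence


def recall_at_k(
    similarity: Sequence[Sequence[float]],
    query_ids: Sequence[int],
    gallery_ids: Sequence[int],
    k: int = 1,
) -> float:
    """Fraction of text queries whose top-k gallery images contain the right person.

    similarity[i][j] is the (assumed) text-image similarity between
    query i and gallery image j; higher means more similar.
    """
    hits = 0
    for scores, qid in zip(similarity, query_ids):
        # Rank gallery indices by descending similarity and keep the top k.
        top_k = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
        # A hit requires at least one top-k image with the query's identity.
        if any(gallery_ids[j] == qid for j in top_k):
            hits += 1
    return hits / len(query_ids)


# Toy data (hypothetical): 3 text queries, 3 gallery images, identities 0..2.
toy_similarity = [
    [0.9, 0.1, 0.3],  # query for person 0: best match is image 0 (correct)
    [0.2, 0.4, 0.8],  # query for person 1: best match is image 2 (wrong)
    [0.5, 0.6, 0.1],  # query for person 2: correct image is ranked last
]
toy_query_ids = [0, 1, 2]
toy_gallery_ids = [0, 1, 2]

r1 = recall_at_k(toy_similarity, toy_query_ids, toy_gallery_ids, k=1)
r2 = recall_at_k(toy_similarity, toy_query_ids, toy_gallery_ids, k=2)
print(f"R@1 = {r1:.4f}, R@2 = {r2:.4f}")
```

Under this convention, a same-domain versus cross-domain comparison amounts to computing the metric on the same test queries with a model trained on in-domain versus out-of-domain data and comparing the two values.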