Decoding multiculturalism through linguistic landscapes: a deep learning-based OCR analysis of street view images

Hyebin Kim, Eunseon Seong, Harim Lee, Dong-Kyu Chae, Sugie Lee

Published: 2025, Last Modified: 04 Oct 2025. Urban Inform. 2025. CC BY-SA 4.0

Abstract: Understanding multiculturalism is essential when analyzing the spatial and cultural dynamics of globalized urban environments. This study examines Seoul’s linguistic landscapes using a novel framework that integrates large-scale street view image (SVI) datasets, an enhanced deep learning–based optical character recognition (OCR) model, and geospatial analytics. By leveraging the SVI dataset within an OCR detection and recognition framework, the research identifies language distribution patterns and their cultural significance at the street level. The findings indicate that most of the detected signs are in Korean, followed by English and Chinese. Furthermore, Korean dominates traditional markets, reflecting local lifestyles, whereas English signifies modernity in tourist and luxury areas. Chinese is predominantly clustered in immigrant neighborhoods, highlighting community dynamics. This study proposes a scalable, quantitative framework combining open-source technologies and language proportion–based analyses and demonstrates its contextual validity and applicability to multilingual urban environments. The approach advances linguistic landscape research, offering insights into cultural identity and social dynamics, and it has policy implications for promoting integration in multicultural societies.
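The language proportion-based analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes the OCR stage has already produced recognized text strings tagged with an area identifier, and it uses the dominant Unicode script as a rough proxy for language (Hangul for Korean, CJK ideographs for Chinese, ASCII letters for English). The function names `label_text` and `language_proportions` and the `(area_id, text)` input format are hypothetical.

```python
from collections import Counter, defaultdict


def label_text(text):
    """Label a recognized string by its dominant script (a proxy, not true language ID)."""
    counts = Counter()
    for ch in text:
        cp = ord(ch)
        # Hangul syllables, Hangul Jamo, and Hangul Compatibility Jamo
        if 0xAC00 <= cp <= 0xD7A3 or 0x1100 <= cp <= 0x11FF or 0x3130 <= cp <= 0x318F:
            counts["korean"] += 1
        # CJK Unified Ideographs (also used in Japanese and in Korean Hanja)
        elif 0x4E00 <= cp <= 0x9FFF:
            counts["chinese"] += 1
        # ASCII letters, assumed here to indicate English
        elif ch.isascii() and ch.isalpha():
            counts["english"] += 1
    if not counts:
        return "other"
    return counts.most_common(1)[0][0]


def language_proportions(detections):
    """Map each area to the share of its detected signs per language label.

    `detections` is an iterable of (area_id, text) pairs (assumed format).
    """
    per_area = defaultdict(Counter)
    for area_id, text in detections:
        per_area[area_id][label_text(text)] += 1
    return {
        area: {label: n / sum(counts.values()) for label, n in counts.items()}
        for area, counts in per_area.items()
    }
```

Script detection alone cannot separate Chinese from Japanese kanji or English from other Latin-script languages, so a real analysis would need a proper language identification step on top of the OCR output.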