Semantic Matching Complements Saliency in Visual Search

NeurIPS 2023 Workshop Gaze Meets ML Submission 23 Authors

09 Oct 2023 (modified: 27 Oct 2023) · Submitted to Gaze Meets ML 2023
Keywords: Visual search, Saliency model, Semantic matching
Abstract: Searching for a target in natural scenes can be guided by the semantic associations between the target and the scene, as well as by target-orthogonal properties of the scene. To estimate the semantic contributions to search, we analyze a database of human eye movements during visual search in naturalistic images and construct an image-computable framework, based on the CLIP model, that leverages only the match between the linguistic representation of the target and the visual representation of the scene. While this semantic matching model alone explains a considerable portion of search behavior, a weighted combination with saliency-based models achieves better predictions. These results suggest that our overt attention during search is constrained not only by the task at hand but also by the task-orthogonal properties of the visual world.
Submission Type: Extended Abstract
Submission Number: 23
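For readers who want a concrete picture of the semantic matching idea, the sketch below shows one way such a priority map could be computed with CLIP and combined with a saliency map. It is an illustrative reconstruction under stated assumptions, not the authors' implementation: the checkpoint, prompt template, patch grid, and the weight `alpha` are all hypothetical choices.

```python
# Illustrative sketch only: checkpoint, prompt, patch grid, and the
# weight `alpha` are assumptions, not the authors' implementation.
import torch
import torch.nn.functional as F
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()

def semantic_matching_map(image: Image.Image, target: str,
                          patch: int = 64, stride: int = 32) -> torch.Tensor:
    """Cosine similarity between the target's text embedding and each
    scene patch's image embedding, arranged as a 2-D priority map."""
    # Linguistic representation of the search target.
    text_in = processor(text=[f"a photo of a {target}"],
                        return_tensors="pt", padding=True)
    with torch.no_grad():
        text_emb = F.normalize(model.get_text_features(**text_in), dim=-1)

    # Visual representations of overlapping scene patches.
    w, h = image.size
    crops, rows = [], 0
    for top in range(0, h - patch + 1, stride):
        rows += 1
        for left in range(0, w - patch + 1, stride):
            crops.append(image.crop((left, top, left + patch, top + patch)))
    img_in = processor(images=crops, return_tensors="pt")
    with torch.no_grad():
        img_emb = F.normalize(model.get_image_features(**img_in), dim=-1)

    # High values where the scene semantically matches the target.
    sims = img_emb @ text_emb.T            # (n_patches, 1)
    return sims.view(rows, -1)

def combined_priority(match_map: torch.Tensor, saliency: torch.Tensor,
                      alpha: float = 0.7) -> torch.Tensor:
    """Weighted combination of semantic matching with a task-orthogonal
    saliency map (`alpha` is a hypothetical weight)."""
    s = F.interpolate(saliency.float()[None, None], size=match_map.shape,
                      mode="bilinear", align_corners=False)[0, 0]
    return alpha * match_map + (1.0 - alpha) * s
```

In a study like the one described, the combination weight would presumably be fit to the fixation data rather than fixed, and the coarse patch grid replaced by whatever spatial resolution the saliency model provides.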