Visual Question Answering Driven Eye Tracking Paradigm for Identifying Children with Autism Spectrum Disorder

Published: 20 Jul 2024, Last Modified: 21 Jul 2024, MM2024 Poster, Readers: Everyone, License: CC BY 4.0
Abstract: Eye tracking is a non-contact method whose data can be used to diagnose Autism Spectrum Disorder (ASD) by comparing eye movements between people with ASD and healthy people. However, existing works mainly employ a simple free-viewing or visual search paradigm with restricted or unnatural stimuli, and collect the gaze patterns of adults or of children with an average age of 6 to 8 years, which hinders the early diagnosis and intervention of preschool children with ASD. In this paper, we propose a novel method for identifying children with ASD that has three distinguishing features. First, we design a novel eye-tracking paradigm that records Visual Question Answering (VQA) driven gaze patterns in complex natural scenes as a powerful guide for differentiating children with ASD. Second, we contribute a carefully designed dataset, named VQA4ASD, of VQA-driven eye-tracking data collected from 2-to-6-year-old children with ASD and healthy children. To the best of our knowledge, this is the first dataset focusing on the early diagnosis of preschool children, and it could help the community understand and explore the visual behaviors of children with ASD. Third, we develop a VQA-guided cooperative ASD screening network (VQA-CASN), in which both task-agnostic and task-specific visual scanpaths are explored simultaneously for ASD screening. Extensive experiments demonstrate that the proposed VQA-CASN achieves competitive performance with the proposed VQA-driven eye-tracking paradigm. The code and dataset will be publicly available.
Primary Subject Area: [Engagement] Emotional and Social Signals
Relevance To Conference: This work proposes a novel Visual Question Answering (VQA) driven paradigm that combines images and question answering for ASD diagnosis. Compared to the image free-viewing paradigm published at MMSys ’19, it can reveal the subject's interaction ability and improve diagnostic performance. Moreover, this work constructs a dataset that can substantially enrich studies of automatic computer-aided diagnosis and multimedia applications.
Supplementary Material: zip
Submission Number: 2948