EgoLink: Egocentric Language-Vision Interactive Network Knowledge Challenge

Yueying Feng; Bohan Yu; Renhe Sun

Published: 03 Apr 2026, Last Modified: 03 Apr 2026 | ACMMM2026-MGC-Proposal | Everyone | CC BY 4.0

Keywords: Egocentric Social Reasoning, First-Person Video Understanding, Embodied AI, Causal Inference, Multimodal Large Language Models

TL;DR: An egocentric video reasoning challenge evaluating social intelligence through multi-dimensional causal and intent inference.

Abstract: The Egocentric Language-Vision Interactive Network Knowledge (EgoLink) Challenge redefines the cognitive boundaries of embodied agents in social contexts. Although Embodied AI ultimately aims to perceive and interact from an egocentric perspective, current research predominantly emphasizes physical navigation and neglects deep social understanding. EgoLink introduces a large-scale, real-world egocentric benchmark that uses a multi-dimensional Multiple-Choice Question (MCQ) format to evaluate models' reasoning about emotions, causal logic, and behavioral intents in human interactions. The challenge bridges the gap between perception and social cognition, advancing Embodied AI toward socially aware general intelligence.

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Submission Number: 10
