Sequence-Based Identification of First-Person Camera Wearers in Third-Person Views

Published: 13 May 2026, Last Modified: 13 May 2026
Venue: CV4Edu - Computer Vision for Education (CVPR 2026)
License: CC BY 4.0
Keywords: Immersive learning, virtual reality, egocentric vision, dataset
Abstract: As immersive education and collaborative environments continue to advance, understanding complex student interactions within these shared spaces has become increasingly important. While large-scale datasets like Ego4D and Ego-Exo4D have advanced egocentric vision research, they lack the rich, multi-user interactions critical for collaborative learning and robotics. To address this gap, we introduce TF2025, an expanded dataset featuring synchronized first- and third-person views of actors, enhanced synchronization, and multiple train-test splits. We also propose a sequence-based approach for identifying first-person camera wearers in broader third-person views. By leveraging motion cues and person re-identification, our method improves robustness and significantly outperforms state-of-the-art approaches. This work advances the analysis of multi-camera interactions in embodied vision and education. The code and dataset will be made publicly available upon acceptance.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Track: Proceeding Track
Submission Number: 6