Your Actions Talk: DUET - A Multimodal Dataset for Contextualizable Dyadic Activities

27 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: Dyadic activity datasets, dyadic human activity recognition, contextualization, kinesics
TL;DR: A dyadic human activity dataset that supports contextualization and the extraction of kinesics.
Abstract: Human activity recognition (HAR) has advanced significantly with the availability of diverse datasets, yet the field remains limited by a scarcity of datasets focused on two-person, or ''dyadic,'' interactions. Existing datasets primarily cater to single-person activities, overlooking the complex dynamics and contextual dependencies present in interactions between two individuals. Failing to extend HAR to dyadic settings limits opportunities to advance areas like collaborative learning, healthcare, robotics, augmented reality, and psychological assessments, which require an understanding of interpersonal dynamics. To address this gap, we introduce the Dyadic User Engagement dataseT (DUET), a comprehensive dataset designed to enhance the understanding and recognition of dyadic activities. DUET comprises 14,400 video samples across 12 interaction classes, representing the highest sample-to-class ratio among dyadic datasets known to date. Each sample is recorded using RGB, depth, infrared, and 3D skeleton joints, ensuring a robust dataset for multimodal analysis. Critically, DUET features a taxonomization of interactions based on five fundamental communication functions: emblems, illustrators, affect displays, regulators, and adaptors. This classification, rooted in psychology, supports dyadic human activity contextualization by extracting the embedded semantics of bodily movements. Data collection was conducted at three locations using a novel technique that captures interactions from multiple views with a single camera, thereby improving model resilience against background noise and view variations. We benchmark six state-of-the-art, open-source HAR algorithms on DUET, demonstrating the dataset's complexity and current HAR models' limitations in recognizing dyadic interactions. Our results highlight the need for further research into multimodal and context-aware HAR for dyadic interactions, and provide a dataset to support this advancement. DUET is publicly available at \url{https://huggingface.co/datasets/Anonymous-Uploader1/DUET}, providing a valuable resource for the research community to advance HAR in dyadic settings.
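Since the abstract points to a Hugging Face dataset repository, the following is a minimal sketch of how one might fetch DUET locally. It assumes only the repository ID given in the abstract; the actual file layout, modality folders, and label format are not specified here, so consult the dataset card before building a loader on top of it.

```python
# Minimal sketch: download the DUET dataset repository from the Hugging Face Hub.
# Assumption: the repo ID matches the URL in the abstract and is public; the
# internal structure (per-modality folders, annotation files) is not documented
# in this section and must be checked against the dataset card.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="Anonymous-Uploader1/DUET",  # repo named in the abstract
    repo_type="dataset",                 # hosted as a dataset repository
)
print("DUET files downloaded to:", local_dir)
```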
Primary Area: datasets and benchmarks
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10431