ConfLab: A Data Collection Concept, Dataset, and Benchmark for Machine Analysis of Free-Standing Social Interactions in the Wild
Keywords: Social Human Behavior, In-the-Wild Dataset, Free-standing Conversations
Abstract: Recording the dynamics of unscripted human interactions in the wild is challenging due to the delicate trade-offs between several factors: participant privacy, ecological validity, data fidelity, and logistical overheads. To address these, following a 'datasets for the community by the community' ethos, we propose the Conference Living Lab (ConfLab): a new concept for multimodal multisensor data collection of in-the-wild free-standing social conversations. For the first instantiation of ConfLab described here, we organized a real-life professional networking event at a major international conference. Involving 48 conference attendees, the dataset captures a diverse mix of status, acquaintance, and networking motivations. Our capture setup improves upon the data fidelity of prior in-the-wild datasets while retaining privacy sensitivity: 8 videos (1920x1080, 60 fps) from a non-invasive overhead view, and custom wearable sensors with onboard recording of body motion (full 9-axis IMU), privacy-preserving low-frequency audio (1250 Hz), and Bluetooth-based proximity. Additionally, we developed custom solutions for distributed hardware synchronization at acquisition, and time-efficient continuous annotation of body keypoints and actions at high sampling rates. Our benchmarks showcase some of the open research tasks related to in-the-wild privacy-preserving social data analysis: keypoints detection from overhead camera views, skeleton-based no-audio speaker detection, and F-formation detection.
Author Statement: Yes
TL;DR: We propose ConfLab (Conference Living Lab) as a new concept for in-the-wild recording of real-life social human behavior, and provide a dataset from the first edition of ConfLab at a major international conference.
URL: https://doi.org/10.4121/c.6034313
Open Credentialized Access: Our data involves human participants and requires the signing of an End User License Agreement (available at https://doi.org/10.4121/20016194). A signed copy of the EULA needs to be sent to "SPCLabDatasets-insy@tudelft.nl". Once the request is approved by a member of the TUDelft Human Research Ethics Committee or a member of the administrative staff, private access links to download the parts of the dataset under embargo will be emailed to the requester. The process is also described on the main landing page of the dataset.
Dataset Url: The dataset is available at: https://doi.org/10.4121/c.6034313
The End User License Agreement (EULA) and Datasheet are public, while the components involving data from the human participants are under restricted access. To access these, a requester must send a signed copy of the EULA (available at https://doi.org/10.4121/20016194) to "SPCLabDatasets-insy@tudelft.nl". Once the request is approved by a member of the TUDelft Human Research Ethics Committee or a member of the administrative staff, private access links to download these parts of the dataset under embargo will be emailed to the requester. The process is also described on the main landing page of the dataset.
For the review process, these requests (and the signed EULAs) are not sent to the authors, so the identities of the reviewers and the single-blind nature of the review process are protected.
License: The dataset itself is available under restricted access defined by an End-User License Agreement (EULA). The EULA itself is available under a CC0 license. The code (https://github.com/TUDelft-SPC-Lab/conflab) for the benchmark baseline tasks, and the schematics and data associated with the design of our custom wearable sensor, called the Midge (https://github.com/TUDelft-SPC-Lab/spcl_midge_hardware), are available under the MIT License. The licenses are all specified in their respective repositories.
Supplementary Material: pdf
Contribution Process Agreement: Yes
In Person Attendance: Yes