Characterizing young children’s everyday activities using video question-answering models

Published: 23 Sept 2025, Last Modified: 06 Dec 2025
Venue: DBM 2025 Findings Poster
License: CC BY 4.0
Abstract: Children are remarkably efficient learners compared to our most advanced computational models of learning. One key difference is that children seem to leverage regularities in the activities in which they participate (e.g., $\textit{eating}$) to learn about words or objects (e.g., "pomegranate"), even under skewed, long-tailed distributions. While everyday activities have long been theorized to be important supports for children's learning, an understanding of the types, frequencies, and rhythms of these activities has been out of reach, due both to a lack of naturalistic video datasets and to the need for manual annotation. Here, we use the recent release of a large, egocentric dataset of children's everyday experience (BabyView; $N$=31 children, $N$=868 hours) and capitalize on innovations in video question-answering (VideoQA) models to quantify the $\textit{what}$ and $\textit{where}$ of children's everyday experiences. Using these models, we classify both the activities (e.g., $\textit{eating, dancing, exploring}$) and the physical locations (e.g., $\textit{living room, garage}$) in the infant view, and we generate natural-language descriptions for contiguous 10-second videos across the entire dataset. We provide convergent validity for our classifications by recovering expected trends (e.g., a high frequency of $\textit{play}$ in the $\textit{living room}$ in this dataset). Further, our analyses highlight the variability in children's everyday activities across locations and across time. Compared with prior work analyzing static image content, our work highlights the advances made possible by using VideoQA models to analyze the dynamic nature of children's experiences. A better understanding of how children learn in everyday contexts should inform developmentally inspired models of early learning and cognitive development.
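As a rough illustration of the bookkeeping the abstract implies, the sketch below cuts a recording into contiguous 10-second windows and tallies how often each activity and location pair occurs. It is a minimal sketch, not the authors' pipeline: the function names, the choice to drop a trailing partial clip, and the example labels are all assumptions made for illustration, and the VideoQA model that would produce the labels is not shown.

```python
from collections import Counter

CLIP_SECONDS = 10  # clip length stated in the abstract


def clip_windows(duration_s, clip_s=CLIP_SECONDS):
    """Return (start, end) times for contiguous, non-overlapping clips.

    Assumption: a trailing partial clip shorter than clip_s is dropped.
    """
    n_clips = int(duration_s // clip_s)
    return [(i * clip_s, (i + 1) * clip_s) for i in range(n_clips)]


def pair_proportions(labels):
    """Map each (activity, location) pair to its share of all labeled clips."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {pair: count / total for pair, count in counts.items()}


# Hypothetical per-clip labels, standing in for VideoQA model output.
example_labels = [
    ("play", "living room"),
    ("play", "living room"),
    ("eating", "kitchen"),
    ("exploring", "garage"),
]

windows = clip_windows(45)  # a 45-second recording yields four full clips
proportions = pair_proportions(example_labels)
```

Normalizing counts into proportions is one simple way to compare activity frequencies across locations; analyses per child or per time of day would group the labels before tallying.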
Length: long paper (up to 8 pages)
Domain: methods
Author List Check: The author list is correctly ordered and I understand that additions and removals will not be allowed after the abstract submission deadline.
Anonymization Check: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and URLs that point to identifying information.
Submission Number: 50