Photogrammetry and VR for comparing 2D and immersive linguistic data collection (student abstract)

Jacob Rubinstein, Cynthia Matuszek

Published: 15 Jul 2024, Last Modified: 13 Nov 2025 | AAAI23 | Everyone | CC BY 4.0

Abstract: The overarching goal of this work is to enable the collection of language describing a wide variety of objects viewed in virtual reality. Using photogrammetry, we aim to create full 3D models from a small number of ‘keyframe’ images of objects found in the publicly available Grounded Language Dataset (GoLD). We will then collect linguistic descriptions by placing our models in virtual reality and having volunteers describe them. To evaluate the impact of virtual reality immersion on linguistic descriptions of the objects, we intend to apply contrastive learning to perform grounded language learning, then compare the descriptions collected from images (in GoLD) with those collected from our models.
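
The abstract does not say which contrastive objective is intended. As a minimal sketch only, assuming an InfoNCE-style loss over paired description and object embeddings (one common choice, not necessarily the authors'), the idea can be illustrated as follows. The names `cosine`, `info_nce`, `lang_embs`, `vis_embs`, and `temperature` are hypothetical and do not come from the paper or from GoLD.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(lang_embs, vis_embs, temperature=0.1):
    """Mean InfoNCE loss, assuming lang_embs[i] describes vis_embs[i].

    Every other visual embedding in the batch serves as a negative
    for description i. The temperature value is an assumption.
    """
    total = 0.0
    for i, lang in enumerate(lang_embs):
        logits = [cosine(lang, vis) / temperature for vis in vis_embs]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability of the matching (positive) pair.
        total += log_denominator - logits[i]
    return total / len(lang_embs)


# Toy embeddings: two descriptions and two objects in a 2-D space.
descriptions = [[1.0, 0.0], [0.0, 1.0]]
aligned_objects = [[1.0, 0.0], [0.0, 1.0]]
shuffled_objects = [[0.0, 1.0], [1.0, 0.0]]

aligned_loss = info_nce(descriptions, aligned_objects)
shuffled_loss = info_nce(descriptions, shuffled_objects)
```

Under this sketch, a model trained on descriptions of 2D images and one trained on descriptions of VR models could each be scored by how well matching pairs are separated from mismatched ones, which gives one possible basis for the comparison the abstract proposes.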