Visual Graphs from Motion (VGfM): Scene understanding with object geometry reasoning

Paul Gay, James Stuart, Alessio Del Bue

17 Feb 2020, OpenReview Archive Direct Upload. Readers: Everyone

Abstract: Recent approaches to visual scene understanding attempt to build a scene graph – a computational representation of objects and their pairwise relationships. Such a rich semantic representation is very appealing, yet difficult to obtain from a single image, especially when considering complex spatial arrangements in the scene. In contrast, an image sequence conveys useful information through the multi-view geometric relations arising from camera motion. Indeed, object relationships are naturally related to the 3D scene structure. To this end, this paper proposes a system that first computes the geometrical location of objects in a generic scene and then efficiently constructs scene graphs from video by embedding such geometrical reasoning. Such a compelling representation is obtained using a new model in which geometric and visual features are merged using an RNN framework. We report results on a dataset we created for the task of 3D scene graph generation in multiple views.
