Scene Graph Parsing via Abstract Meaning Representation in Pre-trained Language Models

Woo Suk Choi; Yu-Jung Heo; Dharani Punithan; Byoung-Tak Zhang

Published: 08 Jun 2022, Last Modified: 05 May 2023, DLG4NLP 2022 Oral, Readers: Everyone

Keywords: semantic parsing, textual scene graph parsing, abstract meaning representation

TL;DR: To parse scene graphs (i.e., high-level semantics) from textual descriptions of images, we propose the application of abstract meaning representation (AMR) with pre-trained language models to

Abstract: In this work, we propose applying abstract meaning representation (AMR) based semantic parsing models to parse textual descriptions of a visual scene into scene graphs; to the best of our knowledge, this is the first work to do so. Previous works examined scene graph parsing from textual descriptions using dependency parsing and left the AMR parsing approach as future work, since applying AMR requires sophisticated methods. We therefore use pre-trained AMR parsing models to parse the region descriptions of visual scenes (i.e., images) into AMR graphs, and pre-trained language models (PLMs), BART and T5, to parse the AMR graphs into scene graphs. The experimental results show that our approach explicitly captures high-level semantics from textual descriptions of visual scenes, such as objects, attributes of objects, and relationships between objects. Our textual scene graph parsing approach outperforms the previous state-of-the-art results by 9.3% in the SPICE metric score.
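The two-stage approach described in the abstract (description to AMR graph, then AMR graph to scene graph) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `SceneGraph`, `parse_scene_graph`, `tuple_f1`, and both stub stages are hypothetical names introduced here. In the paper the first stage would be a pre-trained AMR parser and the second a pre-trained language model such as BART or T5; the stubs below only return hand-written values. The scoring function is an exact-match simplification of SPICE, which computes an F-score over scene graph tuples but also matches synonyms.

```python
from dataclasses import dataclass, field
from typing import Callable, Set, Tuple


@dataclass
class SceneGraph:
    """Objects, (object, attribute) pairs, and (subject, relation, object) triples."""
    objects: Set[str] = field(default_factory=set)
    attributes: Set[Tuple[str, str]] = field(default_factory=set)
    relations: Set[Tuple[str, str, str]] = field(default_factory=set)

    def tuples(self) -> Set[tuple]:
        """Flatten the graph into one set of 1-, 2-, and 3-element tuples."""
        return {(o,) for o in self.objects} | self.attributes | self.relations


def parse_scene_graph(
    description: str,
    text_to_amr: Callable[[str], str],
    amr_to_graph: Callable[[str], SceneGraph],
) -> SceneGraph:
    """Two-stage pipeline: description -> AMR string -> scene graph."""
    amr = text_to_amr(description)
    return amr_to_graph(amr)


def tuple_f1(predicted: SceneGraph, reference: SceneGraph) -> float:
    """Exact-match F1 over tuples (a simplification of SPICE: no synonym matching)."""
    pred, ref = predicted.tuples(), reference.tuples()
    matched = len(pred & ref)
    if matched == 0:
        return 0.0
    precision = matched / len(pred)
    recall = matched / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical stand-ins for the two pre-trained models (assumptions, hand-written).
def stub_text_to_amr(text: str) -> str:
    # Illustrative AMR in PENMAN notation for "a black cat sitting on a mat".
    return "(s / sit-01 :ARG1 (c / cat :mod (b / black)) :ARG2 (m / mat))"


def stub_amr_to_graph(amr: str) -> SceneGraph:
    # A real second stage would generate this from the AMR string.
    return SceneGraph(
        objects={"cat", "mat"},
        attributes={("cat", "black")},
        relations={("cat", "sit on", "mat")},
    )


predicted = parse_scene_graph(
    "a black cat sitting on a mat", stub_text_to_amr, stub_amr_to_graph
)
reference = SceneGraph(
    objects={"cat", "mat"},
    attributes={("cat", "black")},
    relations={("cat", "on", "mat")},
)
score = tuple_f1(predicted, reference)
```

Passing the two stages as callables keeps the sketch independent of any particular AMR parser or language model library; either stage could be swapped for a real model without changing the pipeline or the scoring code.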
