Ablation Studies on “TextPSG: Panoptic Scene Graph Generation from Textual Descriptions - ICCV 2023”
Keywords: TextPSG, Panoptic Scene Graph Generation, Computer Vision, Transformers
TL;DR: In this paper, we perform a reproducibility study of the paper “TextPSG: Panoptic Scene Graph Generation from Textual Descriptions - ICCV 2023”.
Abstract: Semantic representation and grouping of objects are critical to deciphering image scenes. Traditional end-to-end models often employ a top-down approach, extracting and segmenting images from pixel annotations; this approach is costly and tedious, which leads to limited datasets that are hard to obtain. In contrast, more recent models such as TextPSG aim to eliminate this problem by leveraging large, pre-existing datasets of image-caption pairs to generate Panoptic Scene Graphs (PSGs), requiring no pre-existing location priors, explicit links between visual and textual entities, or concept sets. In this work, we aim to reproduce TextPSG's claims in order to (1) determine the ease of reproducibility and (2) perform ablation studies to discover the most impactful parameters of the model.
Submission Number: 2