Evidential Representation Proposal for Predicate Classification Output Logits in Scene Graph Generation

Published: 01 Jan 2024, Last Modified: 31 Oct 2024. Venue: HCI (51) 2024. License: CC BY-SA 4.0
Abstract: A scene graph consists of a collection of triplets < subject, predicate, object > that describe the content of an image. One challenging problem in Scene Graph Generation (SGG) is that annotators tend to provide weakly relevant predicates, which biases models toward less informative triplet predictions. This paper focuses on the predicate classification task. We question the information processing that leads current models to infer poorly informative predicates. We argue that the set of possible predicates should not be regarded as a probability space, notably because the granularity of predicates varies, as with \(on\) and \(sitting \; on\). We propose an alternative representation of the information in the Dempster-Shafer framework, using a hierarchy constructed in a goal-oriented manner. Building on this more trustworthy representation, we propose a flexible decision-making procedure that lets us adjust the level of granularity of the predicted predicate. Our experiments, carried out on scores estimated by an existing transformer-based scene graph generation model, show that our method helps mitigate the long-tail problem.
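As a rough illustration of why a Dempster-Shafer representation can accommodate predicates of different granularity, the following minimal Python sketch places mass on sets of fine-grained predicates, so that a coarse predicate such as on is the set of its refinements. Everything in it is an assumption made for illustration and not the paper's method: the toy frame, the mass values, the names `FRAME`, `PREDICATES`, `MASS`, and the threshold rule in `decide` are all hypothetical.

```python
# Toy frame of discernment: mutually exclusive fine-grained predicates (hypothetical).
FRAME = frozenset({"sitting on", "standing on", "holding"})

# Named predicates as subsets of the frame; the coarse "on" covers its refinements.
PREDICATES = {
    "sitting on": frozenset({"sitting on"}),
    "standing on": frozenset({"standing on"}),
    "holding": frozenset({"holding"}),
    "on": frozenset({"sitting on", "standing on"}),
    "any": FRAME,  # total ignorance: no predicate can be singled out
}

# Illustrative mass function (values invented for this sketch); masses sum to 1.
MASS = {
    PREDICATES["sitting on"]: 0.25,
    PREDICATES["on"]: 0.5,
    FRAME: 0.25,
}


def belief(mass, subset):
    """Bel(A): total mass of focal sets entirely contained in A."""
    return sum(m for focal, m in mass.items() if focal <= subset)


def plausibility(mass, subset):
    """Pl(A): total mass of focal sets that intersect A."""
    return sum(m for focal, m in mass.items() if focal & subset)


def decide(mass, predicates, threshold):
    """Hypothetical rule: return the most specific predicate whose belief
    reaches the threshold (ties broken alphabetically by name)."""
    supported = [
        (len(subset), name)
        for name, subset in predicates.items()
        if belief(mass, subset) >= threshold
    ]
    return min(supported)[1]


if __name__ == "__main__":
    for name, subset in PREDICATES.items():
        print(name, belief(MASS, subset), plausibility(MASS, subset))
    print(decide(MASS, PREDICATES, 0.2))
    print(decide(MASS, PREDICATES, 0.7))
```

In this sketch, lowering the threshold yields a finer predicate and raising it yields a coarser one, which is one simple way to picture a decision procedure with an adjustable level of granularity.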