Abstract: Highlights
• We propose a model-agnostic network that makes better use of external texts.
• We solve the zero-shot and long-tail problems of rare relations in scene graph generation.
• Our method performs better than baseline methods and other text-supervised methods.