Abstract: Visual scene understanding has long been one of the major goals of computer vision. However, existing work has focused on object-level understanding, which limits the visual questions that can be answered. The goal of this paper is to invite collective effort toward entity-level understanding of images by releasing the ECO datasets and baselines for this task.