Abstract: Semantic segmentation and object detection are fundamental problems in computer vision, and both have seen a series of breakthroughs in recent years. Although existing semantic segmentation and object detection methods achieve impressive performance on some benchmarks, they focus only on local information near object regions. An image, however, usually contains rich semantic information, including scene context and dependencies between objects, and ignoring this information inevitably degrades performance. In this paper, we propose a novel network named joint semantic segmentation and object detection based on relational Mask R-CNN (RM-RCNN) to address these limitations. We design an object dependence calculation module (DCM) that models the relationships between objects from their geometric and appearance features, improving the accuracy of both semantic segmentation and object detection. We also design a cross-scale information transmission module (CSITM), which allows features at different levels to exchange information with one another. With CSITM, our method retains useful information and discards useless information, further improving performance. Experiments on two benchmark datasets demonstrate the effectiveness of the proposed network.