Abstract: Weakly-supervised object localization relies only on image-level labels to obtain object locations and has recently attracted increasing attention. Inspired by the human visual mechanism, in which a person searches for and localizes a region of interest by gradually narrowing the view from a wide range and ignoring unrelated background, we propose a novel weakly-supervised localization method that uses deep reinforcement learning to iteratively cut away the background around an object. The approach trains an agent as a detector that searches through the image and tries to cut off all regions that do not contribute to classification performance. We also propose an effective refinement approach that generates a heat-map by sum-pooling all feature maps and uses it to refine the location cropped by the agent. By combining the top-down cutting process with bottom-up evidence for refinement, we achieve good object localization performance in only a few steps. To the best of our knowledge, this may be the first attempt to apply deep reinforcement learning to weakly-supervised object localization. Experiments on the PASCAL VOC dataset show that our method is effective.
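As an illustration of the refinement idea only, the following is a minimal sketch of how a heat-map could be obtained by sum-pooling feature maps over the channel axis and then used to tighten a cropped box. The function names `sum_pool_heatmap` and `refine_box`, the `(C, H, W)` array layout, the relative threshold of 0.5, and the thresholding rule itself are assumptions made for this example; the abstract does not specify how the heat-map is turned into a refined location.

```python
import numpy as np

def sum_pool_heatmap(feature_maps):
    """Sum a (C, H, W) stack of feature maps over channels and scale to [0, 1]."""
    heat = feature_maps.sum(axis=0).astype(np.float64)
    heat -= heat.min()
    peak = heat.max()
    return heat / peak if peak > 0 else heat

def refine_box(heat, box, rel_threshold=0.5):
    """Shrink box (x0, y0, x1, y1), upper bounds exclusive, to the strong cells.

    Hypothetical rule: keep cells whose value is at least rel_threshold
    times the maximum inside the box, then take their tight bounding box.
    """
    x0, y0, x1, y1 = box
    window = heat[y0:y1, x0:x1]
    if window.size == 0:
        return box
    ys, xs = np.nonzero(window >= rel_threshold * window.max())
    if ys.size == 0:
        return box
    return (x0 + int(xs.min()), y0 + int(ys.min()),
            x0 + int(xs.max()) + 1, y0 + int(ys.max()) + 1)

# Toy input: 3 channels of 6x6 features with one active block.
features = np.zeros((3, 6, 6))
features[:, 2:4, 1:4] = 1.0
heat = sum_pool_heatmap(features)
refined = refine_box(heat, (0, 0, 6, 6))
```

In this toy case the box produced by the agent is stood in for by the full 6x6 grid, and the refinement shrinks it to the block where the summed activations are strongest.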