Abstract: A central problem in visual navigation is poor generalization: agents struggle to find a given target object in unexplored environments without the help of auxiliary sensors. We propose to address this problem by incorporating object spatial scene priors and relational reasoning over visible objects. To obtain more accurate ground-truth environment priors, we construct scene graph priors specific to indoor navigation, which provide rich object spatial relationships that help locate target objects through object relation detection. Furthermore, to imitate how humans reason when searching for objects, we encode our scene graph priors with a Markov model for relational reasoning and fuse them into the reinforcement learning policy network, which improves model generalization in novel scenes. We perform experiments in the AI2THOR virtual environment and outperform the current state of the art on average in both SPL (Success weighted by Path Length) and success rate.
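
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how SPL and success rate are commonly computed in the embodied navigation literature: over N episodes, SPL averages the success indicator multiplied by the ratio of the shortest path length to the larger of the shortest and actual path lengths. The function names, argument names, and the handling of zero-length episodes are illustrative assumptions, not taken from the paper's code.

```python
from typing import Sequence


def success_rate(successes: Sequence[bool]) -> float:
    """Fraction of episodes in which the agent reached the target."""
    if len(successes) == 0:
        raise ValueError("at least one episode is required")
    return sum(1 for s in successes if s) / len(successes)


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Success weighted by Path Length, averaged over episodes.

    Per episode: S_i * l_i / max(p_i, l_i), where S_i is the success
    indicator, l_i the shortest path length, and p_i the path length
    the agent actually took.
    """
    n = len(successes)
    if n == 0:
        raise ValueError("at least one episode is required")
    if not (n == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for success, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        if not success:
            continue  # failed episodes contribute zero
        denom = max(taken, shortest)
        # Assumption: an episode that starts at the target (zero length)
        # counts as perfectly efficient.
        total += shortest / denom if denom > 0 else 1.0
    return total / n


if __name__ == "__main__":
    # Hypothetical episodes, for illustration only.
    outcomes = [True, True, False]
    shortest = [2.0, 3.0, 4.0]
    taken = [4.0, 3.0, 10.0]
    print(success_rate(outcomes), spl(outcomes, shortest, taken))
```

Because a successful episode scores at most 1 and a failed one scores 0, SPL never exceeds the success rate; the gap between the two indicates how much longer than optimal the successful paths were.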