Learning from THEODORE: A Synthetic Omnidirectional Top-View Indoor Dataset for Deep Transfer Learning

Abstract: Recent work on synthetic indoor datasets from perspective views has shown significant improvements in object detection results with Convolutional Neural Networks (CNNs). In this paper, we introduce THEODORE: a novel, large-scale indoor dataset containing 100,000 high-resolution, diversified fisheye images with 16 classes. To this end, we create 3D virtual environments of living rooms, different human characters, and interior textures. Besides capturing fisheye images from the virtual environments, we create annotations for semantic segmentation, instance masks, and bounding boxes for object detection. We compare our synthetic dataset to state-of-the-art real-world datasets for omnidirectional images. Starting from MS COCO weights, we show that our dataset is well suited for fine-tuning CNNs for object detection and semantic segmentation. Thanks to the high generalization of our models achieved through image synthesis and domain randomization, we reach an AP of up to 0.90 for the class person on our own annotated Fisheye Evaluation Suite (FES). Additionally, we evaluate six classes on FES for both object detection and semantic segmentation. The segmentation task reaches an mIoU of 0.36 over all classes, and object detection reaches an mAP of 0.61.
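The abstract describes fine-tuning detection CNNs from MS COCO weights on the 16 THEODORE classes. The paper page does not specify the framework, so the following is only a minimal sketch assuming torchvision's Faster R-CNN; the model choice, the `NUM_CLASSES` constant, and the use of a background class are assumptions for illustration, not the authors' exact setup.

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Assumption: 16 THEODORE classes plus the background class that
# torchvision's detection heads expect.
NUM_CLASSES = 16 + 1

# Start from COCO-pretrained weights, as the abstract describes, then
# replace the box predictor so the head matches the new label set.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

# Fine-tuning then proceeds as usual; only the data loader needs to
# yield fisheye images and bounding-box targets in torchvision's format.
```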