Keywords: Object detection, deep neural networks, data augmentation, limited data, augmentation policy
Abstract: Recent progress in pre-trained models trained on large-scale datasets has highlighted the need for robust protocols that adapt them effectively to domain-specific data, especially when little data is available. Data augmentation can play a critical role in enabling data-efficient fine-tuning of pre-trained object detection models. Choosing the right augmentation policy for a given dataset is challenging and relies on knowledge of task-relevant invariances. In this work, we focus on an understudied aspect of this problem: can bounding box annotations be used to design more effective augmentation policies? With InterAug, we make a critical finding: the annotations can be used to infer the effective context for each object in a scene, so that augmentations are applied within that context rather than to the entire scene or only within the pre-specified bounding boxes. Through a rigorous empirical study with multiple benchmarks and architectures, we demonstrate the efficacy of InterAug in improving robustness, handling data scarcity, and remaining resilient to high diversity in background context. Finally, InterAug can be used with any off-the-shelf policy, requires no modification to the model architecture, and significantly outperforms existing protocols.
Submission Number: 25
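
The abstract's central idea is to augment a region around each object, rather than the whole scene or the bounding box alone. The sketch below illustrates that idea only. The abstract does not say how InterAug infers the effective context, so a fixed-margin expansion of each box is used here as a stand-in assumption. The names `context_region`, `augment_in_context`, `brighten`, and the `margin` parameter are hypothetical and not taken from the paper. A photometric transform is used so the box coordinates stay valid.

```python
import numpy as np

def context_region(box, margin, height, width):
    """Expand an (x1, y1, x2, y2) box by `margin` times its size, clipped to the image.

    Assumption: a fixed relative margin stands in for the inferred effective context.
    """
    x1, y1, x2, y2 = box
    dx = int(round(margin * (x2 - x1)))
    dy = int(round(margin * (y2 - y1)))
    return (max(0, x1 - dx), max(0, y1 - dy),
            min(width, x2 + dx), min(height, y2 + dy))

def augment_in_context(image, boxes, transform, margin=0.25):
    """Apply a photometric `transform` only inside each object's expanded region."""
    out = image.copy()
    height, width = image.shape[:2]
    for box in boxes:
        cx1, cy1, cx2, cy2 = context_region(box, margin, height, width)
        # Read from the original image so overlapping regions are not transformed twice.
        out[cy1:cy2, cx1:cx2] = transform(image[cy1:cy2, cx1:cx2])
    return out

def brighten(patch, delta=40):
    """Example off-the-shelf style transform: add a constant brightness offset."""
    return np.clip(patch.astype(np.int16) + delta, 0, 255).astype(np.uint8)

image = np.zeros((100, 100, 3), dtype=np.uint8)
boxes = [(40, 40, 60, 60)]
augmented = augment_in_context(image, boxes, brighten, margin=0.25)
```

Any per-patch transform with the same input and output shape could be passed in place of `brighten`, which mirrors the abstract's claim that the approach works with an off-the-shelf policy. Geometric transforms would also need to update the box coordinates, which this sketch does not attempt.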