Abstract: Manually annotating bounding boxes for object detection is time-consuming. Although human annotation is the most accurate approach, machine learning models can assist the annotator. In this paper, we propose a human-in-the-loop automatic image labeling framework focusing on aerial images, which offer fewer features for detection. The proposed model consists of two main parts: a prediction model and an adjustment model. The user first provides a click location to the prediction model, which generates a bounding box for the specific object. The bounding box is then refined by the adjustment model for a more accurate size and location. A feedback-and-retrain mechanism allows users to manually adjust the generated bounding box and provide feedback that incrementally trains the adjustment network at runtime. This online learning feature enables the user to generalize an existing model to target classes not initially present in the training set, and gradually improves the specificity of the model to those new targets. We demonstrate promising results on the Neovision 2 Heli dataset. Compared to the state-of-the-art method, our prediction model achieves a higher detection rate, and our adjustment model improves the IOU by up to 45%.
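For readers unfamiliar with the IOU (intersection over union) metric cited above, the following is a minimal sketch of how it is commonly computed for two axis-aligned bounding boxes. The function name `iou` and the `(x_min, y_min, x_max, y_max)` box format are assumptions made for illustration; the abstract does not specify the box representation or evaluation code used in the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are assumed (for illustration) to be (x_min, y_min, x_max, y_max).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)

    # Clamp at zero so disjoint boxes yield zero intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0
```

Under this definition, a predicted box that exactly matches the ground truth scores 1.0 and a box with no overlap scores 0.0, so an adjustment step that tightens a box around the object raises the score.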