AMFT-YOLO: An Adaptive Multi-scale YOLO Algorithm with Multi-level Feature Fusion for Object Detection in UAV Scenes
Abstract: Object detection in images captured by unmanned aerial vehicles (UAVs) often faces challenges such as objects that are too small for effective feature extraction and complex backgrounds that introduce substantial noise interference. To address these challenges, we present a novel object detection algorithm, termed Adaptive Multi-Scale Feature Tower-YOLO (AMFT-YOLO). First, we propose a detection head for tiny objects, the Task-decoupled Attention Head (TA-head), designed to mitigate the conflict between the feature-map requirements of the localization and classification tasks, which allows the limited feature information to be used efficiently. Second, we propose the Bidirectional Multi-Scale Skip Fusion (BMSF) module, which addresses the loss of information about tiny objects by aggregating global and local features. Third, to reduce the noise interference caused by complex backgrounds, we introduce the Dilated Convolutional Spatial Pyramid Pooling (DCSPP) module, which mitigates the effects of cluttered backgrounds through adaptive feature fusion. The effectiveness of our method is validated on the VisDrone and UAVDT datasets. Further ablation experiments and comparisons with other methods confirm the robustness and adaptability of our method.