TOAM-YOLO: A Tiny Object-Aware Multi-Expert YOLO Framework for Diverse Domains

20 Apr 2026 (modified: 23 Apr 2026), Under review for TMLR, CC BY 4.0
Abstract: YOLO-based object detection models have advanced significantly over the years through continuous architectural refinements and the performance improvements that followed. Tiny object detection nevertheless remains challenging, both because progressive downsampling in model architectures discards fine spatial detail and because tiny objects occupy only a small footprint in high-resolution images. This challenge arises in diverse applications such as maritime surveillance, aerial surveillance, and medical tasks such as microscopic blood cell analysis. In this study, we introduce a novel multi-domain expert module, which we refer to as TOA-MoE (Tiny Object-Aware Mixture of Experts), consisting of a Hessian-based curvature expert and a Fourier-based frequency expert along with a 3-level attention mechanism, and we demonstrate that it substantially improves the detection performance of YOLO models while increasing the learnable parameters by only a small fraction. Additionally, we add a feature fusion network that incorporates a BiFPN-style structure and integrates deformable convolutional layer modules into the architecture, and we replace the standard up-sampling layers with a Content-Aware Reassembly of Features (CARAFE) module to preserve fine-grained feature details during feature map expansion. We systematically demonstrate the plug-and-play capability of these changes on YOLOv11 and YOLOv12 models. The resulting Tiny Object-Aware Mixture-of-Experts YOLO (TOAM-YOLO) achieves state-of-the-art performance on five datasets: three tiny object benchmarks (SeaPerson, TinyPerson, VisDrone), with mAP@0.5 improvements of 11.6%, 3.3%, and 10% respectively, and two blood cell datasets (BCCD, CBC), with mAP@0.5:0.95 improvements of 3.9% and 1.7% for platelet detection. All improvements are measured relative to YOLOv12n, and the method adds only 0.75M parameters.
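To make the two cues named in the abstract concrete, the following is a minimal, illustrative sketch in plain Python. It is not the authors' TOA-MoE module: the function names (`hessian_det`, `high_freq_energy`, `mix`), the finite-difference Hessian, the naive 2D DFT, the `cutoff` parameter, and the softmax gate are all assumptions chosen for illustration. It only shows why a curvature response and a high-frequency energy measure both react strongly to a small, isolated blob.

```python
import cmath
import math


def hessian_det(img):
    """Determinant-of-Hessian response via central finite differences.

    Hypothetical helper: border pixels are left at 0.0.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dxx = img[y][x + 1] - 2 * img[y][x] + img[y][x - 1]
            dyy = img[y + 1][x] - 2 * img[y][x] + img[y - 1][x]
            dxy = (img[y + 1][x + 1] - img[y + 1][x - 1]
                   - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
            out[y][x] = dxx * dyy - dxy * dxy
    return out


def high_freq_energy(img, cutoff=1):
    """Fraction of spectral energy outside a low-frequency square.

    Uses a naive O(N^4) 2D DFT, so it is only suitable for tiny patches.
    `cutoff` is an assumed parameter: bins whose wrapped frequency index
    exceeds it along either axis count as high frequency.
    """
    h, w = len(img), len(img[0])
    total = 0.0
    high = 0.0
    for u in range(h):
        for v in range(w):
            s = sum(
                img[y][x] * cmath.exp(-2j * math.pi * (u * y / h + v * x / w))
                for y in range(h) for x in range(w)
            )
            energy = abs(s) ** 2
            total += energy
            fu, fv = min(u, h - u), min(v, w - v)  # wrapped frequency index
            if max(fu, fv) > cutoff:
                high += energy
    return high / total if total else 0.0


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix(expert_scores, gate_logits):
    """Softmax-gated weighted sum of per-expert scores (assumed gating form)."""
    weights = softmax(gate_logits)
    return sum(wt * sc for wt, sc in zip(weights, expert_scores))


# A 5x5 patch containing a single bright pixel, standing in for a tiny object.
patch = [[0.0] * 5 for _ in range(5)]
patch[2][2] = 1.0
curvature = hessian_det(patch)[2][2]
frequency = high_freq_energy(patch)
combined = mix([curvature, frequency], [0.0, 0.0])
```

In this toy setting the curvature response peaks at the bright pixel and the patch carries most of its energy in high-frequency bins, whereas a flat patch scores near zero on both. A learned gate, as opposed to the fixed logits used here, would let a network weight the two cues per input.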
Submission Type: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=HbkSIDGyej&nesting=2&sort=date-desc
Changes Since Last Submission: The following changes have been made since the last submission, based on the valuable input from the Editors-in-Chief:
1. Structural changes: We have restructured the paper by moving the figures and tables to more natural places in the text.
2. Figure and table citations: We have improved readability by citing the figures and tables properly in the main text as well as in the appendix. All ambiguities in figure and table citations present in the previous version have been fixed.
3. Writing: We have revised some sentences to make the explanations clearer, and we have fixed some punctuation errors in the manuscript.
Assigned Action Editor: ~Yaoyao_Liu1
Submission Number: 8517