LLM-DETR: An Enhanced DETR with a Large Language Model-Inspired Attention Mechanism for Object Detection

Kuan-Hsien Liu, Mingru Wang, Tsung-Jung Liu

Published: 2025, Last Modified: 21 Apr 2026, SMC 2025, CC BY-SA 4.0
Abstract: We propose LLM-DETR, an enhanced version of the DETR (DEtection TRansformer) framework that integrates advanced attention mechanisms inspired by large language models (LLMs) for object detection. On the MS COCO 2017 dataset, LLM-DETR achieves a 5% increase in average precision (AP) over the original DETR while training efficiently on GPUs. This improvement underscores the potential of LLM-inspired attention mechanisms for advancing object detection accuracy in various domains. The code for LLM-DETR is publicly available on GitHub: https://github.com/mingruWang/LLM-DETR.