Organ-DETR: 3D Organ Detection Transformer with Multiscale Attention and Dense Query Matching

21 Sept 2023 (modified: 25 Feb 2024), ICLR 2024 Conference Withdrawn Submission
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Organ Detection, Representation Learning, DEtection TRansformer (DETR), Attention, Transformer, One-to-Many Matching, One-to-One Matching, Segmentation
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: This study introduces Organ-DETR with two novel modules, named MultiScale Attention (MSA) and Dense Query Matching (DQM), for boosting the performance of DEtection TRansformers (DETRs) for 3D organ detection.
Abstract: Query-based Transformers have yielded impressive results in object detection, but the potential of DETR-like methods for 3D data, especially volumetric medical imaging, remains largely unexplored. This study presents Organ-DETR, which contains two novel modules, MultiScale Attention (MSA) and Dense Query Matching (DQM), for boosting the performance of DEtection TRansformers (DETRs) in 3D organ detection. MSA introduces a top-down representation learning approach for efficient encoding of 3D visual data. Its multiscale attention architecture combines dual self-attention and cross-attention mechanisms to provide the most relevant features for DETRs, and uses the self-attention module to capture both long- and short-range spatial interactions. Organ-DETR also introduces DQM, a one-to-many matching approach that tackles the difficulties in detecting organs. DQM increases the number of positive queries, improving both recall and training efficiency without additional learnable parameters. Extensive results on five 3D Computed Tomography (CT) datasets indicate that Organ-DETR outperforms comparable techniques, improving by +10.6 mAP COCO and +10.2 mAR COCO. Code and pre-trained models are available at \url{https://---}.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3769
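
The abstract describes DQM only as a one-to-many matching scheme that raises the number of positive queries without adding learnable parameters; it does not give the procedure. As a rough illustration of the general idea (not the authors' algorithm), the sketch below uses a simple greedy assignment in which each ground-truth box may claim up to `k` distinct queries. The function name `one_to_many_match`, the parameter `k`, and the toy cost matrix are all assumptions made for this example.

```python
from typing import Dict, List


def one_to_many_match(cost: List[List[float]], k: int) -> Dict[int, List[int]]:
    """Greedily assign up to k distinct queries to each ground-truth box.

    cost[q][g] is the matching cost between query q and ground truth g
    (lower is better). This is a generic illustration of one-to-many
    matching, NOT the DQM procedure from the paper.
    """
    num_queries = len(cost)
    num_gt = len(cost[0]) if cost else 0

    # Visit all (query, ground-truth) pairs from cheapest to most expensive.
    pairs = sorted(
        (cost[q][g], q, g) for q in range(num_queries) for g in range(num_gt)
    )

    assigned: Dict[int, List[int]] = {g: [] for g in range(num_gt)}
    used_queries = set()
    for _, q, g in pairs:
        # Skip if the query is already positive or the box has k queries.
        if q in used_queries or len(assigned[g]) >= k:
            continue
        assigned[g].append(q)
        used_queries.add(q)
    return assigned


# Hypothetical toy costs: 6 queries (rows) against 2 ground-truth organs (columns).
cost = [
    [0.10, 0.90],
    [0.20, 0.80],
    [0.70, 0.30],
    [0.60, 0.40],
    [0.50, 0.50],
    [0.90, 0.95],
]

one_to_one = one_to_many_match(cost, k=1)   # 2 positive queries in total
one_to_many = one_to_many_match(cost, k=2)  # 4 positive queries in total
print(one_to_one, one_to_many)
```

With `k=1` this reduces to a greedy one-to-one matching; raising `k` increases the number of positive queries per organ using only the existing cost matrix, so no new parameters are introduced. An optimal rather than greedy variant could repeat each ground-truth column `k` times and solve the resulting assignment problem, for example with SciPy's `linear_sum_assignment`.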