Abstract: Highlights
• Propose the neighboring region attention mechanism for token interaction.
• Integrate token interactions into the detection head to enhance contrastive alignment.
• The model improves open-vocabulary detection performance across benchmarks.
External IDs: dblp:journals/eaai/QiangLLLHP25