Segmentation From Attention: Training-Free Layer Selection and One-Shot Tuning for Segmentation in VLMs
Abstract: Large-scale vision-language models (VLMs), trained on extensive datasets of image-text pairs, exhibit strong multimodal understanding capabilities by implicitly learning associations between textual descriptions and image regions. This emergent ability enables zero-shot object detection and segmentation via techniques that rely on text-image attention maps, without training on large labeled segmentation datasets. However, the performance of such methods depends heavily on prompt engineering and on manually chosen attention layers or heads. In this work, we propose a training-free entropy-based measure, InfoScore, to identify the image-text attention layers best suited for segmentation, providing a more flexible and scalable solution for training-free open-vocabulary segmentation and reducing the burden of hyperparameter search. We empirically show that our training-free selection strategy is superior to naive selection strategies. Additionally, we demonstrate that instead of relying solely on text prompts, fine-tuning the image-text attention layer with a single visual example of each class significantly improves segmentation without the need for additional parameters or decoders. Moreover, we show that our methods and findings are general and can be applied across various VLMs. Our code will be released upon acceptance.
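To make the layer-selection idea in the abstract concrete, below is a minimal sketch of scoring attention layers by spatial entropy. Everything here is an illustrative assumption: the attention-tensor shape, the low-entropy-is-better heuristic, and the names `info_score` and `select_layers` are not the paper's exact formulation (Algorithm 1 in the revised manuscript gives the precise InfoScore computation).

```python
import torch

def info_score(attn, eps=1e-8):
    """Hypothetical entropy-based layer score (not the paper's exact InfoScore).

    `attn` holds text-to-image attention weights for one layer, assumed to
    have shape [heads, text_tokens, image_patches]. The intuition sketched
    here: layers whose text tokens attend to a concentrated set of image
    patches (low spatial entropy) localize objects better.
    """
    # Average over heads, then renormalize each text token's map over patches.
    p = attn.mean(dim=0)
    p = p / (p.sum(dim=-1, keepdim=True) + eps)
    # Spatial entropy per text token, averaged over tokens.
    entropy = -(p * (p + eps).log()).sum(dim=-1).mean()
    # Lower entropy -> more concentrated attention -> higher score.
    return -entropy.item()

def select_layers(per_layer_attn, k=2):
    """Rank layers by the score above and keep the top-k for segmentation."""
    scores = [info_score(a) for a in per_layer_attn]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
```

Under these assumptions, `select_layers` replaces a manual per-model search over layers and heads with a single forward pass that ranks layers automatically.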
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission:
We have revised the manuscript to address the reviewers’ comments and suggestions. All changes are highlighted in **blue** in the revised manuscript. The major revisions are summarized below, organized by reviewer.
---
### Reviewer guZd
- **Limitations discussion**: Added a new *Limitations* section (Section 5) that explicitly discusses why CLIP-based models are out of scope for the proposed approach and clarifies the architectural assumptions underlying our method.
---
### Reviewer ZrbZ
- **Layer-wise InfoScore analysis on ALBEF**: Added a layer-wise InfoScore analysis for ALBEF in Figure 8 (Appendix), with accompanying discussion in Section A.2.
- **Additional qualitative results on ADE-20K**: Included additional qualitative results on ADE-20K in Figure 11 (Appendix), with detailed discussion in Section A.4.2.
- **Comparison with 1-shot segmentation methods**: Added a comparison with state-of-the-art 1-shot segmentation approaches in Table 2 and discussed the results in Section 4.2.2.
- **Token aggregation clarification**: Clarified the rationale for using mean aggregation over tokens for each class in Section 3.1.
---
### Reviewer tJRW
- **Terminology correction**: Corrected the terminology throughout the paper by referring to InfoScore as a *measure* rather than a *metric*.
- **Expanded Figure 1 and additional baselines**: Updated Figure 1 to include 1-shot fine-tuning performance when (i) fine-tuning all layers and (ii) randomly selecting the top-2 layers instead of using InfoScore. Corresponding random layer selection results in the 1-shot setting were also added to Table 10.
- **Methodology clarification and formalization**: Revised Section 3.1 to use more formal notation and clearer structure. Added two pseudocode descriptions:
  - Algorithm 1: InfoScore computation
  - Algorithm 2: Training-free inference
- **Design desiderata for InfoScore**: Explicitly outlined the design desiderata of the InfoScore measure in Section 3.2 to better motivate its formulation.
- **1-shot comparison with SOTA methods**: Added a comparison with state-of-the-art 1-shot segmentation approaches in Table 2 and discussed the results in Section 4.2.2.
- **Minor fixes**: Fixed citation issues and incorporated minor editorial changes suggested by the reviewer.
Assigned Action Editor: ~Mathieu_Salzmann1
Submission Number: 6028