Mutual Learning for SAM Adaptation: A Dual Collaborative Network Framework for Source-Free Domain Transfer
TL;DR: We propose a mutual learning framework with dual collaborative networks for source-free SAM adaptation, using dynamic role-switching and feature-based reliability to enhance generalization across target domains.
Abstract: Segment Anything Model (SAM) has demonstrated remarkable zero-shot segmentation capabilities across various visual tasks. However, its performance degrades significantly when deployed in new target domains with substantial distribution shifts. While existing self-training methods based on fixed teacher-student architectures have shown improvements, they struggle to ensure that the teacher network consistently outperforms the student under severe domain shifts. To address this limitation, we propose a novel Collaborative Mutual Learning Framework for source-free SAM adaptation, leveraging dual networks in a dynamic and cooperative manner. Unlike fixed teacher-student paradigms, our method dynamically assigns the teacher and student roles by evaluating the reliability of each collaborative network at every training iteration. Our framework incorporates a dynamic mutual learning mechanism with three key components: a direct alignment loss for knowledge transfer, a reverse distillation loss to encourage diversity, and a triplet relationship loss to refine feature representations. These components enhance the adaptation capabilities of the collaborative networks, enabling them to generalize effectively to target domains while preserving their pre-trained knowledge. Extensive experiments on diverse target domains demonstrate that our proposed framework achieves state-of-the-art adaptation performance.
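The role-switching and loss components described in the abstract can be sketched as follows. This is a minimal, hypothetical illustration: the reliability comparison, the squared-error alignment term, the capped reverse-distillation term, and the triplet margin are all assumptions for exposition, not the paper's exact formulation.

```python
# Hypothetical sketch of dynamic mutual learning between two
# collaborative networks A and B. All specific loss forms and
# weights here are illustrative assumptions.

def assign_roles(reliability_a, reliability_b):
    """Per iteration, the more reliable network acts as teacher.
    Returns (role_of_A, role_of_B)."""
    if reliability_a >= reliability_b:
        return ("teacher", "student")
    return ("student", "teacher")

def mutual_learning_loss(student_out, teacher_out,
                         w_align=1.0, w_reverse=0.1):
    """Combine a direct alignment term (pull the student toward the
    teacher) with a reverse term that rewards disagreement, keeping
    the two networks diverse. Weights are assumptions."""
    align = sum((s - t) ** 2 for s, t in zip(student_out, teacher_out))
    align /= len(student_out)
    # Reverse distillation: negative (reward) for disagreement,
    # capped so the networks do not drift apart indefinitely.
    reverse = -min(align, 1.0)
    return w_align * align + w_reverse * reverse

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet margin loss on feature vectors, used here to
    stand in for the triplet relationship term."""
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(d_pos - d_neg + margin, 0.0)
```

In this reading, the framework evaluates `assign_roles` every iteration, so neither network is permanently the teacher, which is the key departure from fixed teacher-student self-training.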
Lay Summary: The Segment Anything Model (SAM) is an advanced tool that can automatically separate objects in images without needing extra training. While SAM works well in familiar settings, its performance drops when it is used in new environments that are very different from the ones it was designed for. Existing approaches to help SAM adapt, which rely on a fixed teacher-student learning setup, often fall short when the differences between environments are too large. To address this issue, we developed a new approach that allows two networks to work together and learn from each other in a flexible and dynamic way. Instead of having fixed roles, the networks switch between being "teacher" and "student" based on how well they perform during training. We also designed specific techniques to help the networks share knowledge, improve their understanding, and become better at handling new environments. Our method helps SAM adapt more effectively to new situations while keeping its original strengths. Tests on different types of data show that our approach achieves remarkable results, making it a promising tool for real-world applications.
Primary Area: Applications->Computer Vision
Keywords: computer vision, transfer learning, domain adaptation
Submission Number: 3966