Specializing SAM: Online Adaptation of the Segment Anything Model for Interactive Segmentation in Uncommon Situations

15 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference · Withdrawn Submission
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: test time adaptation, interactive segmentation, sam, segment anything, segment anything model, online, fine-tuning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: A system to test-time adapt SAM during usage on uncommon data for interactive segmentation.
Abstract: Interactive segmentation is the task of segmenting an object with the help of user guidance. It is mostly used to create ground truth segmentation masks for object instances more efficiently. Recently, the Segment Anything Model (SAM) was published to provide a foundation model for segmentation based on user-generated prompts. Despite SAM being trained on the largest instance segmentation dataset to date (SA-1B), we show that the model fails at interactive segmentation when confronted with situations that deviate from its initial training data. Such situations do, however, occur when the model is used in practice. To alleviate these problems, we use the information that becomes available during the interaction to adapt the model to the dataset while it is in use. To preserve the real-time experience desirable to the user, we design our method to minimize computational overhead. In our experiments we demonstrate the efficacy of the proposed adaptation method on twelve datasets that are uncommon relative to SAM's initial training data, four of which are medical segmentation datasets. Our method reduces the $FR_{20}@85$ metric by up to 16.93 percentage points and the $FR_{30}@90$ metric by up to 18.43 percentage points, and improves $NoC_{30}@90$ by up to 3.311 clicks on ten of the twelve datasets.
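The core idea in the abstract — treating the user's corrective clicks as sparse ground truth and taking lightweight optimization steps during the interaction — can be sketched as follows. This is a minimal illustration only, not the paper's implementation: the `OnlineAdapter` class, the per-pixel logistic classifier standing in for SAM's mask decoder, and all hyperparameters are assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class OnlineAdapter:
    """Toy stand-in for online test-time adaptation: a per-pixel logistic
    classifier (in place of SAM's mask decoder) that takes one SGD step
    per batch of user clicks. Illustrative only."""

    def __init__(self, n_features, lr=0.5):
        self.w = np.zeros(n_features)  # adaptable parameters
        self.lr = lr                   # kept small/cheap: one step per click

    def predict(self, feats):
        # feats: (n_pixels, n_features) -> per-pixel foreground probability
        return sigmoid(feats @ self.w)

    def click_loss(self, feats, clicks):
        # Binary cross-entropy evaluated only at the clicked pixels,
        # since clicks are the only labels available at test time.
        idx = np.array([i for i, _ in clicks])
        y = np.array([label for _, label in clicks], dtype=float)
        p = np.clip(self.predict(feats[idx]), 1e-7, 1 - 1e-7)
        return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

    def adapt(self, feats, clicks):
        # clicks: list of (pixel_index, label) pairs; label 1 = foreground
        # click, label 0 = background (negative) click.
        idx = np.array([i for i, _ in clicks])
        y = np.array([label for _, label in clicks], dtype=float)
        p = self.predict(feats[idx])
        # Single gradient step on the click loss -> low overhead per round.
        grad = feats[idx].T @ (p - y) / len(clicks)
        self.w -= self.lr * grad

# Simulated interaction round on a "new domain": random pixel features
# with labels determined by the first feature dimension.
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 4))
clicks = [(i, 1 if feats[i, 0] > 0 else 0) for i in range(0, 100, 10)]

adapter = OnlineAdapter(n_features=4)
loss_before = adapter.click_loss(feats, clicks)
for _ in range(20):  # 20 simulated interaction rounds
    adapter.adapt(feats, clicks)
loss_after = adapter.click_loss(feats, clicks)
```

In the actual system, `feats` would be SAM's image-encoder output, and only a small, cheap-to-update subset of parameters would be adapted to keep the interaction real-time; the loop above merely shows the loss-at-clicked-pixels principle.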
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 285