Network-Agnostic Knowledge Transfer for Medical Image Segmentation

Shuhang Wang; Eugene Cheah; Elham Yousef Kalafi; Mercy Asiedu; Alex Benjamin; Vivek Kumar Singh; Ge Zhang; Viksit Kumar; Anthony Edward Samir

28 Sept 2020 (modified: 05 May 2023), ICLR 2021 Conference Blind Submission, Readers: Everyone

Keywords: Knowledge Transfer, Deep Learning, Medical Image Segmentation, Pseudo Annotation

Abstract: Conventional transfer learning leverages the weights of pre-trained networks but requires similar neural architectures. Knowledge distillation, by contrast, can transfer knowledge between heterogeneous networks but often requires access to the original training data or additional generative networks. Knowledge transfer between networks can be improved by making it agnostic to the choice of network architecture and by reducing its dependence on the original training data. We propose a knowledge transfer approach from a teacher to a student network in which the student is trained on an independent transferal dataset whose annotations are generated by the teacher. Experiments were conducted on five state-of-the-art networks for semantic segmentation and seven datasets across three imaging modalities. We studied knowledge transfer from a single teacher, the combination of knowledge transfer and fine-tuning, and knowledge transfer from multiple teachers. The student model with a single teacher achieved performance similar to that of the teacher, and the student model with multiple teachers achieved better performance than the teachers. The salient features of our algorithm include: 1) no need for the original training data or generative networks, 2) knowledge transfer between different architectures, 3) ease of implementation for downstream tasks by using the downstream task dataset as the transferal dataset, and 4) knowledge transfer from an ensemble of independently trained models into one student model. Extensive experiments demonstrate that the proposed algorithm is effective for knowledge transfer and easily tunable.
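The transfer loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes binary segmentation, hard pseudo masks obtained by thresholding the averaged teacher probabilities, and a deliberately tiny student (a per-pixel logistic model). The abstract does not specify how annotations from multiple teachers are combined, so the averaging here is an assumption, and all names (`pseudo_annotate`, `train_student`, `dice`, the toy teachers) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pseudo_annotate(teachers, images, threshold=0.5):
    """Average the teachers' foreground probabilities and threshold them
    into binary pseudo masks (assumed combination rule)."""
    probs = np.mean([teacher(images) for teacher in teachers], axis=0)
    return (probs >= threshold).astype(np.float32)

def train_student(images, pseudo_masks, epochs=500, lr=1.0):
    """Fit a toy student p = sigmoid(w * x + b) per pixel by gradient
    descent on binary cross-entropy against the pseudo masks."""
    x, y = images.ravel(), pseudo_masks.ravel()
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad = sigmoid(w * x + b) - y      # dBCE/dlogit for each pixel
        w -= lr * np.mean(grad * x)
        b -= lr * np.mean(grad)
    return lambda imgs: sigmoid(w * imgs + b)

def dice(pred, target, eps=1e-7):
    """Dice overlap between two binary masks."""
    inter = np.sum(pred * target)
    return (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

# Toy teachers: soft intensity thresholds standing in for trained networks.
teacher_a = lambda imgs: sigmoid(20.0 * (imgs - 0.55))
teacher_b = lambda imgs: sigmoid(20.0 * (imgs - 0.45))

# The transferal dataset is unlabeled and independent of the teachers' data.
rng = np.random.default_rng(0)
transferal_images = rng.random((8, 16, 16))

pseudo_masks = pseudo_annotate([teacher_a, teacher_b], transferal_images)
student = train_student(transferal_images, pseudo_masks)
student_masks = (student(transferal_images) >= 0.5).astype(np.float32)
agreement = dice(student_masks, pseudo_masks)
print(f"student/teacher Dice on transferal data: {agreement:.3f}")
```

The student never sees the teachers' weights or their original training data. Only the teacher-generated annotations on the transferal images connect the two, which is why the teacher and student architectures can differ.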

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

One-sentence Summary: We propose to transfer the knowledge of a neural network (teacher) to another independent one (student) by training the student on a transferal dataset whose annotations are generated by the teacher.

Reviewed Version (pdf): https://openreview.net/references/pdf?id=eLsZCogbmy
