Feature structure distillation with Centered Kernel Alignment in BERT transferring

Published: 01 Jan 2023, Last Modified: 25 Jan 2025, Venue: Expert Syst. Appl. 2023, License: CC BY-SA 4.0
Abstract: Highlights
• We adapt Centered Kernel Alignment (CKA) to knowledge distillation (KD) for more informative transfer of structures in BERT.
• We categorize feature structure into intra-feature, local inter-feature, and global inter-feature structures.
• We propose a memory-augmentation method for distilling global structures.
• We analyze the method empirically, both quantitatively and qualitatively.
• We validate its practical usefulness over a wide range of language understanding tasks.
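The highlights name CKA as the similarity measure used for distillation but do not spell out a formula. As a rough illustration only, the sketch below computes the commonly used linear form of CKA between two feature matrices and turns it into a loss of the form 1 - CKA. The function names `linear_cka` and `cka_distillation_loss`, the matrix shapes, and the choice of the linear kernel are assumptions made for this example, not the paper's stated method.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between feature matrices x (n, p) and y (n, q).

    Rows are examples and columns are feature dimensions. The two matrices
    may have different widths, which is what makes the measure convenient
    for comparing a small student with a larger teacher.
    """
    # Center every feature dimension over the n examples.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)

    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)


def cka_distillation_loss(student, teacher):
    """Hypothetical loss: 0 when the two structures align perfectly."""
    return 1.0 - linear_cka(student, teacher)


# Illustrative shapes only: 32 examples, a 768-dim teacher, a 312-dim student.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(32, 768))
student = rng.normal(size=(32, 312))
loss = cka_distillation_loss(student, teacher)
```

Linear CKA is invariant to orthogonal transformations and isotropic scaling of either input, so a loss of this shape compares the structure of the two representations rather than their raw coordinates.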