Abstract: In domains with privacy constraints, most knowledge resides in siloed datasets, hindering the development of a model with all relevant knowledge for a task.
Clinical NLP is a prime example of these constraints in practice.
Research in this area typically falls back to the canonical setting of sequential transfer learning, where a model pre-trained on large corpora is finetuned on a smaller annotated dataset.
Multi-step sequential transfer learning offers an avenue for knowledge transfer among diverse clinics, since models are more likely to be shared than private clinical data.
This setting poses challenges of cross-linguality, domain diversity, and varying label distributions, which undermine generalisation.
We propose SPONGE, an efficient prototypical architecture that leverages competing sparse language representations.
These encompass distributed knowledge and create the necessary level of redundancy for effective transfer learning across multiple datasets.
We identify that prototypical classifiers are critically sensitive to label-recency bias, which we mitigate with a novel strategy at inference time. In combination with this strategy, SPONGE significantly boosts generalisation to unseen data.
With the help of medical professionals, we show that the explainability of our models is clinically relevant.
We make all source code available.
Submission Length: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=GDlo4WB0lj
Changes Since Last Submission: We thank the action editor and the reviewers for their valuable input. We agree with the feedback regarding narrowing the scope of the manuscript given our evaluation and experiments. We have therefore settled on an updated title that reflects the cross-lingual aspect of the tasks and the clinical domain: "SPONGE: Competing Sparse Language Representations for Effective Cross-Lingual Knowledge Transfer in Healthcare".
Additionally, in Section 7 (Results), p. 9, we have polished the reference to Appendix G, where we clarify more thoroughly the experiments we conducted with sequential training in the same language, as requested by reviewer XPbz. For clarity, we renamed and expanded Appendix G, discussing the results of these experiments. We highlight how our method also outperforms the other architectures in a more standard same-language transfer learning setting, albeit on tasks within the same (clinical) domain with different label spaces.
In addition, we fixed the reference to the following paper: "Comply: Learning sentences with complex weights inspired by fruit fly olfaction".
We also added acknowledgements.
Assigned Action Editor: ~changjian_shui1
Submission Number: 4971