Automated ICF Coding of Rehabilitation Notes for Low-Resource Languages via Continual Training of Language Models

Kevin Roitero, Andrea Martinuzzi, Maria Teresa Armellin, Gabriella Paparella, Alberto Maniero, Vincenzo Della Mea

Published: 2023, Last Modified: 12 Jan 2026. Venue: MIE 2023. License: CC BY-SA 4.0

Abstract: The coding of medical documents, and in particular of rehabilitation notes, using the International Classification of Functioning, Disability and Health (ICF) is a difficult task that shows low agreement among experts. This difficulty is mainly caused by the specific terminology that the task requires. In this paper, we address the task by developing a model based on a large language model, BERT. By continually training this model on ICF textual descriptions, we are able to effectively encode rehabilitation notes written in Italian, an under-resourced language.
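
The abstract does not say which objective is used for the continual training step. As a rough illustration only, the sketch below assumes a masked-language-modelling setup of the kind commonly used with BERT, and shows how one ICF textual description could be turned into a training example. The function `mask_tokens`, its parameters, whitespace tokenization, and the sample description are all hypothetical and are not taken from the paper. The masking rule is also simplified: selected tokens are always replaced by `[MASK]`, whereas BERT's original recipe sometimes keeps or randomly substitutes them.

```python
import random

MASK = "[MASK]"  # placeholder token, as in BERT-style vocabularies


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Build one simplified masked-language-modelling example.

    Returns (masked_tokens, labels). labels[i] holds the original token
    where a mask was applied and None everywhere else, so a model would
    only be scored on the masked positions.
    """
    rng = random.Random(seed)  # seeded for reproducible masking
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


# Illustrative, paraphrased ICF-style description (an assumption, not a quote).
description = "sensation of pain : unpleasant feeling indicating damage to a body structure"
tokens = description.split()  # whitespace split stands in for a real subword tokenizer
masked, labels = mask_tokens(tokens, mask_prob=0.3, seed=1)

print(" ".join(masked))
```

In a real pipeline, examples like these would be built from the full set of ICF descriptions with the model's own tokenizer, and the resulting model would then be fine-tuned on labelled rehabilitation notes; those steps are outside this sketch.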