Automatic text classification of prostate cancer malignancy scores in radiology reports using NLP models

Published: 01 Jan 2024, Last Modified: 14 Feb 2025, Medical Biol. Eng. Comput. 2024, CC BY-SA 4.0
Abstract: This paper presents two automated text classification systems for prostate cancer findings based on the PI-RADS criteria: a traditional machine learning model using XGBoost and a language-model-based approach using RoBERTa. The study focuses on Spanish-language radiology reports of prostate MRI, a setting that has not been explored before. The RoBERTa model outperforms the XGBoost model, although both achieve promising results. The best-performing system was integrated into the radiological company’s information systems as an API, operating in a real-world environment.