Abstract: We examine the use of large Transformer-based pretrained language models (PLMs) for the problem of early prediction of lung cancer using free-text patient medical notes of Dutch primary care physicians. Specifically, we investigate: 1) how soft prompt-tuning compares to standard model fine-tuning; 2) whether simpler static word embedding models (WEMs) can be more robust compared to PLMs in highly imbalanced settings; and 3) how models fare when trained on notes from a small number of patients. All our code is available open source in https://bitbucket.org/aumc-kik/prompt_tuning_cancer_prediction/.
Loading