Abstract: Background: Chronic pain poses a significant global health burden, and much of the contextual information relevant to it is captured as free text in sources such as electronic health records (EHRs), published literature, and social media. Natural language processing (NLP), including recent advances in large language models (LLMs), presents a transformative opportunity to analyze this unstructured data. However, the literature is fragmented across disciplines, and there is a need to consolidate existing knowledge, identify gaps, and inform future research directions in this emerging field. Objective: This review aims to investigate and characterize NLP-based methods designed for chronic pain research. Methods: A search strategy was formulated and executed across PubMed, Web of Science, IEEE Xplore, Scopus, and ACL Anthology to find studies published in English between 2014 and 2025. Included studies were characterized in terms of study design, research question, dataset leveraged, NLP method(s) employed, number of participants included, participant age groups, and key results. Potential reporting bias and missing results were also analyzed. Results: After screening 155 papers, 34 studies were included in the final review. We observed a noticeable trend toward transformer-based models and LLMs in recent years. More than one-third of studies (13/34) relied on EHR data. Earlier studies commonly used methods such as logistic regression, support vector machines, Latent Dirichlet Allocation (LDA), and word embeddings. In contrast, recent work has shifted toward transformer-based models (e.g., BERT, RoBERTa, BioBERT) and general-purpose LLMs such as GPT-3.5 in zero- or few-shot learning frameworks. These newer models generally outperformed traditional approaches in supervised learning tasks (e.g., classification F1 > 0.8).
Despite this progress, challenges persist, including small and nonrepresentative datasets, limited attention to underrepresented populations, and a lack of cross-disciplinary standardization. Conclusions: While the application of NLP to chronic pain research is promising, the review revealed a paucity of research on the topic, leaving considerable room for future work. Future work may focus on the development and validation of generalizable approaches involving diverse data and cohorts, multimodal data validation systems, public release of data and models, and the development of standardized evaluation metrics to enhance reproducibility and equity in chronic pain research.
External IDs: doi:10.2196/preprints.85105