Domain-Focused Versus General Model Efficacy in NLP Tasks on Climate Change

ACL 2024 Workshop ClimateNLP, Submission 16 Authors

23 May 2024 (modified: 18 Jun 2024) · Submitted to ClimateNLP 2024 · License: CC BY 4.0
Keywords: BERT, LLM, NLP, stance detection
TL;DR: This study finds that while ClimateBERT, a climate-specific model, has some benefits, general-purpose models like RoBERTa often perform better for stance detection in climate change discourse.
Abstract: Global warming is a critical concern that requires both scientific understanding and public support for effective policy action. Stance detection using deep learning technologies, particularly large language models (LLMs) like GPT and BERT, can help analyze public and policy opinions on climate change. This study assesses the effectiveness of domain-specific pretraining versus general pretraining for stance detection tasks related to climate change, using a pretrained model named ClimateBERT. The aim is to determine if incorporating climate-specific knowledge into LLMs improves stance detection accuracy in climate-related discourse. The study compares the performance of ClimateBERT with general models like RoBERTa across various climate-related datasets. Results indicate that while domain-specific models offer some advantages, general-purpose models like RoBERTa often achieve higher accuracy and F1 scores, especially in fine-tuning settings. This suggests that robust general-purpose models are often sufficient for specialized tasks, highlighting the need to balance model architecture and domain adaptation for optimal performance in natural language processing applications.
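To make the comparison described in the abstract concrete, the sketch below shows one plausible way to fine-tune both a climate-specific encoder and a general-purpose one on a 3-way stance detection task using Hugging Face Transformers. This is a minimal illustration, not the authors' code: the checkpoint names (climatebert/distilroberta-base-climate-f, roberta-base) are real Hugging Face model IDs, but the toy examples, label scheme, and training settings are assumptions for demonstration only.

```python
# Minimal sketch (not the paper's implementation): fine-tuning a
# climate-specific encoder vs. a general-purpose one for stance detection.
# The three toy examples and the favor/against/neutral labels are
# illustrative, not drawn from the paper's datasets.
import torch
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

texts = ["Climate change is an urgent threat.",
         "Global warming is a hoax.",
         "The report summarizes emission trends."]
labels = [0, 1, 2]  # assumed label scheme: 0 = favor, 1 = against, 2 = neutral


class StanceDataset(torch.utils.data.Dataset):
    """Wraps tokenizer output and labels for the Trainer API."""

    def __init__(self, encodings, labels):
        self.encodings, self.labels = encodings, labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: v[idx] for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item


# Fine-tune each backbone under identical settings so any score gap
# reflects the pretraining corpus, not the training recipe.
for name in ["climatebert/distilroberta-base-climate-f", "roberta-base"]:
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=3)
    enc = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")
    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir=f"out/{name.split('/')[-1]}",
            num_train_epochs=1,
            per_device_train_batch_size=2,
        ),
        train_dataset=StanceDataset(enc, labels),
    )
    trainer.train()
```

In practice, both models would be trained and evaluated on the same held-out split of each climate-related dataset, with accuracy and F1 computed per model to reproduce the kind of comparison the abstract reports.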
Archival Submission: archival
Submission Number: 16