Keywords: Natural hazard impacts, Retrieval-Augmented Generation, Information extraction
Abstract: Understanding how natural hazards such as floods, droughts, and storms turn into disasters requires robust impact data. This paper develops a comprehensive global dataset of natural hazard impacts from peer-reviewed literature, synthesizing existing in-depth knowledge. Leveraging advances in natural language processing (NLP) and large language models (LLMs), we mapped over 12,000 open-access articles published since 1980 on climatological, hydrological, and meteorological disasters. Evaluation results show that precision ranges from 0.85 to 1 for the extraction of quantitative impact information. Our novel method using Retrieval-Augmented Generation captures detailed impact data on a wide range of sectors and systems, significantly improving the granularity and geographical coverage compared to existing global datasets. As such, this work fills critical gaps in natural hazard research, providing information on both direct and indirect disaster consequences.
Archival Submission: arxival
Submission Number: 19