Abstract: The rapid proliferation of false information on the internet poses a significant challenge before, during, and after disasters, emphasizing the critical need for domain-specific automatic fact-checking systems. In this study, we introduce DisFact, a new fact-checking pipeline, and a dataset of disaster-related claims generated from Federal Emergency Management Agency (FEMA) press releases and disaster declarations. Our retrieval method requires no model training, making it more efficient and less resource-intensive. It begins by splitting a lengthy document into sentences; we then apply embeddings to compute a relevance score for each claim–document pair and a similarity score between the claim and individual sentences to rank the retrieved evidence. For claim verification, we use a deep learning approach that combines transformer-based embeddings with a feedforward neural network. The experimental findings demonstrate that our fact-checking models achieve top performance on our custom disaster dataset. Furthermore, our models outperform other state-of-the-art models on the FEVER and SciFact shared tasks, underscoring the effectiveness of our approach and its adaptability in handling longer documents and generalizing across diverse fact-checking datasets. DisFact signifies a pivotal advancement in automated fact-checking, emphasizing simplicity, accuracy, and computational efficiency. The DisFact dataset and code are available at https://github.com/abdul0366/DisFact.
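To make the two-stage, training-free retrieval step concrete, the sketch below shows one plausible implementation. The abstract does not name the embedding model, sentence splitter, or any function names, so the encoder (`all-MiniLM-L6-v2`), the `rank_evidence` helper, and the naive sentence split are all illustrative assumptions rather than the authors' actual code.

```python
# A minimal sketch of the training-free retrieval described in the abstract:
# score documents against a claim, then rank sentences of the best document
# as candidate evidence. The specific embedding model used by DisFact is not
# stated; 'all-MiniLM-L6-v2' is an illustrative choice.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

def rank_evidence(claim: str, documents: list[str], top_k: int = 5):
    """Hypothetical helper: rank evidence sentences for a claim."""
    # 1. Claim-document relevance: embed the claim and each full document,
    #    then take cosine similarity as the relevance score.
    claim_emb = model.encode(claim, convert_to_tensor=True)
    doc_emb = model.encode(documents, convert_to_tensor=True)
    doc_scores = util.cos_sim(claim_emb, doc_emb)[0]

    # 2. Split the most relevant document into sentences (a naive split;
    #    a real sentence tokenizer such as NLTK's would be used in practice).
    best_doc = documents[int(doc_scores.argmax())]
    sentences = [s.strip() for s in best_doc.split(".") if s.strip()]

    # 3. Claim-sentence similarity: rank the sentences as retrieved evidence.
    sent_emb = model.encode(sentences, convert_to_tensor=True)
    sent_scores = util.cos_sim(claim_emb, sent_emb)[0]
    ranked = sorted(zip(sentences, sent_scores.tolist()),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]
```

Because no encoder is trained or fine-tuned for retrieval, the only cost is embedding lookups and cosine similarities, which is the efficiency property the abstract highlights.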