Combating health misinformation with fusion-based credible retrieval techniques

Published: 01 Jan 2025, Last Modified: 11 Nov 2025Health Informatics J. 2025EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: This study aims to combat health misinformation by enhancing the retrieval of credible health information using effective fusion-based techniques. It focuses on clustering-based subset selection to improve data fusion performance. Five clustering methods — two K-means variants, Agglomerative Hierarchical (AH) clustering, BIRCH, and Chameleon — are evaluated for selecting optimal subsets of information retrieval systems. Experiments are conducted on two health-related datasets from the TREC challenge. The selected subsets are used in data fusion to boost retrieval quality and credibility. AH and BIRCH outperform other methods in identifying effective IR subsets. Using AH-based fusion of up to 20 systems results in a 60% gain in MAP and over a 30% increase in NDCG_UCC, a credibility-focused metric, compared to the best single system. Clustering-based fusion strategies significantly enhance the retrieval of trustworthy health content, helping to reduce misinformation. These findings support incorporating advanced data fusion into health information retrieval systems to improve access to reliable information. The source code of this research is publicly available at https://github.com/Gary752752/DataFusion.
Loading