Abstract: Data fusion combines results from diverse retrieval sources, but what determines its effect on performance is not well understood. This study uses machine learning to predict and explain fusion performance. Using a random forest model built on TREC benchmark datasets, we predicted the performance of two fusion algorithms, achieving R² scores above 0.9 from meaningful statistical features. Compared with linear regression, the tree-based ensemble offers more insight into what drives fusion performance. We also quantify the importance of newly identified predictors, such as P@1000. These findings can help researchers refine fusion techniques and improve search effectiveness.