Abstract: The demand for automatic extraction of true information (i.e., truths) from conflicting multi-source data has soared recently. A variety of <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">truth discovery</i> methods have witnessed great successes via jointly estimating source reliability and truths. All existing truth discovery methods focus on providing a point estimator for each object's truth, but in many real-world applications, confidence interval estimation of truths is more desirable, since confidence interval contains richer information. To address this challenge, in this paper, we propose a novel truth discovery method ( <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">ETCIBoot</i> ) to construct confidence interval estimates as well as identify truths, where the bootstrapping techniques are nicely integrated into the truth discovery procedure. Due to the properties of bootstrapping, the estimators obtained by <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">ETCIBoot</i> are more accurate and robust compared with the state-of-the-art truth discovery approaches. The proposed framework is further adapted to deal with large-scale truth discovery task in distributed paradigm. Theoretically, we prove the asymptotical consistency of the confidence interval obtained by <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">ETCIBoot</i> . Experimentally, we demonstrate that <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">ETCIBoot</i> is not only effective in constructing confidence intervals but also able to obtain better truth estimates.
0 Replies
Loading