Social sampling

Anirban Dasgupta, Ravi Kumar, D. Sivakumar

2012 (modified: 16 Jul 2019), KDD 2012

Abstract: We investigate a class of methods that we call "social sampling," where participants in a poll respond with a summary of their friends' putative responses to the poll. Social sampling leads to a novel trade-off question: the savings in the number of samples (roughly the average degree of the network of participants) vs. the systematic bias in the poll due to the network structure. We provide precise analyses of estimators that result from this idea. With non-uniform sampling of nodes and non-uniform weighting of neighbors' responses, we devise an ideal unbiased estimator. We show that the variance of this estimator is controlled by the second eigenvalue of the normalized Laplacian of the network (the network structure penalty) and the correlation between node degrees and the property being measured (the effective savings factor). In addition, we present a sequence of approximate estimators that are simpler or more realistic or both, and analyze their performance. Experiments on large real-world networks show that social sampling is a powerful paradigm for obtaining accurate estimates with very few samples. At the same time, our results urge caution in interpreting recent results about "expectation vs. intent polling."
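To make the idea of "non-uniform sampling of nodes and non-uniform weighting of neighbors' responses" concrete, the following is a minimal sketch of one estimator of that kind. It is an illustration built on stated assumptions, not necessarily the estimator analyzed in the paper: nodes are assumed to be sampled with probability proportional to degree, each neighbor's response is assumed to be weighted by the inverse of that neighbor's degree, and the graph is assumed to be undirected with no isolated nodes. The function names (`build_adjacency`, `node_report`, `exact_expectation`, `social_poll`) and the toy graph are hypothetical.

```python
import random


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def node_report(adj, f, v):
    """Summary that sampled node v reports about its neighbors.

    Assumption: neighbor u's response f[u] is weighted by 1/deg(u), and
    the result is rescaled by (2m / n) / deg(v), where 2m is the sum of
    all degrees and n is the number of nodes.
    """
    n = len(adj)
    two_m = sum(len(nbrs) for nbrs in adj.values())
    weighted = sum(f[u] / len(adj[u]) for u in adj[v])
    return (two_m / n) * weighted / len(adj[v])


def exact_expectation(adj, f):
    """Expected report when v is drawn with probability deg(v) / 2m."""
    two_m = sum(len(nbrs) for nbrs in adj.values())
    return sum((len(adj[v]) / two_m) * node_report(adj, f, v) for v in adj)


def social_poll(adj, f, k, rng):
    """Average the reports of k nodes sampled proportionally to degree."""
    nodes = list(adj)
    degrees = [len(adj[v]) for v in nodes]
    sampled = rng.choices(nodes, weights=degrees, k=k)
    return sum(node_report(adj, f, v) for v in sampled) / k


# Hypothetical toy network and binary poll responses.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
f = {0: 1, 1: 0, 2: 1, 3: 0, 4: 1}
adj = build_adjacency(edges)
true_mean = sum(f.values()) / len(f)
```

Under these assumptions the degree factors cancel: each node u contributes f[u]/deg(u) once per incident edge, so the expectation of a single report equals the population mean of f. `exact_expectation(adj, f)` checks this by enumerating all nodes, while `social_poll(adj, f, k, random.Random(0))` simulates a poll of k respondents. The sketch says nothing about variance, which the abstract ties to the network's spectral structure and to the correlation between degree and the measured property.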
