Abstract: As dynamic, complex, and non-deterministic webpages proliferate, running controlled web experiments on live webpages is becoming increasingly difficult. To compare algorithms that take webpages as inputs, an experimenter must worry about ever-changing webpages, and also about scalability. Because webpage contents are constantly changing, experimenters must intervene to hold webpages constant, in order to guarantee a fair comparison between algorithms. Because webpages are increasingly customized and diverse, experimenters must test web algorithms over thousands of webpages, and thus need to implement their experiments efficiently. Unfortunately, no existing testing frameworks have been designed for this type of experiment. We introduce Dicer, a framework for running large-scale controlled experiments on live webpages. Dicer's programming model allows experimenters to easily 1) control when to enforce a same-page guarantee and 2) parallelize test execution. The same-page guarantee ensures that all loads of a given URL produce the same response. The framework utilizes a specialized caching proxy server to enforce this guarantee. We evaluate tool on a dataset of 1,000 real webpages, and find it upholds the same-page guarantee with little overhead.
0 Replies
Loading