Abstract: Information prioritization plays an important role in how humans perceive and understand the world. Homepage layouts serve as a tangible proxy for this prioritization. In this work, we present NewsHomepages, a novel and massive dataset of over 3,000 news website homepages, including local, national, and topic-specific outlets, captured twice daily over a three-year period. We then develop models that perform pairwise comparisons between news items to infer the editorial preferences expressed in homepage layouts, achieving an F1 score above 0.7 against human judgment. To demonstrate the importance of these learned preferences, we (1) perform a novel analysis showing that outlets across the political spectrum share surprising preference agreements and (2) apply our models to rank-order a collection of local city council policies passed over a ten-year period in San Francisco, assessing their ``newsworthiness''. Our findings lay the groundwork for leveraging implicit cues to deepen our understanding of information prioritization.
Paper Type: Long
Research Area: Computational Social Science and Cultural Analytics
Research Area Keywords: Resources and Evaluation, Computational Journalism, Computational Social Science
Contribution Types: Model analysis & interpretability, Publicly available software and/or pre-trained models, Data resources, Data analysis
Languages Studied: English
Submission Number: 6005