Persistent Homology of Topic Networks for the Prediction of Reader Curiosity

ACL ARR 2025 February Submission8291 Authors

16 Feb 2025 (modified: 09 May 2025), ACL ARR 2025 February Submission, CC BY 4.0
Abstract:

Reader curiosity, the drive to seek information, is crucial for textual engagement, yet it remains relatively underexplored in NLP. Building on Loewenstein’s Information Gap Theory, we introduce a framework that models reader curiosity by quantifying information gaps within a text’s semantic structure. Our approach combines BERTopic-inspired topic modeling with persistent homology to analyze the evolving topology (connected components, cycles, voids) of a dynamic semantic network derived from text segments, treating these topological features as proxies for information gaps. To evaluate this pipeline empirically, we collect reader curiosity ratings from participants (n = 49) as they read S. Collins’s novel “The Hunger Games”. We then use the topological features from our pipeline as independent variables to predict these ratings, and show experimentally that they significantly improve curiosity prediction over a baseline model (73% vs. 30% explained deviance), validating our approach. This pipeline offers a new computational method for analyzing text structure and its relation to reader engagement.
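
As a rough illustration of the simplest topological feature named in the abstract (connected components), the following sketch computes 0-dimensional persistence for a Vietoris-Rips filtration over a handful of points. It is not the authors' pipeline: the function name `h0_persistence`, the toy 2-D "topic embeddings", and the choice of Euclidean distance are all assumptions made for this example, and cycles and voids (higher-dimensional features) are not covered.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Return (birth, death) pairs for connected components (H0) of a
    Vietoris-Rips filtration built on Euclidean distances.

    Hypothetical helper for illustration only; not the paper's implementation.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest, one tree per component

    def find(i):
        # Find the component representative, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Each edge enters the filtration at a scale equal to its length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    pairs = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge at this scale, so one of them dies here.
            parent[root_i] = root_j
            pairs.append((0.0, length))

    # One component survives at every scale.
    pairs.append((0.0, math.inf))
    return pairs


# Assumed toy data: two well-separated clusters of "topics".
toy_topics = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
diagram = h0_persistence(toy_topics)
```

In this toy setting, a component that dies late corresponds to a cluster that stays disconnected from the rest over a wide range of scales, which is one plausible reading of a "gap" between groups of topics.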

Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: human behavior analysis, psycho-demographic trait prediction, evaluation and metrics, topic modeling, representation learning, human-in-the-loop, applications
Contribution Types: Model analysis & interpretability, Data resources, Data analysis, Surveys
Languages Studied: English
Submission Number: 8291