Keywords: Information Retrieval, Explainable AI (XAI), Retrieval-Augmented Generation (RAG), Web Quality, Human-centered NLP
Abstract: Webpages increasingly serve two audiences: *humans*, who judge credibility and usefulness, and *machines*, which surface pages in retrieval-augmented generation (RAG) pipelines. Yet it remains unclear how improving a page for human readers affects its visibility to dense retrievers. To study this question, we introduce WEBQX, a three-part framework built on the *WebQuality* dataset of 60k webpages annotated along five human-centric dimensions. The framework contains: (1) WEBQX-Estimator, which predicts perceived quality from structural HTML features and exposes feature-level weaknesses via SHAP explanations; (2) WEBQX-OptAgent, a two-agent LLM pipeline that performs targeted HTML rewrites guided by these explanations; and (3) WEBQX-RAGEval, a module that measures how SHAP-guided HTML edits affect dense retrievability.
Our experiments show that although SHAP-guided rewrites consistently improve predicted human quality, they systematically \emph{degrade} dense retrieval performance on both page- and index-level metrics.
Together, these results provide the first large-scale evidence of a structural misalignment between human-centered improvements and dense retrievability, highlighting the need for joint optimization strategies in RAG-mediated web access.
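For intuition, the quality-estimation step can be sketched as fitting a model on structural HTML features and reading off per-feature SHAP contributions to locate what drags a page's predicted quality down. The sketch below uses a linear model, for which exact SHAP values reduce to \(\phi_i = w_i (x_i - \mathbb{E}[x_i])\); the feature names and synthetic data are illustrative assumptions, not the paper's actual features or dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical structural HTML features (names are illustrative only).
feature_names = ["num_headings", "img_alt_ratio", "text_length_kb", "broken_link_count"]

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 4))
# Synthetic "perceived quality" signal: headings and alt-text help, broken links hurt.
y = 0.5 * X[:, 0] + 0.8 * X[:, 1] + 0.1 * X[:, 2] - 0.9 * X[:, 3]

model = LinearRegression().fit(X, y)

# For a linear model with independent features, SHAP values are exact:
# phi_i = w_i * (x_i - E[x_i]); no sampling approximation is needed.
page = np.array([2.0, 1.0, 5.0, 8.0])  # a hypothetical low-quality page
phi = model.coef_ * (page - X.mean(axis=0))

# The most negative contribution points to the feature to rewrite first.
weakest = feature_names[int(np.argmin(phi))]
print(weakest)
```

A tree ensemble with `shap.TreeExplainer` would play the same role in practice; the linear case is used here only because its SHAP values have a closed form.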
We will release the code and trained components for reproducibility: https://anonymous.4open.science/r/webqxaisq-B38F/README.md.
Paper Type: Long
Research Area: Information Extraction and Retrieval
Research Area Keywords: Information Retrieval and Text Mining, Human-Centered NLP, Interpretability and Analysis of Models for NLP
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: Chinese, English, French, German, Spanish
Submission Number: 3499