CharacterHub: Open-Domain Character Profiling for LLM Role-play via Deep Search

ACL ARR 2026 January Submission8379 Authors

06 Jan 2026 (modified: 07 Jun 2026), ACL ARR 2026 January Submission, CC BY 4.0

Keywords: Character Profiling, Large Language Models, LLM Evaluation

Abstract: Building high-quality character profiles is a foundational prerequisite for developing immersive Role-Playing Language Agents (RPLAs). However, existing profiling methods primarily rely on literature-based extraction or LLM-based generation, which suffer from limited media coverage, high manual costs, and a propensity for factual hallucinations. To address these bottlenecks, we propose CharacterHub, an automated character profiling framework powered by deep search agents. Unlike traditional extractive pipelines, our framework autonomously navigates open web sources to retrieve and aggregate heterogeneous information across multiple dimensions. This agentic approach scales without human intervention, extending high-fidelity profiling beyond literary figures to anime, games, and user-generated characters. To rigorously validate our method, we establish an automatic evaluation protocol that uses large-scale, human-curated data from Fandom as the gold reference. Experimental results demonstrate that our dataset achieves strong alignment with reference sources, notably reaching an 83.13% Support Score in the critical personality dimension, while attaining nearly twice the information density of the Fandom references. We will publicly release the dataset and associated resources.

Paper Type: Long

Research Area: Resources and Evaluation

Research Area Keywords: NLP datasets, automatic evaluation of datasets, metrics

Contribution Types: Data resources

Languages Studied: English

Submission Number: 8379
