Keywords: Relation extraction, Prompt Optimization, Deep Reinforcement Learning, Domain Adaptation
Abstract: Relation extraction (RE) is a fundamental task in information extraction. However, existing supervised approaches rely heavily on large-scale annotated data, limiting their applicability in domain-specific and low-resource scenarios. Prompt-based methods with large language models (LLMs) provide a parameter-efficient alternative; however, their performance is highly sensitive to prompt design, which often requires extensive domain expertise and heuristic trial-and-error. We propose REPO, a deep reinforcement learning (DRL)-based automated prompt optimization framework for domain relation extraction. REPO formulates prompt construction as a structured, sequential decision-making problem, optimizing prompt quality through interaction with a black-box LLM. To enable efficient and stable optimization, we introduce a two-stage framework comprising an initial prompt-construction stage that generates semantically grounded candidates and a DRL-based refinement stage that iteratively improves prompts within a constrained, domain-aware action space. We further design a composite evaluation metric that integrates extraction accuracy and semantic consistency to serve as a dense reward signal. Extensive experiments on multiple relation extraction datasets across the medical, financial, legal, and news domains demonstrate that REPO consistently outperforms existing prompt-based methods and supervised baselines. Ablation studies further confirm the effectiveness and robustness of the proposed reinforcement learning-based prompt optimization strategy.
Paper Type: Long
Research Area: Information Extraction and Retrieval
Research Area Keywords: Information Extraction
Contribution Types: NLP engineering experiment
Languages Studied: Chinese
Submission Number: 7302