Keywords: Relation extraction, Prompt Optimization, Deep Reinforcement Learning, Domain Adaptation
Abstract: Relation extraction (RE) is a fundamental task in information extraction. However, existing supervised approaches rely heavily on large-scale annotated data, limiting their applicability in domain-specific and low-resource scenarios. Prompt-based methods with large language models (LLMs) provide a parameter-efficient alternative; however, their performance is highly sensitive to prompt design, which often requires extensive domain expertise and heuristic trial-and-error. We propose REPO, a deep reinforcement learning (DRL)-based automated prompt optimization framework for domain relation extraction. REPO formulates prompt construction as a structured, sequential decision-making problem, optimizing prompt quality through interaction with a black-box LLM. To enable efficient and stable optimization, we introduce a two-stage framework comprising an initial prompt-construction stage that generates semantically grounded candidates and a DRL-based refinement stage that iteratively improves prompts within a constrained, domain-aware action space. We further design a composite evaluation metric that integrates extraction accuracy and semantic consistency to serve as a dense reward signal. Extensive experiments on multiple relation extraction datasets across the medical, financial, legal, and news domains demonstrate that REPO consistently outperforms existing prompt-based methods and supervised baselines. Ablation studies further confirm the effectiveness and robustness of the proposed reinforcement learning-based prompt optimization strategy.
Paper Type: Long
Research Area: Information Extraction and Retrieval
Research Area Keywords: Information Extraction
Contribution Types: NLP engineering experiment
Languages Studied: Chinese
Submission Number: 7302