Enhancing Regulatory Compliance QA via Hierarchical Semantic Chunking and Domain-Adaptive Reranking

ACL ARR 2025 May Submission 3700 Authors

19 May 2025 (modified: 03 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0
Abstract: As AI-enabled research accelerates technological advances in pharmaceuticals, legislation and regulation worldwide are evolving rapidly, often with significant impact on the industry. Complying with fragmented, frequently updated national regulations is a pressing challenge for multinational pharmaceutical organizations. This paper proposes an AI-powered interactive dialogue system for regulatory compliance that streamlines the interpretation of, and alignment with, evolving regulatory requirements. The system builds on large language models and retrieval-augmented generation, and incorporates two components: HiSACC, a hierarchical semantic chunking method, and BGE-Reranker, a re-ranking model adapted to the domain through fine-tuning. These components are designed to optimize the chunking and re-ranking stages so that responses to regulatory queries are more accurate and context-aware.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: retrieval-augmented generation, question answering, domain adaptation, re-ranking, dense retrieval, document representation, legal NLP, regulatory compliance, grounded generation
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources
Languages Studied: English
Submission Number: 3700
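
The abstract names two stages, hierarchical chunking and re-ranking, without giving their details. The sketch below only illustrates the general shape of such a pipeline and is not the authors' HiSACC method or their fine-tuned BGE-Reranker. It assumes that documents use numbered headings, and it swaps in a toy token-overlap scorer where a fine-tuned re-ranking model would sit. All names (`Chunk`, `hierarchical_chunks`, `overlap_score`, `rerank`, `DOC`) are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    path: tuple  # heading trail, e.g. ("1 Labeling", "1.1 Storage conditions")
    text: str    # body text found under the innermost heading


# Assumption: headings look like "1 Title" or "1.2 Title". A body line that
# happens to start with a number would be misread as a heading in this toy.
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+(.+)$")


def hierarchical_chunks(document):
    """Split on numbered headings, keeping each chunk's heading trail."""
    chunks, trail, body = [], [], []

    def flush():
        text = " ".join(body).strip()
        if text:
            chunks.append(Chunk(tuple(trail), text))
        body.clear()

    for line in document.splitlines():
        line = line.strip()
        match = HEADING.match(line)
        if match:
            flush()  # close the previous section before changing the trail
            depth = match.group(1).count(".") + 1
            del trail[depth - 1:]  # drop headings at this depth or deeper
            trail.append(line)
        elif line:
            body.append(line)
    flush()
    return chunks


def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query, chunk):
    """Toy stand-in for a learned re-ranker: share of query tokens matched."""
    q = tokens(query)
    c = tokens(" ".join(chunk.path) + " " + chunk.text)
    return len(q & c) / len(q) if q else 0.0


def rerank(query, candidates, scorer=overlap_score, top_k=2):
    """Order candidate chunks by score, highest first, and keep top_k."""
    return sorted(candidates, key=lambda ch: scorer(query, ch), reverse=True)[:top_k]


# Invented example text, used only to exercise the functions above.
DOC = """1 Labeling
General labeling rules apply to all products.
1.1 Storage conditions
Labels must state the storage temperature.
2 Reporting
Adverse events must be reported to the authority.
"""

chunks = hierarchical_chunks(DOC)
top = rerank("storage temperature on labels", chunks)
```

Keeping the heading trail with each chunk lets the scorer see the section context as well as the body text. In a real system, `scorer` could be replaced by a call to a fine-tuned cross-encoder without changing `rerank`.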