PANORAMA: A Dataset and Benchmarks Capturing Decision Trails and Rationales in Patent Examination

Published: 18 Sept 2025, Last Modified: 30 Oct 2025
Venue: NeurIPS 2025 Datasets and Benchmarks Track (poster)
License: CC BY-NC 4.0
Keywords: Patent Examination, Large Language Model, Benchmark
TL;DR: Patent examination remains hard for NLP since evaluation needs to consider examiners’ reasoning. PANORAMA captures this with 8,143 U.S. records and full trails (applications, prior art, rejections, allowances) split into stepwise benchmarks.
Abstract: Patent examination remains an ongoing challenge in the NLP literature even after the advent of large language models (LLMs), as it requires extensive yet nuanced human judgment on whether a submitted $\textit{claim}$ meets the statutory standards of $\textit{novelty}$ and $\textit{non-obviousness}$ against previously granted claims ($\textit{prior art}$) in expert domains. Previous NLP studies have approached this challenge as a prediction task (e.g., forecasting grant outcomes) with high-level proxies such as similarity metrics or classifiers trained on historical labels. However, this approach often overlooks the step-by-step evaluations that examiners must make using in-depth information, including the rationales for their decisions provided in $\textit{office action}$ documents, which also makes it harder to measure the current state of techniques for patent review. To fill this gap, we construct PANORAMA, a dataset of 8,143 U.S. patent examination records that preserves the full decision trails, including original applications, all cited references, $\textit{Non-Final Rejections}$, and $\textit{Notices of Allowance}$. PANORAMA also decomposes the trails into sequential benchmarks that emulate patent professionals' review processes and allow researchers to examine LLMs' capabilities at each step. Our findings indicate that, although LLMs are relatively effective at retrieving relevant prior art and pinpointing the pertinent paragraphs, they struggle to assess the novelty and non-obviousness of patent claims. We discuss these results and argue that advancing NLP, including LLMs, in the patent domain requires a deeper understanding of real-world patent examination. Our dataset is openly available at https://huggingface.co/datasets/LG-AI-Research/PANORAMA.
Croissant File: json
Dataset URL: https://huggingface.co/datasets/LG-AI-Research/PANORAMA
Code URL: https://github.com/LGAI-Research/PANORAMA
Primary Area: Datasets & Benchmarks for applications in language modeling and vision language modeling
Submission Number: 2318