PatentEdits: Using LLMs to Rewrite Patents for Novelty

ACL ARR 2024 August Submission195 Authors

15 Aug 2024 (modified: 17 Sept 2024), ACL ARR 2024 August Submission, CC BY 4.0
Abstract: A patent must be deemed novel and non-obvious in order to be granted by the US Patent Office (USPTO). To meet these criteria, patent writers often revise the description of the claimed invention after receiving official feedback. In this work, we examine how patents are revised to overcome objections to novelty. First, we present the PatentEdits dataset, the first to contain more than 400,00 granted patents aligned before and after revision. Next, we label the edit actions in our dataset: a given sentence in the patent is either unchanged, edited, or deleted. We also include the prior work cited by the USPTO examiner during review and study how it influences the patent edits. We explore a new research question for the community: how can language models learn to revise documents for originality? We demonstrate the promise of a two-stage model pipeline for novelty revision: 1) predicting edit actions on the draft sentences using the prior work, followed by 2) predicting the revised text given those edit actions.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Controlled text generation, Edit Models, Document Retrieval, Long-context language models, Efficient Attention
Contribution Types: NLP engineering experiment, Data resources, Data analysis
Languages Studied: English
Submission Number: 195
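
To make the three edit-action labels in the abstract concrete, here is a minimal sketch of how draft sentences might be labeled against a revised version. It is an illustration only: the abstract does not say how the authors align or label sentences, so the function name `label_edit_actions`, the use of character-level similarity from Python's `difflib`, and the 0.6 threshold are all assumptions, not the paper's method.

```python
from difflib import SequenceMatcher


def label_edit_actions(draft, revised, threshold=0.6):
    """Label each draft sentence as 'unchanged', 'edited', or 'deleted'.

    Hypothetical heuristic (not taken from the paper): an exact match in the
    revised text is 'unchanged'; otherwise the best character-level
    similarity against any revised sentence decides between 'edited'
    (at or above the threshold) and 'deleted' (below it).
    """
    revised_set = set(revised)
    labels = []
    for sentence in draft:
        if sentence in revised_set:
            labels.append("unchanged")
            continue
        # Best similarity to any revised sentence; 0.0 if there are none.
        best = max(
            (SequenceMatcher(None, sentence, r).ratio() for r in revised),
            default=0.0,
        )
        labels.append("edited" if best >= threshold else "deleted")
    return labels


# Toy example sentences, invented for illustration.
draft = [
    "A device comprising a sensor.",
    "The sensor is coupled to a housing.",
    "The housing is made of plastic.",
]
revised = [
    "A device comprising a sensor.",
    "The sensor is coupled to a waterproof housing.",
]
print(label_edit_actions(draft, revised))
```

In the pipeline the abstract describes, labels of this kind would be the target of the first stage (predicted from the draft and the cited prior work), and the second stage would generate the revised text conditioned on them.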