Inventive Problem Solving with LLMs: A Benchmark for TRIZ Reasoning

ACL ARR 2026 January Submission7934 Authors

06 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, CC BY 4.0
Keywords: TRIZ, patent mining, LLM application, LLM Reasoning
Abstract: Large language models (LLMs) have been widely used in invention workflows, but effective support requires more than open-ended generative ideation. TRIZ offers a structured framework that can guide LLMs in inventive problem reasoning; however, prior evaluations are small-scale and rarely grounded in patent text. We introduce \ourdataset, a dataset and benchmark for TRIZ reasoning grounded in open technical sources and U.S.\ patents. We design three tasks covering core stages of the TRIZ workflow: contradiction prediction, inventive principle prediction, and grounded TRIZ reasoning. Experiments with multiple LLM baselines show that detecting contradictions is easier than recovering correct trade-off pairs, and that principle prediction benefits from TRIZ-structured reasoning. Our findings also underscore the importance of grounding: semantic retrieval enables evidence-based justifications and helps explain why LLMs fail. The dataset and code are available here: https://anonymous.4open.science/r/trizbench-E519.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: Benchmarking; evaluation; NLP datasets
Contribution Types: Model analysis & interpretability, Data resources
Languages Studied: English
Submission Number: 7934