Inventive Problem Solving with LLMs: A Benchmark for TRIZ Reasoning

ACL ARR 2026 January Submission7934 Authors

06 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, CC BY 4.0
Keywords: TRIZ, patent mining, LLM application, LLM Reasoning
Abstract: Large language models (LLMs) have been widely used in invention workflows, but effective support requires more than open-ended generative ideation. TRIZ offers a structured framework that can guide LLMs in inventive problem reasoning; however, prior evaluations are small-scale and rarely grounded in patent text. We introduce \ourdataset, a dataset and benchmark for TRIZ reasoning grounded in open technical sources and U.S.\ patents. We design three tasks covering core stages of the TRIZ workflow: contradiction prediction, inventive principle prediction, and grounded TRIZ reasoning. Experiments with multiple LLM baselines show that detecting contradictions is easier than recovering correct trade-off pairs, and that principle prediction benefits from TRIZ-structured reasoning. Our findings also underscore the importance of grounding: semantic retrieval enables evidence-based justifications and helps explain why LLMs fail. The dataset and code are available here: https://anonymous.4open.science/r/trizbench-E519.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: Benchmarking; evaluation; NLP datasets
Contribution Types: Model analysis & interpretability, Data resources
Languages Studied: English
Submission Number: 7934