Keywords: Large Language Models, Interdisciplinary Research, NLP for Science, Benchmark, Dataset
Abstract: This work introduces IDRBench, a pioneering benchmark featuring an expert-annotated dataset and a suite of tasks tailored to evaluate LLMs' capabilities in proposing valuable research ideas for Interdisciplinary Research (IDR). To ensure a reliable evaluation, our dataset consists of scientific publications sourced from the ArXiv platform, covering six distinct disciplines and annotated by domain experts with diverse academic backgrounds. The design of evaluation tasks in IDRBench follows a progressive, real-world perspective, reflecting the natural stages of interdisciplinary research development: 1) IDR Paper Identification, 2) IDR Idea Integration, and 3) IDR Idea Recommendation. Using IDRBench, we construct baselines across 10 LLMs and observe that, despite showing some level of IDR awareness, LLMs still struggle to produce high-quality IDR ideas. These findings may not only spark new research directions but also help develop next-generation LLMs that excel in interdisciplinary research.
Submission Number: 25