Keywords: dataset, natural language processing, digital humanities, large language model, education
Abstract: The formation and circulation of ideas in philosophy have profound implications for pedagogical and scholarly practice. Traditional analyses, however, depend on manual reading and subjective interpretation, and are therefore constrained by human cognitive limits. To address these challenges, we introduce InterIDEAS, a dataset designed to bridge philosophy and natural language processing (NLP). By combining theories of intertextuality from literary studies with bibliometric techniques and recent large language models (LLMs), InterIDEAS enables both quantitative and qualitative analysis of the intellectual, social, and historical relations embedded in philosophical texts that are difficult to interpret. The dataset supports the study of philosophy and also contributes to the development of language models by providing a training corpus that challenges and improves their interpretative capacity. InterIDEAS covers over 45,000 pages of key philosophical texts, spans major schools of thought from 1750 to 1950, and features more than 3,150 writers. It demonstrates how philosophy and NLP can contribute to each other and lays the groundwork for future interdisciplinary research.
Primary Area: datasets and benchmarks
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Submission Number: 2379