How Deeply Do LLMs Internalize Human Citation Practices? A Graph-Structural and Embedding-Based Evaluation
Track: tiny / short paper (up to 5 pages)
Keywords: Large Language Models (LLMs), Citation Analysis, Scientific References, Graph Theory, Embedding Similarity, Citation Networks
TL;DR: LLM-suggested citations for papers from top AI conferences (AAAI, NeurIPS, ICML, ICLR) closely match human reference lists in citation-graph structure and embedding similarity, deviating significantly from random baselines.
Abstract: As Large Language Models (LLMs) are integrated into scientific workflows, understanding how they conceptualize the literature becomes critical. We compare LLM-generated citation suggestions with the real references of papers from top AI conferences (AAAI, NeurIPS, ICML, ICLR), analyzing key citation-graph properties: centrality distributions, clustering coefficients, and other structural features. Using OpenAI embeddings of paper titles, we quantify how closely LLM-generated citations align with the ground-truth references. Our findings reveal that LLM-generated citations closely resemble human references in these distributional properties, deviating significantly from random baselines.
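The abstract's comparison can be illustrated with a minimal, self-contained sketch (not the authors' code): it computes one graph-structural statistic (the average clustering coefficient) for a toy "human" citation graph and a toy "LLM-suggested" graph, and measures title-embedding alignment via cosine similarity. The graphs, vectors, and helper names here are illustrative stand-ins; the paper uses real conference reference lists and OpenAI embeddings.

```python
# Hypothetical sketch of the evaluation idea, not the paper's implementation.
from itertools import combinations
import math

def avg_clustering(adj):
    """Average clustering coefficient of an undirected graph.

    `adj` maps each node to the set of its neighbours.
    """
    total = 0.0
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # clustering is defined as 0 for degree < 2
        # Count edges among this node's neighbours (closed triangles).
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy citation graphs (node -> neighbour set), treated as undirected.
human_graph = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
llm_graph = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}

print(round(avg_clustering(human_graph), 3))  # → 0.583
print(round(avg_clustering(llm_graph), 3))    # → 1.0

# Stand-in "embeddings" of two paper titles (real ones come from an API).
print(round(cosine([1.0, 0.0, 1.0], [1.0, 0.5, 1.0]), 3))  # → 0.943
```

In the paper's setting, the same comparison would be run over the real citation graphs of each conference and a matched random-citation baseline, with embedding vectors taken from an OpenAI embedding model rather than hand-written lists.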
Submission Number: 7