CiteGCN-LLM: Citation-Aware Research Papers Classification via Graph Convolutional Networks and Large Language Models

ACL ARR 2026 January Submission5581 Authors

05 Jan 2026 (modified: 20 Mar 2026) · License: CC BY 4.0
Keywords: Research Papers Classification (RPC), Citation Graph, Graph Convolutional Networks (GCN), Large Language Models (LLM)
Abstract: The rapid expansion of scholarly publications has made automatic classification and recommendation of research articles crucial for efficient scientific knowledge management. In this study, we propose CiteGCN-LLM, a unified citation-graph-based framework for research paper classification (RPC) and recommendation that synergistically integrates Graph Convolutional Networks (GCN) and Large Language Models (LLM). We introduce two citation-based graph constructions: (1) RPCG-1, which captures paper–word relationships enriched with cited-by papers, and (2) RPCG-2, which captures paper–author relationships together with their cited-by papers. To inject rich contextual semantics into the graph, we extract deep textual representations from paper abstracts using a pre-trained transformer-based LLM and fuse them with graph-based embeddings, ensuring tight alignment between textual meaning and structural citation cues. We further employ a Top-k labeling strategy to personalize recommendations to individual user preferences, and we curate specialized citation datasets to support our experiments. Extensive evaluations on our citation datasets (arXiv, DBLP, Elsevier, and PubMed) demonstrate that CiteGCN-LLM significantly outperforms state-of-the-art baselines in both classification accuracy and robustness. Our results suggest that combining topological citation signals with deep language understanding can advance intelligent academic search and recommendation systems.
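The abstract's core mechanism — propagating features over a citation graph with a GCN layer and fusing the resulting structural embeddings with LLM-derived text embeddings — can be sketched minimally as below. This is an illustrative NumPy sketch under assumptions, not the authors' implementation: the toy adjacency matrix, the concatenation-based fusion, and all array shapes are placeholders; the paper does not specify its fusion operator or layer configuration here.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step (Kipf & Welling style):
    symmetric-normalized adjacency with self-loops, then ReLU."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))  # D^{-1/2} as a vector
    A_norm = (A_hat * d_inv_sqrt[:, None]) * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)         # ReLU activation

def fuse(gcn_emb, llm_emb):
    """Fuse structural (GCN) and textual (LLM) embeddings per paper.
    Concatenation is one simple choice of fusion (an assumption here)."""
    return np.concatenate([gcn_emb, llm_emb], axis=1)

rng = np.random.default_rng(0)
# Toy citation graph over 3 papers: paper 1 cites / is cited by 0 and 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = rng.normal(size=(3, 8))     # initial node features
W = rng.normal(size=(8, 4))     # GCN weight matrix
llm = rng.normal(size=(3, 16))  # stand-in for LLM abstract embeddings

Z = fuse(gcn_layer(A, H, W), llm)
print(Z.shape)  # fused per-paper embeddings: (3, 20)
```

In practice the fused embeddings `Z` would feed a classification head, with Top-k scores over the predicted labels used to rank papers for a given user.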
Paper Type: Long
Research Area: Machine Learning for NLP
Research Area Keywords: graph-based methods
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 5581