A study on Hierarchical Text Summary applying structured attention and graph methodology

Yunyeong Na, San KIM, Jaekwang KIM

Published: 29 Nov 2024, Last Modified: 30 Sept 2025 | Korean Institute of Intelligent Systems (KIIS) | Everyone | CC BY-NC-ND 4.0

Abstract: Research on summarizing long texts, which extracts the important content so that readers can understand it easily, has been active in recent years. However, a generated summary can differ from the actual content of the document or be hard to grasp intuitively. To overcome these limitations, this study proposes a text summarization method that combines the strengths of the attention mechanism and of graphs. The importance of key words and sentences is computed with an attention mechanism, and the result is converted into a graph that represents and summarizes the text hierarchically. In this graph, nodes encode salient words and sentences, and edges capture inter-word and inter-sentence relations. The node information is then aggregated to predict the document label, and the approach is validated by comparing prediction accuracy against the labeled data. In experiments on the body text of the NYT dataset, which consists of news articles from a variety of categories, the proposed method outperforms baselines that apply only one of the two techniques (attention alone or graph construction alone).

Keywords: hierarchical relationship representation; text summarization; attention mechanism; graph neural networks; natural language processing
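The pipeline described in the abstract (attention-based importance, conversion to a hierarchical graph, aggregation for document-label prediction) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy embeddings, the query vectors, the `top_k` cutoff, the dot-product scoring, and every function name here are assumptions made for illustration, and a plain weighted edge list stands in for the graph neural network mentioned in the keywords.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(vectors, query):
    """Return attention weights of `vectors` against `query` and their weighted sum."""
    weights = softmax([dot(v, query) for v in vectors])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors))
              for i in range(len(query))]
    return weights, pooled


def build_hierarchy(sentences, embeddings, word_query, sent_query, top_k=2):
    """Build a document -> sentence -> word graph as (parent, child, weight) edges."""
    edges = []
    sent_vectors = []
    for s_idx, tokens in enumerate(sentences):
        word_weights, sent_vec = attend([embeddings[t] for t in tokens], word_query)
        sent_vectors.append(sent_vec)
        # Keep only the top_k most attended words as word nodes (assumed cutoff).
        ranked = sorted(zip(tokens, word_weights), key=lambda p: -p[1])[:top_k]
        for token, weight in ranked:
            edges.append((f"sent{s_idx}", token, weight))
    sent_weights, doc_vec = attend(sent_vectors, sent_query)
    for s_idx, weight in enumerate(sent_weights):
        edges.append(("doc", f"sent{s_idx}", weight))
    return edges, doc_vec


def predict_label(doc_vec, label_vectors):
    """Pick the label whose (hypothetical) prototype vector best matches doc_vec."""
    return max(label_vectors, key=lambda name: dot(doc_vec, label_vectors[name]))


# Toy data: hand-made 2-dimensional embeddings, purely illustrative.
embeddings = {
    "stocks": [2.0, 0.1], "fell": [1.0, 0.0], "today": [0.1, 0.2],
    "the": [0.0, 0.1], "team": [0.2, 1.5], "won": [0.1, 1.0],
}
sentences = [["stocks", "fell", "today"], ["the", "team", "won"]]
label_vectors = {"business": [1.0, 0.0], "sports": [0.0, 1.0]}

edges, doc_vec = build_hierarchy(sentences, embeddings,
                                 word_query=[1.0, 1.0], sent_query=[1.0, 1.0])
label = predict_label(doc_vec, label_vectors)
for parent, child, weight in edges:
    print(f"{parent} -> {child}: {weight:.3f}")
print("predicted label:", label)
```

The edge weights double as the hierarchical summary: the heaviest `doc -> sentence` edges point to the sentences the attention step considers most important, and each sentence's word edges list its most salient words. In a fuller version, the aggregation step would presumably be a learned graph neural network rather than the fixed weighted sum used here.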