Abstract: Hierarchical Text Classification (HTC) leverages the hierarchical structure of labels to enhance text categorization. Existing methods use a combination of text and structure encoders to generate a composite representation. However, these methods may struggle to encode the hierarchy and to capture the label correlations that convey the relationships and dependencies among labels. To address these challenges, we introduce the Hierarchy-Aware Label Correlation (HLC) model in this paper. HLC adopts a customized Graphormer as its structure encoder to learn the hierarchy. Graphormer utilizes self-attention to capture global dependencies and explicit structure encoding mechanisms to model relationships among labels. Additionally, HLC is optimized with the Cross-Entropy with Anchor Label (CEAL) loss function, specifically designed to learn hierarchical label correlations. CEAL introduces an anchor label with a fixed score of zero that separates target labels from non-target ones, encouraging HLC to predict scores higher than the anchor for true target labels and lower than the anchor for non-target labels. We conducted experiments on three benchmark datasets and compared HLC with existing methods. The results suggest that HLC is an effective method for HTC.
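The anchor-label idea described above can be illustrated with a minimal sketch. This is a hypothetical reconstruction, not the authors' implementation: it assumes CEAL behaves like a zero-anchored log-sum-exp loss, where the fixed anchor score of 0 is appended to both the non-target scores (which should fall below it) and the negated target scores (which should rise above it).

```python
import math


def ceal_loss(scores, targets):
    """Sketch of a CEAL-style loss for one example (assumed form).

    scores:  list of raw label scores
    targets: multi-hot list, 1 for target labels, 0 for non-target
    """
    # Non-target scores should end up below the anchor score of 0.
    neg = [s for s, t in zip(scores, targets) if t == 0]
    # Target scores should end up above the anchor, so we negate them.
    pos = [-s for s, t in zip(scores, targets) if t == 1]

    def lse_with_anchor(xs):
        # log-sum-exp over the given scores plus the fixed anchor (score 0,
        # contributing exp(0) = 1); small only when all xs are well below 0.
        return math.log(sum(math.exp(x) for x in [0.0] + xs))

    return lse_with_anchor(neg) + lse_with_anchor(pos)
```

With well-separated scores (targets above zero, non-targets below), the loss is near zero; when the ordering is inverted, it grows roughly linearly in the margin of the violation.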