Abstract: Cross-domain text classification uses a labeled source domain to train a well-performing classifier for an unlabeled target domain, an important task in natural language processing. Owing to their outstanding performance in representation learning, deep learning models have been introduced into cross-domain classification to learn domain-invariant representations. However, existing methods assume that the features in a text are independent, ignoring the correlations between them. To this end, we propose a structure-aware method for cross-domain text classification, which exploits both feature semantics and the structure among features to learn invariant representations. First, a knowledge graph is introduced as an additional resource to capture the structure among features, and the text in each domain is mapped to a sub-graph. Then, invariant structure representations between the two sub-graphs are learned with a Graph Attention Network (GAT) and correlation alignment. Finally, the invariant structure representations and the invariant feature representations are combined to learn higher-level invariant representations for cross-domain classification. Extensive experiments demonstrate that our method achieves better classification accuracy than state-of-the-art methods.
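The correlation alignment mentioned above is commonly realized as the CORAL objective: the squared Frobenius distance between the second-order statistics (covariance matrices) of the source and target representations. A minimal sketch of that loss, assuming the sub-graph representations arrive as row-wise feature matrices (the function name `coral_loss` and the random inputs are illustrative, not the paper's code):

```python
import numpy as np

def coral_loss(source, target):
    """CORAL loss: squared Frobenius distance between the covariance
    matrices of source and target features, scaled by 1 / (4 d^2)."""
    d = source.shape[1]
    cs = np.cov(source, rowvar=False)  # (d, d) source covariance
    ct = np.cov(target, rowvar=False)  # (d, d) target covariance
    return np.sum((cs - ct) ** 2) / (4.0 * d * d)

rng = np.random.default_rng(0)
src = rng.normal(size=(100, 8))  # hypothetical source representations
tgt = rng.normal(size=(100, 8))  # hypothetical target representations
print(coral_loss(src, src))  # identical inputs -> loss is 0
print(coral_loss(src, tgt))  # differing statistics -> positive loss
```

Minimizing this quantity pulls the feature statistics of the two domains together, which is the sense in which the learned structure representations become "invariant" across domains.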