Abstract: In this paper, we propose a novel representation learning framework,
namely HIN2Vec, for heterogeneous information networks
(HINs). The core of the proposed framework is a neural network
model, also called HIN2Vec, designed to capture the rich semantics
embedded in HINs by exploiting different types of relationships
among nodes. Given a set of relationships specified in the form
of meta-paths in an HIN, HIN2Vec carries out multiple prediction
training tasks jointly based on a target set of relationships to learn
latent vectors of nodes and meta-paths in the HIN. In addition to
model design, several issues unique to HIN2Vec, including regularization
of meta-path vectors, node type selection in negative sampling,
and cycles in random walks, are examined. To validate our
ideas, we learn latent vectors of nodes on four large-scale real-world
HIN datasets, namely Blogcatalog, Yelp, DBLP and U.S. Patents,
and use them as features for multi-label node classification and
link prediction applications on those networks. Empirical results
show that HIN2Vec soundly outperforms the state-of-the-art representation
learning models for network data, including DeepWalk,
LINE, node2vec, PTE, HINE and ESim, by 6.6% to 23.8% in micro-F1
for multi-label node classification and 5% to 70.8% in MAP for link
prediction.
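
To make the notion of relationships specified as meta-paths concrete, the following is a minimal illustrative sketch, not the HIN2Vec implementation. It assumes a toy HIN stored as an adjacency dictionary with one type label per node, samples a uniform random walk, and lists (source, target, meta-path) triples for node pairs at most a few hops apart on the walk. The helper names (sample_walk, metapath_triples), the max_hops parameter and the toy author/paper/venue graph are hypothetical and chosen only for illustration.

```python
import random


def sample_walk(adj, start, length, rng):
    """Sample a uniform random walk of at most `length` steps from `start`."""
    walk = [start]
    for _ in range(length):
        neighbors = adj.get(walk[-1], [])
        if not neighbors:  # dead end: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk


def metapath_triples(walk, node_type, max_hops):
    """List (source, target, meta-path) triples for pairs within max_hops.

    The meta-path is represented here as the tuple of node types visited
    between source and target (an assumption made for this sketch).
    """
    triples = []
    for i in range(len(walk)):
        last = min(i + max_hops, len(walk) - 1)
        for j in range(i + 1, last + 1):
            path = tuple(node_type[n] for n in walk[i:j + 1])
            triples.append((walk[i], walk[j], path))
    return triples


# Hypothetical toy HIN: authors (A), papers (P), venues (V).
adj = {
    "a1": ["p1"],
    "a2": ["p1", "p2"],
    "p1": ["a1", "a2", "v1"],
    "p2": ["a2", "v1"],
    "v1": ["p1", "p2"],
}
node_type = {"a1": "A", "a2": "A", "p1": "P", "p2": "P", "v1": "V"}

rng = random.Random(0)
walk = sample_walk(adj, "a1", 4, rng)
triples = metapath_triples(walk, node_type, max_hops=2)
for source, target, path in triples:
    print(source, target, "-".join(path))
```

Under these assumptions, each triple could serve as one positive example for a prediction task of the kind described above, with negative examples obtained by replacing one of the two nodes; how that replacement should respect node types is one of the issues the abstract names.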