Representation learning is a crucial task in deep learning that aims to project texts and other symbolic inputs into mathematical embeddings. Traditional representation learning encodes symbolic data into a Euclidean space. However, the high dimensionality of the Euclidean space used for embedding words presents considerable computational and storage challenges. Hyperbolic space has emerged as a promising alternative for word embedding, demonstrating strong representation and generalization capacity, particularly for the latent hierarchies of language data. In this paper, we analyze the Skip-Gram Negative-sampling (SGNS) representation learning method in hyperbolic spaces and explore the relationship between mutual information and hyperbolic embedding. Furthermore, we establish generalization error bounds for hyperbolic embedding. These bounds demonstrate the dimensional parsimony of hyperbolic space and characterize how the generalization error scales with the sample size. Finally, we conduct two experiments on the WordNet dataset and the THUNews dataset, whose results further validate our theoretical findings.
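To make the setting concrete, the sketch below illustrates how an SGNS-style objective can be scored with a hyperbolic distance. It is a minimal illustration only, assuming the Poincaré ball model and using the negated geodesic distance as the similarity score; the paper's exact objective, model of hyperbolic space, and optimization procedure may differ.

```python
import numpy as np

def poincare_dist(u, v, eps=1e-9):
    """Geodesic distance between two points in the Poincare ball model."""
    sq_dist = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + 2.0 * sq_dist / (denom + eps))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_hyperbolic_loss(center, context, negatives):
    """Skip-Gram negative-sampling loss with a hyperbolic similarity score.

    The positive (center, context) pair is scored with sigma(-d), so that
    nearby points in the ball get high probability; each sampled negative
    is pushed away via sigma(d). This mirrors the Euclidean SGNS objective
    with the dot product replaced by negated hyperbolic distance
    (an illustrative assumption, not necessarily the paper's formulation).
    """
    pos = -np.log(sigmoid(-poincare_dist(center, context)))
    neg = -sum(np.log(sigmoid(poincare_dist(center, n))) for n in negatives)
    return pos + neg

# Toy usage: low-dimensional points inside the unit ball.
center = np.array([0.1, 0.2])
context = np.array([0.12, 0.25])
negatives = [np.array([-0.5, 0.4]), np.array([0.6, -0.3])]
print(sgns_hyperbolic_loss(center, context, negatives))
```

One motivation for this substitution is that distances near the boundary of the ball grow rapidly, which lets low-dimensional hyperbolic embeddings separate hierarchical structures that would require many more Euclidean dimensions, consistent with the dimensional parsimony discussed above.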