Abstract: With the ever-increasing pace of urbanization, modeling people's spatiotemporal activities from their online traces has become a crucial task. State-of-the-art methods for this task rely on cross-modal embedding, which maps items from different modalities (e.g., location, time, text) into the same latent space. Despite their inspiring results, existing cross-modal embedding methods merely capture co-occurrences between items without modeling their high-order interactions. In this paper, we first construct two graphs from raw data records, a user-interaction graph layer and an activity graph layer, and then propose a hierarchical cross-modal embedding method that takes these high-order relationships into consideration. The key notion behind our method is a novel hierarchical embedding framework with meta-graphs connecting the different layers. We introduce both inter-record and intra-record meta-graph structures, which enable learning distributed representations that preserve high-order proximities across graphs from different layers. Our empirical experiments on three real-world datasets demonstrate that our method not only outperforms state-of-the-art methods for spatiotemporal activity prediction, but also captures cross-modal proximity at a finer granularity.
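To make the two-layer construction concrete, the sketch below builds a toy activity-layer graph from intra-record co-occurrences of time, location, and keyword items, a user-layer graph from users who share activity items across records, and cross-layer user-item links standing in for the meta-graph connections. The record format, field names, and simple count-based edge weights are illustrative assumptions for exposition, not the paper's actual pipeline.

```python
# Minimal sketch of the two graph layers, under assumed record fields.
from collections import defaultdict
from itertools import combinations

# Hypothetical raw records: (user, time_bin, location_id, keywords).
records = [
    ("u1", "18h", "loc_42", ["dinner", "sushi"]),
    ("u2", "18h", "loc_42", ["sushi"]),
    ("u1", "09h", "loc_07", ["coffee"]),
]

activity_edges = defaultdict(int)  # activity layer: intra-record item co-occurrence
user_edges = defaultdict(int)      # user layer: users linked via shared items (inter-record)
cross_edges = defaultdict(int)     # cross-layer user-item links (meta-graph stand-in)

item_to_users = defaultdict(set)
for user, time_bin, loc, words in records:
    items = [f"t:{time_bin}", f"l:{loc}"] + [f"w:{w}" for w in words]
    for a, b in combinations(items, 2):      # intra-record proximity among modalities
        activity_edges[tuple(sorted((a, b)))] += 1
    for item in items:                       # connect the user to each activity item
        cross_edges[(user, item)] += 1
        item_to_users[item].add(user)

for item, users in item_to_users.items():    # inter-record proximity between users
    for u, v in combinations(sorted(users), 2):
        user_edges[(u, v)] += 1

print(dict(activity_edges))
print(dict(user_edges))
print(dict(cross_edges))
```

In this toy form, edge counts simply record co-occurrence frequency; any embedding objective that preserves proximities over these weighted edges could then be trained on top, which is where the paper's hierarchical framework departs from plain co-occurrence modeling.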