Abstract: Dynamic graph learning is essential for applications involving temporal networks and requires effective modeling of temporal relationships.
Seminal attention-based models like TGAT and DyGFormer rely on sinusoidal time encoders to capture temporal dependencies between edge events. Prior work justified sinusoidal encodings because their inner products depend on the time spans between events, which are crucial features for modeling inter-event relations. However, sinusoidal encodings inherently lose temporal information due to their many-to-one nature and therefore require high dimensions. In this paper, we rigorously study a simpler alternative: the linear time encoder, which avoids temporal information loss caused by sinusoidal functions and reduces the need for high-dimensional time encoders. We show that the self-attention mechanism can effectively learn to compute time spans between events from linear time encodings and extract relevant temporal patterns. Through extensive experiments on six dynamic graph datasets, we demonstrate that the linear time encoder improves the performance of TGAT and DyGFormer in most cases. Moreover, the linear time encoder can lead to significant savings in model parameters with minimal performance loss. For example, compared to a 100-dimensional sinusoidal time encoder, TGAT with a 2-dimensional linear time encoder saves 43% of parameters and achieves higher average precision on five datasets. While both encoders can be used simultaneously, our study highlights the often-overlooked advantages of linear time features in modern dynamic graph models. These findings can positively impact the design choices of various dynamic graph learning architectures and eventually benefit temporal network applications such as recommender systems, communication networks, and traffic forecasting. The experimental code is available at: https://github.com/hsinghuan/dg-linear-time.git.
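To make the contrast in the abstract concrete, here is a minimal PyTorch sketch of the two encoder families it discusses. This is not the authors' implementation; the class names, dimensions, and initializations are illustrative assumptions (see the linked repository for the actual code).

```python
# Minimal sketch (assumptions, not the paper's code) contrasting a
# TGAT-style sinusoidal time encoder with a low-dimensional linear one.
import torch
import torch.nn as nn


class SinusoidalTimeEncoder(nn.Module):
    """Sinusoidal encoder: phi(t) = cos(t * w + b) with learnable w, b.
    cos is many-to-one, so distinct time spans can map to the same
    feature, which is why high dimensions are typically needed."""

    def __init__(self, dim: int = 100):
        super().__init__()
        self.w = nn.Parameter(torch.randn(dim))   # frequencies
        self.b = nn.Parameter(torch.zeros(dim))   # phases

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        # t: (batch,) timestamps -> (batch, dim) features
        return torch.cos(t.unsqueeze(-1) * self.w + self.b)


class LinearTimeEncoder(nn.Module):
    """Linear encoder: phi(t) = t * w + b. Injective in t, so no
    temporal information is lost and a very low dimension suffices."""

    def __init__(self, dim: int = 2):
        super().__init__()
        self.w = nn.Parameter(torch.randn(dim))
        self.b = nn.Parameter(torch.zeros(dim))

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        return t.unsqueeze(-1) * self.w + self.b


if __name__ == "__main__":
    t = torch.tensor([0.0, 1.0, 10.0])
    print(SinusoidalTimeEncoder(dim=4)(t).shape)  # torch.Size([3, 4])
    print(LinearTimeEncoder(dim=2)(t).shape)      # torch.Size([3, 2])
```

Note that the difference of two linear encodings is $(t_i - t_j)\,w$, so time spans remain linearly recoverable from the features, which is consistent with the abstract's claim that self-attention can learn to compute spans from linear time encodings.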
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission:
* Changed the example in the first paragraph of Section 1
* Changed notations and introduced parameters when first mentioning the sinusoidal time encoder in Section 2.3
* Added Figure 4 to illustrate the similarities and differences between the experimented DyGFormer variants
* Boldfaced win counts in Tables 1 and 2
* Emphasized the effects of scaling and linearity of the linear time encoder on UCI in Section 4.2.2
* Further explained why US Legis is easier for time encoders in Section 4.2.3
* Clarified the new notation $d_{tc}$ for the auxiliary experiments in Section 4.3.1
* Added figures and a paragraph to compare the attention score patterns of sinusoidal and linear time encoders in Section 4.3.2 and Appendix D.7
* Added training time and memory usage measurements in Appendix D.4
* Added experiments with TGN and CAWN in Appendix D.5
* Added dynamic node classification results in Appendix D.6
* Fixed typos and grammatical errors
* Added references
* Increased abstract length
Code: https://github.com/hsinghuan/dg-linear-time.git
Supplementary Material: zip
Assigned Action Editor: ~Mark_Coates1
Submission Number: 4657