Abstract: Dynamic graph learning is essential for applications involving temporal networks and requires effective modeling of temporal relationships. Seminal attention-based models like TGAT and DyGFormer rely on sinusoidal time encoders to capture temporal relationships between edge events. In this paper, we study a simpler alternative: the linear time encoder, which avoids temporal information loss caused by sinusoidal functions and reduces the need for high-dimensional time encoders. We show that the self-attention mechanism can effectively learn to compute time spans from linear time encodings and extract relevant temporal patterns. Through extensive experiments on six dynamic graph datasets, we demonstrate that the linear time encoder improves the performance of TGAT and DyGFormer in most cases. Moreover, the linear time encoder can lead to significant savings in model parameters with minimal performance loss. For example, compared to a 100-dimensional sinusoidal time encoder, TGAT with a 2-dimensional linear time encoder saves 43% of parameters and achieves higher average precision on five datasets. These results can be readily used to positively impact the design choices of a wide variety of dynamic graph learning architectures.
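The contrast described in the abstract can be summarized in a few lines of PyTorch. The sketch below is only illustrative: the module names, the affine parameterization of both encoders, and the toy dimensions are assumptions for exposition, not the paper's exact implementation.

```python
import torch
import torch.nn as nn


class SinusoidalTimeEncoder(nn.Module):
    """TGAT-style sinusoidal encoder: phi(t) = cos(t * w + b), typically high-dimensional."""

    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(1, dim)  # learnable frequencies w and phases b

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        # t: (batch, seq_len) of relative timestamps
        return torch.cos(self.proj(t.unsqueeze(-1)))


class LinearTimeEncoder(nn.Module):
    """Linear alternative: an affine map of the timestamp with no periodic nonlinearity,
    so time spans remain recoverable by downstream self-attention even at low dimension."""

    def __init__(self, dim: int = 2):
        super().__init__()
        self.proj = nn.Linear(1, dim)

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        return self.proj(t.unsqueeze(-1))


if __name__ == "__main__":
    t = torch.rand(4, 20) * 100.0                # toy timestamps
    print(SinusoidalTimeEncoder(100)(t).shape)   # torch.Size([4, 20, 100])
    print(LinearTimeEncoder(2)(t).shape)         # torch.Size([4, 20, 2])
```

The dimension gap in this toy example (100 vs. 2) mirrors the parameter-saving comparison reported in the abstract.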
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission:
* Changed the example in the first paragraph of Section 1
* Changed notations and introduced parameters when first mentioning the sinusoidal time encoder in Section 2.3
* Added Figure 4 to illustrate the similarities and differences between the experimented DyGFormer variants
* Boldfaced win counts in Tables 1 and 2
* Emphasized the effects of scaling and linearity of the linear time encoder on UCI in Section 4.2.2
* Further explained why US Legis is easier for time encoders in Section 4.2.3
* Clarified the new notation $d_{tc}$ for the auxiliary experiments in Section 4.3.1
* Added figures and a paragraph to compare the attention score patterns of sinusoidal and linear time encoders in Section 4.3.2 and Appendix D.7
* Added training time and memory usage measurements in Appendix D.4
* Added experiments with TGN and CAWN in Appendix D.5
* Added dynamic node classification results in Appendix D.6
* Fixed typos and grammatical errors
Assigned Action Editor: ~Mark_Coates1
Submission Number: 4657