Lipschitz normalization for self-attention layers with application to graph neural networks

2021 (modified: 23 Sept 2022), ICML 2021, Readers: Everyone
Abstract: Attention-based neural networks are state of the art in a wide range of applications. However, their performance tends to degrade as the number of layers increases. In this work, we show that en...