Financial Networks and Other Adventures in Graph Learning

Béni Egressy

Published: 01 Jan 2024, Last Modified: 14 May 2025, CC BY-SA 4.0

Abstract: This thesis investigates graph problems from both a machine learning and a theoretical perspective. In particular, it explores financial applications, focusing on fraud detection, central bank bailouts, and lending. The thesis is split into two main parts, broadly corresponding to the two perspectives. In the first part, we modify Graph Neural Networks (GNNs) to improve their expressivity and to tailor them to different graph types. We propose a set of modifications to standard GNNs, enabling them to detect complex subgraph patterns in directed multigraphs. We then use these modified GNNs for financial fraud detection, producing state-of-the-art results in this area. We also explore enhancing GNN expressivity and performance through precomputed node features. Spatial node embeddings prove the most successful empirically. However, when comparing the node features, we find that increasing theoretical expressivity often does not translate into improved empirical performance. To explore this further, we propose a new tool, Graphtester, for analyzing expressivity systematically. The tool provides stark evidence that expressivity is rarely the bottleneck for GNN performance in real-world datasets. The second part of the thesis focuses on the theoretical analysis of financial networks, i.e., graphs consisting of financial institutions linked by outstanding debts. We study the problem of optimal bank bailouts by a central bank, determining the computational complexity under various conditions and highlighting potential vulnerabilities to exploitation. Subsequently, we analyze the impact of network structure on lending behavior within financial networks. We explore how network topology influences lending prices and network stability. Finally, we briefly delve into the realm of large language models (LLMs). We propose a simple and effective method for incorporating specific keywords or topics into text generation.