Abstract: Recent years have seen a significant shift in Artificial Intelligence from model-centric to data-centric approaches, highlighted by the success of large foundational models. Yet despite numerous innovations in graph machine learning model design, graph-structured data often suffers from data quality issues that jeopardize the progress of data-centric AI in graph-structured applications. Our proposed tutorial addresses this gap by raising awareness of data quality issues within the graph machine learning community. We provide an overview of topology, imbalance, bias, limited-data, and abnormality issues in graph data. We also highlight recent developments in foundational graph models that focus on identifying, investigating, mitigating, and resolving these issues.