Label Informativeness-based Minority Oversampling in Graphs (LIMO)

Rishav Das; Sikta Mohanty; Rucha Bhalchandra Joshi; Subhankar Mishra

27 Sept 2024 (modified: 05 Feb 2025). Submitted to ICLR 2025. License: CC BY 4.0

Keywords: class imbalance, graph neural networks, mutual information, label informativeness

TL;DR: Label Informativeness-based Minority Oversampling in Graphs

Abstract: Class imbalance is pervasive in real-world datasets, particularly in graph-structured data, where certain classes are significantly underrepresented. This imbalance can severely degrade the performance of Graph Neural Networks (GNNs), leading to biased learning or overfitting. Existing oversampling techniques often overlook intrinsic properties of graphs such as Label Informativeness (LI), which measures how much information a neighbor's label provides about a node's label. To address this, we propose Label Informativeness-based Minority Oversampling (LIMO), a novel algorithm that oversamples minority-class nodes by augmenting edges so as to maximize LI. The technique generates a balanced synthetic graph that enhances GNN performance without significantly increasing data volume. Our theoretical analysis shows that the effectiveness of GNNs is directly proportional to label informativeness, with mutual information as a mediator. We also analyze the derivative of LI to show how variations in the number of inter-class edges influence it. Experimental results on a range of homophilous and heterophilous benchmark datasets demonstrate that LIMO improves node classification performance across different imbalance ratios, with particularly significant improvements on heterophilous graph datasets. Our code is available at https://anonymous.4open.science/r/limo-12CC/
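
To make the notion of label informativeness concrete, the following is a minimal sketch of how LI could be computed on a small labeled graph. It assumes the edge-based formulation commonly used in the LI literature, LI = I(y_u; y_v) / H(y_u) for the endpoints of a uniformly sampled edge (u, v); the abstract does not state the formula, so this choice, the function name `label_informativeness`, and the toy graphs are all illustrative assumptions and not the authors' implementation.

```python
import math
from collections import Counter


def label_informativeness(edges, labels):
    """Assumed edge-based LI: mutual information between the labels of the two
    endpoints of a random edge, normalized by the endpoint label entropy."""
    # Count each undirected edge in both directions so the joint is symmetric.
    joint = Counter()
    for u, v in edges:
        joint[(labels[u], labels[v])] += 1
        joint[(labels[v], labels[u])] += 1
    total = sum(joint.values())

    # Marginal label distribution of an edge endpoint (degree-weighted).
    marginal = Counter()
    for (a, _), n in joint.items():
        marginal[a] += n

    entropy = -sum((n / total) * math.log(n / total) for n in marginal.values())
    if entropy == 0.0:
        # Only one class appears on edge endpoints; return 0 by convention here.
        return 0.0

    mutual_info = sum(
        (n / total)
        * math.log((n / total) / ((marginal[a] / total) * (marginal[b] / total)))
        for (a, b), n in joint.items()
    )
    return mutual_info / entropy


# Toy graph: four nodes, two classes.
labels = {0: "a", 1: "a", 2: "b", 3: "b"}
li_homophilous = label_informativeness([(0, 1), (2, 3)], labels)
li_heterophilous = label_informativeness([(0, 2), (1, 3)], labels)
li_mixed = label_informativeness([(0, 1), (2, 3), (0, 2), (1, 3)], labels)
```

Under this assumed definition, both the purely intra-class and the purely inter-class toy graphs are maximally informative, since a neighbor's label determines the node's label in either case, while the evenly mixed graph carries no information. This is consistent with the abstract's point that LI, rather than homophily alone, is the property being targeted.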

Primary Area: learning on graphs and other geometries & topologies

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 10852
