Hypergraph-Based Machine Learning for Robust Handling of Missing Data

Vivatchai Kaveeta; Prompong Sugunnasil

27 Sept 2024 (modified: 28 Nov 2024) | ICLR 2025 Conference Withdrawn Submission | CC BY 4.0

Keywords: Hypergraph, Machine Learning, Missing Data

TL;DR: This paper proposes a hypergraph-based machine learning method that directly learns from datasets with missing data, eliminating the need for imputation.

Abstract: Missing values are common in real-world datasets, and handling them is a major challenge in machine learning. This work introduces a hypergraph representation constructed directly from datasets that contain missing values, without relying on traditional techniques such as deletion or imputation. The representation preserves the relationships between variables and models multi-variable interactions, which enables the model to capture dataset structure that other methods may overlook. The proposed hypergraph learning method can be applied to both classification and regression tasks. For real-world evaluation, we use the MIMIC-III and Adult datasets, focusing on classification performance. Additionally, synthetic datasets with controlled missingness are used to evaluate the method's effectiveness across varying degrees of missingness. When compared with imputation and prediction techniques, the hypergraph approach achieves competitive or superior performance. Specifically, our method maintains high performance in scenarios with significant levels of missing data. We demonstrate that the hypergraph representation not only offers a more resilient framework for learning from datasets with missing data but also scales effectively across diverse datasets and prediction tasks. The method maintains stable performance under various degrees of missingness, demonstrating its potential as a valuable machine learning tool with high data reliability and prediction quality.
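
The abstract does not spell out how the hypergraph is built, so the following is only an illustrative sketch of one plausible construction, not the authors' method. It assumes each row becomes a hyperedge and each observed (column, value) pair becomes a node, so a missing cell simply contributes no node and nothing is imputed or deleted. The function name `build_hypergraph`, the use of `None` as the missing marker, and the toy rows are all hypothetical; continuous columns would need to be discretised first.

```python
from typing import Dict, Hashable, List, Optional, Sequence, Set, Tuple

# A node is one observed cell value: (column name, value). Assumption for this sketch.
Node = Tuple[str, Hashable]


def build_hypergraph(
    rows: Sequence[Sequence[Optional[Hashable]]],
    columns: Sequence[str],
) -> Tuple[Dict[Node, int], List[Set[int]]]:
    """Return (node_index, hyperedges) with one hyperedge per row.

    A hyperedge holds the indices of the nodes for the cells that are
    observed in that row. Cells equal to None are treated as missing
    and are skipped, so no value is ever filled in.
    """
    node_index: Dict[Node, int] = {}
    hyperedges: List[Set[int]] = []
    for row in rows:
        edge: Set[int] = set()
        for col, value in zip(columns, row):
            if value is None:  # missing cell: contributes no node
                continue
            # Reuse the node id if this (column, value) pair was seen before.
            idx = node_index.setdefault((col, value), len(node_index))
            edge.add(idx)
        hyperedges.append(edge)
    return node_index, hyperedges


# Toy data (hypothetical): every row has exactly one missing cell.
columns = ("age", "workclass", "income")
rows = [
    ("young", "private", None),
    ("young", None, "high"),
    (None, "private", "high"),
]
node_index, hyperedges = build_hypergraph(rows, columns)
```

In this toy example no row is complete, so row deletion would discard the whole dataset, while the hypergraph still links every row to the values it does share with the others. A downstream model would then learn on this structure; that step is outside the scope of the sketch.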

Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 10757
