Abstract: Pattern discovery in data plays a crucial role across diverse domains, including healthcare, risk assessment, and machinery maintenance. In contrast to black-box deep learning models, symbolic rule discovery emerges as a key data mining task, generating human-interpretable rules that offer both transparency and intuitive explainability. This paper introduces the optimal pattern detection tree (OPDT) for binary classification, a rule-based machine learning model based on novel mixed integer programming to extract an optimal pattern in data. This optimization-based approach discovers a hidden underlying pattern in datasets, when it exists, by identifying an optimal rule that maximizes coverage while minimizing the false positive rate due to misclassification. Our computational experiments show that OPDT discovers a pattern with optimality guarantees on moderately sized datasets within reasonable runtime.
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Satoshi_Hara1
Submission Number: 6854
Loading