TL;DR: A novel instance correlation graph-based naive Bayes (ICGNB) algorithm is proposed.
Abstract: Due to its simplicity, effectiveness and robustness, naive Bayes (NB) remains one of the top 10 data mining algorithms. To improve its performance, a large number of improved algorithms have been proposed in the last few decades. However, apart from Gaussian naive Bayes (GNB), there is little work on numerical attributes. Moreover, none of these algorithms takes into account the correlations among instances. To fill this gap, we propose a novel algorithm called instance correlation graph-based naive Bayes (ICGNB). Specifically, it first uses the original attributes to construct an instance correlation graph (ICG) that represents the correlations among instances. Then, it employs a variational graph auto-encoder (VGAE) to generate new attributes from the constructed ICG and uses them to augment the original attributes. Finally, it weights each augmented attribute to alleviate attribute redundancy and builds GNB on the weighted attributes. The experimental results on tens of datasets show that ICGNB significantly outperforms its competitors. Our code and datasets are available at https://github.com/jiangliangxiao/ICGNB.
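The first and last steps described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the released ICGNB code, and several details are assumptions: the graph is built by linking each instance to its k nearest neighbours under Euclidean distance, the attribute weights are supplied by hand, and each weight is applied as an exponent on that attribute's Gaussian likelihood (a multiplier in log space). The function names `knn_graph`, `fit_gnb` and `predict_weighted_gnb` are hypothetical, and the VGAE step is omitted entirely.

```python
import math

def knn_graph(X, k):
    """Assumed ICG construction: link each instance to its k nearest neighbours."""
    n = len(X)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        dists = sorted((math.dist(X[i], X[j]), j) for j in range(n) if j != i)
        for _, j in dists[:k]:
            adj[i][j] = adj[j][i] = 1  # keep the graph undirected
    return adj

def fit_gnb(X, y, eps=1e-9):
    """Estimate the class prior and per-attribute mean and variance for each class."""
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        m = len(rows)
        cols = list(zip(*rows))
        means = [sum(col) / m for col in cols]
        # eps keeps the variance strictly positive for constant attributes
        variances = [sum((v - mu) ** 2 for v in col) / m + eps
                     for col, mu in zip(cols, means)]
        model[c] = (m / len(X), means, variances)
    return model

def predict_weighted_gnb(model, weights, x):
    """Return the class maximising log P(c) + sum_j w_j * log N(x_j; mu_cj, var_cj)."""
    best, best_score = None, -math.inf
    for c, (prior, means, variances) in model.items():
        score = math.log(prior)
        for w, v, mu, var in zip(weights, x, means, variances):
            log_pdf = -0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
            score += w * log_pdf  # assumed weighting scheme
        if score > best_score:
            best, best_score = c, score
    return best

# Toy data: two well-separated classes with two numerical attributes.
X = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1], [5.0, 6.0], [5.2, 5.8], [4.9, 6.1]]
y = [0, 0, 0, 1, 1, 1]
adj = knn_graph(X, k=2)
model = fit_gnb(X, y)
weights = [1.0, 0.5]  # hand-picked for illustration only
pred_a = predict_weighted_gnb(model, weights, [1.1, 2.0])
pred_b = predict_weighted_gnb(model, weights, [5.1, 6.0])
```

In this sketch a weight below 1 dampens an attribute's influence on the class score, which is one common way to soften redundancy among attributes. In the full method, the weighted attributes would also include the new attributes generated by the VGAE from the graph.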
Lay Summary: Naive Bayes (NB) is a simple and widely used method that predicts the classes of unknown items from existing items whose classes are known. The original information about the existing items is usually limited, so we wanted to obtain more information by representing and using the correlations among items.
Specifically, we construct a graph to represent these correlations and use a powerful information generation program to mine new information from the constructed graph. We then combine the original information with the newly generated information and reduce the redundancy in the combined information. Surprisingly, we found that this combined information works really well and improves NB’s predictive ability, outperforming existing related methods in most cases.
Our method has implications for how to predict the classes of unknown items by leveraging correlations among items. To help other researchers explore this idea, we have released our method called ICGNB, along with the method’s settings.
Link To Code: https://github.com/jiangliangxiao/ICGNB
Primary Area: General Machine Learning->Supervised Learning
Keywords: Naive Bayes, Numerical attribute, Instance correlation graph, Variational graph auto-encoder
Submission Number: 1002