Abstract: The $V$-matrix Support Vector Machine (VSVM) is a machine learning method recently proposed by Vapnik and Izmailov that integrates the positional relationships among training samples into model learning and produces its decision via conditional probability. However, VSVM overlooks the distribution information hidden in the data, which plays a pivotal role in training, and it makes no use of testing samples. To fully exploit this distribution information, this paper proposes a novel Distribution Metric Based $V$-matrix Support Vector Machine (DVSVM) that builds upon VSVM. DVSVM incorporates the distributional information implicit in the data by measuring the distances between samples with the Wasserstein distance. Unlike VSVM, it also accounts for the positional relationships of testing samples. It is further proved theoretically that DVSVM reduces to VSVM under certain conditions. Experimental results on several synthetic datasets and real-world disease datasets demonstrate the superiority of DVSVM.
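To make the notion of a Wasserstein distance between samples concrete, the following is a minimal sketch, not the paper's construction. The abstract does not say how a sample is turned into a distribution, so the sketch assumes each sample's feature values form a one-dimensional empirical distribution with uniform weights and that all samples have the same number of features. The function names `wasserstein_1d` and `pairwise_wasserstein` are hypothetical.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size empirical samples.

    Assumption: uniform weights on each point. In one dimension the optimal
    transport plan then matches sorted values, so the distance is the mean
    absolute difference of the order statistics.
    """
    if len(xs) == 0 or len(xs) != len(ys):
        raise ValueError("samples must be non-empty and of equal size")
    xs_sorted = sorted(xs)
    ys_sorted = sorted(ys)
    return sum(abs(a - b) for a, b in zip(xs_sorted, ys_sorted)) / len(xs)


def pairwise_wasserstein(samples):
    """Symmetric matrix of pairwise distances between samples.

    Assumption: each sample is a list of feature values treated as an
    empirical distribution; this is an illustrative choice only.
    """
    n = len(samples)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = wasserstein_1d(samples[i], samples[j])
            dist[i][j] = d
            dist[j][i] = d
    return dist


if __name__ == "__main__":
    samples = [[0.0, 1.0, 3.0], [5.0, 6.0, 8.0], [3.0, 0.0, 1.0]]
    for row in pairwise_wasserstein(samples):
        print(row)
```

Under these assumptions the distance ignores the order of feature values within a sample and depends only on how the values are distributed, which is the sense in which such a metric captures distributional rather than purely positional information.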