Gaussian mixture model with local consistency: a hierarchical minimum message length-based approach

Published: 01 Jan 2024, Last Modified: 26 Aug 2024 · Int. J. Mach. Learn. Cybern. 2024 · CC BY-SA 4.0
Abstract: The Gaussian mixture model (GMM) is widely used in many domains, e.g., data mining. Unsupervised learning of finite mixtures (ULFM) based on the minimum message length (MML) criterion enables adaptive model selection and parameter estimation. However, some datasets have a hierarchical structure. If the MML criterion does not account for the hierarchical structure of the prior, the prior's coding length in the criterion is inaccurate, making it difficult to achieve a good trade-off between model complexity and goodness of fit. Therefore, a locally consistent GMM with a hierarchical MML criterion (GM-HMML) is proposed. First, the MML criterion determines the mixing probabilities (annihilating unnecessary components). To accurately control the competition among the remaining necessary components, a hierarchical MML criterion is proposed. Second, the hierarchical MML criterion is regularized with the graph Laplacian: the manifold structure is incorporated into the parameter estimator to avoid the overfitting that a fine-grained prior can cause. The proposed criterion strengthens component annihilation, avoiding the annihilation of necessary components while reducing the number of iterations. The approach is evaluated on real datasets and achieves good model order selection and clustering accuracy.
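The component-annihilation mechanism that MML-based mixture learning builds on can be sketched as follows. This is a generic Figueiredo–Jain-style ULFM sketch in NumPy, not the authors' GM-HMML implementation: the hierarchical prior and the graph-Laplacian regularizer are omitted, and all function names are illustrative. The key step is subtracting half the per-component parameter count from each component's responsibility mass before renormalizing the mixing weights, which drives weakly supported components to zero.

```python
import numpy as np

def _gauss_pdf(X, mean, cov):
    """Multivariate normal density evaluated at each row of X."""
    d = len(mean)
    diff = X - mean
    inv = np.linalg.inv(cov)
    expo = -0.5 * np.einsum("ij,jk,ik->i", diff, inv, diff)
    return np.exp(expo) / np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))

def ulfm_gmm(X, k_max=10, n_iter=100, seed=0):
    """EM for a GMM with MML-penalised mixing weights (illustrative sketch):
    components whose adjusted weight drops to zero are annihilated, so the
    model order is selected adaptively during fitting."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    n_params = d + d * (d + 1) / 2          # free parameters per component
    mu = X[rng.choice(n, size=k_max, replace=False)]
    cov = np.stack([np.cov(X.T) + 1e-6 * np.eye(d)] * k_max)
    w = np.full(k_max, 1.0 / k_max)
    for _ in range(n_iter):
        # E-step: responsibilities of each surviving component
        pdf = np.stack([_gauss_pdf(X, mu[k], cov[k]) for k in range(len(w))],
                       axis=1)
        resp = w * pdf
        resp /= resp.sum(axis=1, keepdims=True) + 1e-300
        nk = resp.sum(axis=0)
        # MML-penalised weight update: subtracting n_params/2 before
        # renormalising annihilates weakly supported components
        w = np.maximum(nk - n_params / 2.0, 0.0)
        if w.sum() == 0:
            break
        w /= w.sum()
        keep = w > 0                         # drop annihilated components
        w, mu, cov = w[keep], mu[keep], cov[keep]
        resp, nk = resp[:, keep], nk[keep]
        # M-step for surviving components
        for k in range(len(w)):
            mu[k] = resp[:, k] @ X / nk[k]
            diff = X - mu[k]
            cov[k] = (resp[:, k, None] * diff).T @ diff / nk[k] \
                     + 1e-6 * np.eye(d)
    return w, mu, cov
```

In GM-HMML, per the abstract, this penalty is replaced by a hierarchical coding length for the prior and augmented with a graph-Laplacian term over the responsibilities, so that nearby points on the data manifold receive consistent component assignments.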