Abstract: This article presents a method for reducing the search space of clustering parameters. This is
achieved by selecting the most appropriate data transformation methods and dissimilarity measures at the
stage preceding the actual execution of clustering. To compare candidate methods, the silhouette coefficient is
used, computed by treating the class labels of a small labeled dataset as cluster labels. The results of
experimental validation of the proposed approach for clustering news texts are presented.
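
The selection step described above can be illustrated with a minimal sketch. It is not the article's implementation: the toy term-count vectors, the class names, and the helper functions `euclidean`, `cosine_dissimilarity`, and `silhouette` are all assumptions made for illustration, and only dissimilarity measures (not data transformations) are compared. The sketch computes the mean silhouette coefficient with the known class labels standing in for cluster labels, then keeps the measure with the highest score.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_dissimilarity(u, v):
    """One minus cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def silhouette(points, labels, dissimilarity):
    """Mean silhouette coefficient, using class labels as cluster labels.

    Assumes at least two distinct labels are present.
    """
    scores = []
    for i, (p, own) in enumerate(zip(points, labels)):
        # Collect dissimilarities from p to every other point, grouped by label.
        by_label = {}
        for j, (q, other) in enumerate(zip(points, labels)):
            if i != j:
                by_label.setdefault(other, []).append(dissimilarity(p, q))
        if own not in by_label:
            # A class with a single member gets a silhouette of 0 by convention.
            scores.append(0.0)
            continue
        # a: mean dissimilarity to the point's own class.
        a = sum(by_label[own]) / len(by_label[own])
        # b: smallest mean dissimilarity to any other class.
        b = min(sum(d) / len(d) for lab, d in by_label.items() if lab != own)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical small labeled dataset: term counts for two topic words.
# Documents of the same class differ in length but not in word proportions.
points = [(2, 0), (8, 1), (0, 3), (1, 9)]
labels = ["sport", "sport", "politics", "politics"]

# Candidate dissimilarity measures to compare before any clustering is run.
candidates = {
    "euclidean": euclidean,
    "cosine": cosine_dissimilarity,
}

scores = {name: silhouette(points, labels, fn) for name, fn in candidates.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("selected measure:", best)
```

On this toy data the cosine dissimilarity separates the two classes better than the Euclidean distance, because it ignores document length. A candidate that scores poorly here can be dropped before clustering, which is how the parameter search space shrinks.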