Abstract: We introduce an ensemble-based dimensionality reduction approach that addresses the high dimensionality of an unlabeled industrial time-series dataset, with the aim of producing robust data labels. The ensemble comprises a self-supervised learning method to improve data quality, an unsupervised dimensionality reduction step to shrink the large feature space, and a chunk-based incremental dimensionality reduction step to further increase confidence in the data labels. Because the time-series dataset is massive, we divide it into several chunks and evaluate each chunk’s quality using a time-series clustering method and clustering metrics. The experiments show that clustering performance improved significantly for all chunks after applying the ensemble approach.
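The abstract does not name the specific reduction methods, so the following is only a minimal sketch of the chunk-based idea under stated assumptions: a dataset too large to process at once is split into chunks, per-feature statistics are merged incrementally across chunks, and low-information features are dropped. Variance thresholding is used here purely as a stand-in reduction technique; the function names (`chunked`, `merge_stats`, `select_features`), the chunk size, and the threshold are hypothetical and not taken from the paper.

```python
from statistics import fmean


def chunked(rows, chunk_size):
    """Yield consecutive chunks of at most chunk_size rows."""
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


def merge_stats(count, mean, m2, chunk):
    """Merge one chunk into the running per-feature count, mean, and M2
    (sum of squared deviations), using the pairwise update of Chan et al."""
    n_b = len(chunk)
    total = count + n_b
    new_mean, new_m2 = [], []
    for j in range(len(chunk[0])):
        col = [row[j] for row in chunk]
        mean_b = fmean(col)
        m2_b = sum((x - mean_b) ** 2 for x in col)
        delta = mean_b - mean[j]
        new_mean.append(mean[j] + delta * n_b / total)
        new_m2.append(m2[j] + m2_b + delta ** 2 * count * n_b / total)
    return total, new_mean, new_m2


def select_features(rows, chunk_size, min_variance):
    """Return indices of features whose incrementally estimated
    population variance is at least min_variance (assumed criterion)."""
    n_features = len(rows[0])
    count, mean, m2 = 0, [0.0] * n_features, [0.0] * n_features
    for chunk in chunked(rows, chunk_size):
        count, mean, m2 = merge_stats(count, mean, m2, chunk)
    return [j for j in range(n_features) if m2[j] / count >= min_variance]


# Toy data (synthetic, for illustration only): features 0 and 2 vary,
# feature 1 is nearly constant and carries almost no information.
rows = [
    [float(t % 7), 1.0 + 1e-4 * (t % 2), float((t * 3) % 11)]
    for t in range(100)
]
kept = select_features(rows, chunk_size=16, min_variance=0.01)
reduced = [[row[j] for j in kept] for row in rows]
```

Only one chunk is held in working memory at a time, which is the property that makes a chunk-based scheme attractive for massive time series. In a fuller pipeline, each reduced chunk would then be clustered and scored with a clustering metric, as the abstract describes.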