Automatic Morphological Classification of Galaxies: Convolutional Autoencoder and Bagging-based Multiclustering Model

Chichun Zhou, Yizhou Gu, Zesen Lin, Guanwen Fang

Published: 01 Feb 2022, Last Modified: 28 Sept 2024, The Astronomical Journal, CC BY 4.0

Abstract: To obtain morphological information for unlabeled galaxies, we present an unsupervised machine-learning (UML) method for the morphological classification of galaxies. The method has two components: (1) a convolutional autoencoder (CAE) reduces the dimensionality of the imaging data and extracts features; (2) a bagging-based multiclustering model yields high-confidence classifications at the cost of rejecting disputed sources that receive inconsistent votes. We apply this method to the sample of galaxies with H < 24.5 in CANDELS. Galaxies are clustered into 100 groups, each containing galaxies with analogous characteristics. To explore the robustness of the morphological classifications, we merge the 100 groups into five categories by visual verification: spheroid, early-type disk, late-type disk, irregular, and unclassifiable. After eliminating the unclassifiable category and the sources with inconsistent voting, the purity of the remaining four subclasses is significantly improved. Massive galaxies (M* > 10^10 M⊙) are selected to investigate the connection with other physical properties. The classification scheme separates galaxies well in the U − V and V − J color space and in Gini–M20 space. Sérsic indices and effective radii show a gradual trend from the spheroid subclass to the irregular subclass. These results suggest that the combination of CAE and a multiclustering strategy is an effective method for clustering galaxies with similar features and can yield high-quality morphological classifications. Our study demonstrates the feasibility of UML in morphological analysis, which can be developed further to serve future observations made with the China Space Station Telescope.
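The voting step in component (2) can be illustrated with a minimal sketch. It assumes that several clustering runs have already assigned each galaxy a group label, that labels from different runs are first aligned to a reference run by maximum overlap, and that a source is kept only when all runs agree. The abstract does not specify the alignment procedure, the voting threshold, or the clustering algorithms, so these choices and the names `align_to_reference` and `consensus_vote` are hypothetical.

```python
import numpy as np

def align_to_reference(ref, labels, n_clusters):
    """Relabel `labels` so each cluster takes the reference label it overlaps most.

    Assumption: maximum-overlap matching; this is not necessarily one-to-one.
    """
    # Contingency table: rows are clusters in `labels`, columns are clusters in `ref`.
    table = np.zeros((n_clusters, n_clusters), dtype=int)
    np.add.at(table, (labels, ref), 1)
    mapping = table.argmax(axis=1)
    return mapping[labels]

def consensus_vote(label_sets, n_clusters):
    """Keep a source only if every clustering run votes for the same group.

    Returns the consensus labels (-1 marks a rejected, disputed source)
    and a boolean mask of accepted sources.
    """
    ref = np.asarray(label_sets[0])
    aligned = np.stack(
        [ref] + [align_to_reference(ref, np.asarray(l), n_clusters)
                 for l in label_sets[1:]]
    )
    accepted = (aligned == aligned[0]).all(axis=0)
    consensus = np.where(accepted, aligned[0], -1)
    return consensus, accepted

# Toy example: three runs over six sources and three groups.
run_a = [0, 0, 1, 1, 2, 2]
run_b = [1, 1, 2, 2, 0, 0]  # same partition as run_a, permuted label names
run_c = [0, 0, 1, 2, 2, 2]  # disagrees with the others on the fourth source
consensus, accepted = consensus_vote([run_a, run_b, run_c], n_clusters=3)
print(consensus.tolist())  # [0, 0, 1, -1, 2, 2]
```

In this toy case the fourth source is rejected because one run places it in a different group, which mirrors the trade-off described in the abstract: fewer classified sources in exchange for higher confidence in those that remain.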