Identifying uncertain galaxy morphologies using unsupervised learning

K. Edwards, M. Gaber

Research output: Contribution to conference › Paper › peer-review



With the onset of massive cosmological data collection through surveys such as the Sloan Digital Sky Survey (SDSS), galaxy classification has largely been accomplished with the help of citizen science communities such as Galaxy Zoo. However, an analysis of one of the Galaxy Zoo morphological classification data sets has shown that a significant majority of all classified galaxies are, in fact, labelled as "Uncertain". This has driven us to conduct experiments with the k-means clustering algorithm on data obtained from the SDSS database using each galaxy's right ascension and declination values, together with the Galaxy Zoo morphology class label. This paper identifies the best attributes for clustering using a heuristic approach and, accordingly, applies an unsupervised learning technique in order to improve the classification of galaxies labelled as "Uncertain" and to increase the overall accuracy of such data clustering processes. Through this heuristic approach, it is observed that the accuracy of classes-to-clusters evaluation is further improved by approximately 10-15% when the best combination of attributes is selected via information gain. An accuracy of 82.627% was also achieved after conducting various experiments on the galaxies labelled as "Uncertain" and reinserting them into the original data set. It is concluded that a vast majority of these galaxies are, in fact, of spiral morphology, with a small subset potentially consisting of stars, elliptical galaxies or galaxies of other morphological variants.
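The classes-to-clusters evaluation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the two unnamed numeric attributes, the function names, the farthest-first initialisation (chosen here only to make the run deterministic) and the majority-vote mapping of clusters to classes are all assumptions made for illustration. The idea is to cluster without looking at the class labels, then check how well the clusters line up with the labels afterwards.

```python
from collections import Counter


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda c: sq_dist(point, centroids[c]))


def farthest_first(points, k):
    """Deterministic initialisation (an assumption of this sketch):
    start from the first point, then repeatedly add the point that is
    farthest from all centroids chosen so far."""
    centroids = [tuple(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(tuple(farthest))
    return centroids


def kmeans(points, k, max_iters=100):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    centroids = farthest_first(points, k)
    for _ in range(max_iters):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return [nearest(p, centroids) for p in points]


def classes_to_clusters_accuracy(cluster_ids, class_labels):
    """Map each cluster to its most frequent class label and report the
    fraction of points whose label matches the label of their cluster."""
    correct = 0
    for c in set(cluster_ids):
        members = [y for cid, y in zip(cluster_ids, class_labels) if cid == c]
        correct += Counter(members).most_common(1)[0][1]
    return correct / len(class_labels)


# Toy data: two hypothetical numeric attributes per object (not real SDSS values).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
classes = ["spiral"] * 4 + ["elliptical"] * 4

cluster_ids = kmeans(points, k=2)
accuracy = classes_to_clusters_accuracy(cluster_ids, classes)
print(f"classes-to-clusters accuracy: {accuracy:.3f}")
```

Repeating this evaluation over different attribute subsets and keeping the subset with the highest accuracy would be one simple way to compare attribute combinations; the paper's own selection uses information gain, which this sketch does not implement.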
Original language: English
Publication status: Published - 9 Jun 2013
Event: The 12th International Conference on Artificial Intelligence and Soft Computing ICAISC 2013 - Zakopane, Poland
Duration: 9 Jun 2013 to 13 Jun 2013



