The VIMOS Public Extragalactic Redshift Survey (VIPERS): a support vector machine classification of galaxies, stars, and AGNs

K. Malek, A. Solarz, A. Pollo, A. Fritz, B. Garilli, M. Scodeggio, A. Iovino, B. R. Granett, U. Abbas, C. Adami, S. Arnouts, J. Bel, M. Bolzonella, D. Bottini, E. Branchini, A. Cappi, J. Coupon, O. Cucciati, I. Davidzon, G. De Lucia, S. de la Torre, P. Franzetti, M. Fumana, L. Guzzo, O. Ilbert, J. Krywult, V. Le Brun, O. Le Fevre, D. Maccagni, F. Marulli, H. J. McCracken, L. Paioro, M. Polletta, H. Schlagenhaufer, L. A. M. Tasca, R. Tojeiro, D. Vergani, A. Zanichelli, A. Burden, C. Di Porto, A. Marchetti, C. Marinoni, Y. Mellier, L. Moscardini, R. C. Nichol, J. A. Peacock, W. J. Percival, S. Phleps, M. Wolk, G. Zamorani

Research output: Contribution to journal › Article › peer-review

Abstract

Aims. The aim of this work is to develop a comprehensive method for classifying sources in large sky surveys and to apply the techniques to the VIMOS Public Extragalactic Redshift Survey (VIPERS). Using the optical (u∗, g′, r′, i′) and near-infrared (NIR) data (z′, Ks), we develop a classifier, based on broad-band photometry, for identifying stars, active galactic nuclei (AGNs), and galaxies, thereby improving the purity of the VIPERS sample.
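
Six photometric bands yield five independent colours, which matches the 5-dimensional colour space quoted in the Results. The abstract does not say which colours were used, so the minimal Python sketch below assumes the common choice of adjacent-band differences. The band keys, the function name adjacent_colours, and the example magnitudes are hypothetical and are not taken from VIPERS.

```python
# Band order from blue to red; keys stand for u*, g', r', i', z', Ks.
BANDS = ["u", "g", "r", "i", "z", "Ks"]


def adjacent_colours(mags):
    """Return the five colours u-g, g-r, r-i, i-z, z-Ks from six magnitudes.

    Assumption: adjacent-band differences; the abstract does not specify
    the colours actually used.
    """
    return [mags[a] - mags[b] for a, b in zip(BANDS, BANDS[1:])]


# Hypothetical magnitudes for a single source (illustration only).
example = {"u": 24.0, "g": 23.5, "r": 22.75, "i": 22.25, "z": 22.0, "Ks": 21.0}
colours = adjacent_colours(example)
print(colours)
```

Each source then becomes one point in colour space, which is the kind of feature vector a classifier can be trained on.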

Methods. Support vector machine (SVM) supervised learning algorithms allow the automatic classification of objects into two or more classes based on a multidimensional parameter space. In this work, we tailored the SVM to classifying stars, AGNs, and galaxies and applied this classification to the VIPERS data. We trained the SVM using spectroscopically confirmed sources from the VIPERS and VVDS surveys.
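
To make the idea of supervised SVM classification into three classes concrete, here is a minimal, dependency-free Python sketch. It is not the VIPERS pipeline: the abstract does not state the kernel, software, or hyperparameters, so this example assumes a linear SVM trained by batch sub-gradient descent on the hinge loss, combined with a one-vs-rest scheme. The function names, parameter values, and the two-dimensional training points are all invented for illustration.

```python
# Toy illustration only: a linear SVM trained by batch sub-gradient descent,
# extended to three classes with a one-vs-rest scheme. All data are invented.

def decision(model, x):
    """Signed score w.x + b; positive means 'belongs to this class'."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.05, epochs=500):
    """Minimise lam/2 * |w|^2 + mean(hinge loss) for labels y in {+1, -1}."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * decision((w, b), xi) < 1:  # inside margin or misclassified
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def train_one_vs_rest(X, labels):
    """Train one binary SVM per class (that class versus all others)."""
    models = {}
    for cls in sorted(set(labels)):
        y = [1 if lab == cls else -1 for lab in labels]
        models[cls] = train_linear_svm(X, y)
    return models


def predict(models, x):
    """Assign the class whose binary SVM gives the highest score."""
    return max(models, key=lambda cls: decision(models[cls], x))


# Hypothetical, hand-made training set: three well-separated clusters in a
# two-dimensional feature space standing in for (standardised) colours.
centres = {"galaxy": (2.0, 2.0), "star": (-2.0, 2.0), "AGN": (0.0, -2.0)}
offsets = [(0.0, 0.0), (0.3, 0.3), (0.3, -0.3), (-0.3, 0.3), (-0.3, -0.3)]
X, labels = [], []
for cls, (cx, cy) in centres.items():
    for dx, dy in offsets:
        X.append((cx + dx, cy + dy))
        labels.append(cls)

models = train_one_vs_rest(X, labels)
print(predict(models, (1.8, 2.4)))  # a point near the "galaxy" cluster
```

In a real application the training labels would come from spectroscopically confirmed sources, as described above, and a non-linear kernel from an established SVM library would likely replace this hand-written linear trainer.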

Results. We tested two SVM classifiers and concluded that including NIR data can significantly improve the efficiency of the classifier. The self-check of the best optical + NIR classifier has shown 97% accuracy in the classification of galaxies, 97% for stars, and 95% for AGNs in the 5-dimensional colour space. In the test of VIPERS sources with 99% redshift confidence, the classifier gives an accuracy equal to 94% for galaxies, 93% for stars, and 82% for AGNs. The method was applied to sources with low-quality spectra to verify their classification, hence increasing the security of measurements for almost 4900 objects.
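
Per-class figures such as those above can be computed by comparing predicted labels with spectroscopic ones. The sketch below assumes that "accuracy" for a class means the fraction of objects truly in that class that the classifier assigns to it; the abstract does not define the term, and the labels in the example are invented, not VIPERS results.

```python
from collections import Counter


def per_class_accuracy(true_labels, predicted_labels):
    """Fraction of objects of each true class that received the same label.

    Assumption: this per-class definition is used only for illustration.
    """
    total = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {cls: correct[cls] / total[cls] for cls in total}


# Hypothetical labels for eight sources (illustration only).
true_labels = ["galaxy"] * 4 + ["star"] * 2 + ["AGN"] * 2
predicted = ["galaxy", "galaxy", "galaxy", "star", "star", "star", "AGN", "galaxy"]

accuracy = per_class_accuracy(true_labels, predicted)
print(accuracy)
```

Reporting the score per class, rather than one overall number, keeps a rare class such as AGNs from being hidden by the much larger galaxy sample.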

Conclusions. We conclude that the SVM algorithm trained on a carefully selected sample of galaxies, AGNs, and stars outperforms simple colour–colour selection methods and can be regarded as a very efficient classification method particularly suitable for modern large surveys.

Original language: English
Article number: A16
Pages (from-to): 1-16
Journal: Astronomy and Astrophysics
Volume: 557
Publication status: Published - Sep 2013

Keywords

  • methods: data analysis
  • methods: statistical
  • surveys
  • galaxies: fundamental parameters
  • stars: fundamental parameters
  • cosmology: observations
