TY - JOUR
T1 - Cluster ensemble selection and consensus clustering: a multi-objective optimization approach
AU - Aktaş, Dilay
AU - Lokman, Banu
AU - İnkaya, Tülin
AU - Dejaegere, Gilles
N1 - 24 month embargo - Elsevier - may be Gold OA via agreement
PY - 2023/10/25
Y1 - 2023/10/25
AB - Cluster ensembles have emerged as a powerful tool to obtain clusters of data points by combining a library of clustering solutions into a consensus solution. In this paper, we address the cluster ensemble selection problem and design a multi-objective optimization-based solution framework to produce consensus solutions. Given a library of clustering solutions, we first design a preprocessing procedure that measures the agreement of each clustering solution with the other solutions and eliminates the ones that may mislead the process. We then develop a multi-objective optimization algorithm that selects representative clustering solutions from the preprocessed library with respect to size, coverage, and diversity criteria and combines them into a single consensus solution, for which the true number of clusters is assumed to be unknown. We conduct experiments on different benchmark data sets. The results show that our approach yields more accurate consensus solutions compared to full-ensemble and the existing approaches for most data sets. We also present an application on the customer segmentation problem, where our approach is used to segment customers and to find a consensus solution for each segment, simultaneously.
KW - Multiple objective programming
KW - cluster ensembles
KW - ensemble selection
KW - consensus clustering
M3 - Article
SN - 0377-2217
JO - European Journal of Operational Research
JF - European Journal of Operational Research
ER -