TY - JOUR
T1 - A biologically-inspired sparse self-representation approach for projected fuzzy double c-means clustering
AU - Tian, Xin
AU - Sun, Cun
AU - Sun, Ying
AU - Song, Yan
AU - Wei, Guoliang
AU - Yu, Hui
AU - Li, Ming
PY - 2023/8/15
Y1 - 2023/8/15
N2 - Data redundancy is frequently encountered in biological data. Locality preserving projection (LPP) is a dimensionality reduction approach, inspired by biological processes, that mitigates data redundancy while preserving the essential geometry of the data. Its application can contribute substantially to fuzzy c-means (FCM) clustering. However, existing locality-preserving FCM clustering methods that combine LPP with FCM focus only on local information, which may lead to some conservatism. A novel FCM clustering method, namely, projected fuzzy double c-means clustering using sparse self-representation (PFD SSR), is developed in this paper. The main idea of PFD SSR is three-fold: (1) inspired by biological processes, a so-called sparse self-representation (SSR) method is employed, so that the global data distribution is exploited to enhance clustering performance; (2) LPP is applied to both the raw data and the dictionary matrix obtained by SSR, which greatly reduces the feature dimension while preserving the intrinsic data distribution. In addition, regularization terms for these two projections are introduced into the FCM objective function, which helps reduce the risk of being trapped in local optima during model training; and (3) the alternating direction technique is applied to learn the model. Experimental results on 11 datasets, including 6 biological datasets, demonstrate that the proposed method outperforms state-of-the-art clustering methods. The proposed subspace clustering method handles high-dimensional data well, especially biological data.
AB - Data redundancy is frequently encountered in biological data. Locality preserving projection (LPP) is a dimensionality reduction approach, inspired by biological processes, that mitigates data redundancy while preserving the essential geometry of the data. Its application can contribute substantially to fuzzy c-means (FCM) clustering. However, existing locality-preserving FCM clustering methods that combine LPP with FCM focus only on local information, which may lead to some conservatism. A novel FCM clustering method, namely, projected fuzzy double c-means clustering using sparse self-representation (PFD SSR), is developed in this paper. The main idea of PFD SSR is three-fold: (1) inspired by biological processes, a so-called sparse self-representation (SSR) method is employed, so that the global data distribution is exploited to enhance clustering performance; (2) LPP is applied to both the raw data and the dictionary matrix obtained by SSR, which greatly reduces the feature dimension while preserving the intrinsic data distribution. In addition, regularization terms for these two projections are introduced into the FCM objective function, which helps reduce the risk of being trapped in local optima during model training; and (3) the alternating direction technique is applied to learn the model. Experimental results on 11 datasets, including 6 biological datasets, demonstrate that the proposed method outperforms state-of-the-art clustering methods. The proposed subspace clustering method handles high-dimensional data well, especially biological data.
KW - Dimension reduction
KW - Fuzzy c-means clustering
KW - Locality preserving projection
KW - Regularization
KW - Sparse self-representation
UR - http://www.scopus.com/inward/record.url?scp=85168082749&partnerID=8YFLogxK
U2 - 10.1007/s12559-023-10185-w
DO - 10.1007/s12559-023-10185-w
M3 - Article
AN - SCOPUS:85168082749
SN - 1866-9956
JO - Cognitive Computation
JF - Cognitive Computation
ER -