Abstract
Multi-view and ensemble clustering methods have received considerable attention for exploiting multiple features of data. However, each approach has its own limitations. Specifically, the performance of multi-view clustering may degrade because of conflicts between heterogeneous features, while ensemble clustering relies heavily on the quality of the basic clusterings, since it discovers the final clustering partition without considering the original feature structures of the source data. In this study, we propose a novel clustering scheme called synergetic information bottleneck (SIB) for joint multi-view and ensemble clustering. First, the proposed SIB uses multiple original features to characterize data information from different views while exploiting the basic clusterings to relieve the conflict between heterogeneous features. Second, SIB generally formulates the problem of joint multi-view and ensemble clustering as a mutual information maximization problem, in which the relatedness between the original features and the auxiliary basic clusterings is maximally preserved with respect to the final clustering partition. Finally, to optimize the objective function of SIB, a novel “draw-and-merge” optimization method is proposed. In addition, we prove that this optimization method ensures that the objective function of SIB converges to a stable optimum in a finite number of iterations. Extensive experiments conducted on several practical tasks demonstrate that SIB outperforms state-of-the-art multi-view and ensemble clustering methods.
Original language | English |
---|---|
Journal | Information Fusion |
Early online date | 9 Oct 2019 |
Publication status | Early online - 9 Oct 2019 |
Keywords
- RCUK
- EPSRC
- EP/N025849/1