Abstract
Object categorization in images is fundamental to various industrial applications, such as automated visual inspection, fast image retrieval, and intelligent surveillance. Most existing methods treat visual features (e.g., the scale-invariant feature transform, SIFT) as the content information of objects and regard image tags as their contextual information. However, image tags can hardly be acquired in completely unsupervised settings, especially when the image collection is too large to be annotated. In this work, we propose a novel and effective method called the contextual multivariate information bottleneck (CMIB) to discover object categories in totally unlabeled images. Instead of treating image tags as the object’s context, CMIB adopts one feature representation of the images to characterize the object’s content information and regards the auxiliary clusterings obtained from multiple other related features as its visual contexts. The CMIB framework borrows the idea of data compression for object category discovery: it aims to compress the source image collection as much as possible while maximally preserving the correlative information between the content and the visual contexts. Specifically, two Bayesian networks are built to characterize the relationships between data compression and information preservation. Moreover, a sequential information-theoretic optimization is proposed to ensure the convergence of the CMIB objective function. Extensive experiments on five real-world image data sets show that the proposed method significantly outperforms state-of-the-art baselines.
Original language | English |
---|---|
Article number | 0 |
Pages (from-to) | 3974-3986 |
Number of pages | 13 |
Journal | IEEE Transactions on Industrial Informatics |
Volume | 16 |
Issue number | 6 |
Early online date | 3 Sept 2019 |
Publication status | Published - 1 Jun 2020 |