TY - JOUR
T1 - Towards reliable object representation via sparse directional patches and spatial center cues
AU - Jian, Muwei
AU - Yu, Hui
N1 - Replace proof with VoR once published
PY - 2023
Y1 - 2023/08/10
N2 - In the process of image understanding, the human visual system (HVS) performs multiscale analysis on various objects. The HVS primarily focuses on marginally conspicuous image patches located within or around distinct objects rather than scanning the image pixel by pixel. Inspired by this mechanism, we describe and exploit multiscale decomposition-based patch detection models for automatic visual feature representation and object localization in images. Our investigation into mimicking and modeling the HVS to capture conspicuous sparse patches and their spatial distribution cues contributes to the automatic comprehension and characterization of images by machines. This study demonstrates that a sparse patch-based visual representation with spatial center cues supports object localization and understanding that is intrinsically tolerant to variations in spatial position, resolution, and chrominance. This has significant implications for many vision-based automatic object grasping and perception applications, such as robotics, human‒machine interaction, and unmanned aerial vehicles (UAVs).
AB - In the process of image understanding, the human visual system (HVS) performs multiscale analysis on various objects. The HVS primarily focuses on marginally conspicuous image patches located within or around distinct objects rather than scanning the image pixel by pixel. Inspired by this mechanism, we describe and exploit multiscale decomposition-based patch detection models for automatic visual feature representation and object localization in images. Our investigation into mimicking and modeling the HVS to capture conspicuous sparse patches and their spatial distribution cues contributes to the automatic comprehension and characterization of images by machines. This study demonstrates that a sparse patch-based visual representation with spatial center cues supports object localization and understanding that is intrinsically tolerant to variations in spatial position, resolution, and chrominance. This has significant implications for many vision-based automatic object grasping and perception applications, such as robotics, human‒machine interaction, and unmanned aerial vehicles (UAVs).
KW - image patches
KW - multiscale analysis
KW - object representation
KW - Shearlet transform
KW - visual perception
UR - http://www.scopus.com/inward/record.url?scp=85168828378&partnerID=8YFLogxK
U2 - 10.1016/j.fmre.2023.08.001
DO - 10.1016/j.fmre.2023.08.001
M3 - Article
AN - SCOPUS:85168828378
SN - 2096-9457
JO - Fundamental Research
JF - Fundamental Research
ER -