CORNet: Context-Based Ordinal Regression Network for monocular depth estimation

Xuyang Meng, Chunxiao Fan, Yue Ming*, Hui Yu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Monocular depth estimation, one of the fundamental tasks of computer vision, plays a crucial role in three-dimensional (3D) scene understanding and perception. Deep learning methods usually recover monocular depth maps by continuous regression, minimizing the error between the ground-truth depth and the predicted depth. However, fine depth features may not be fully captured through layer-by-layer encoding, which tends to produce depth maps with low spatial resolution and insufficient detail. Furthermore, such regression usually converges slowly and yields unsatisfactory results. To tackle these issues, we propose a novel model, named context-based ordinal regression network (CORNet), which reconstructs monocular depth maps by ordinal regression with context information. First, we put forward a novel context-based encoder with a feature transformation (FT) module that learns context information and details from the inputs and outputs multi-scale feature maps. Then, we design a boundary enhancement module (BEM) with a spatial attention mechanism, applied after each feature fusion operation, which captures boundary features in the scene to enhance depth at borders. Finally, a feature optimization module (FOM) fuses and optimizes the multi-scale features and boundary features to strengthen depth learning. In addition, we introduce an ordinal weighted inference to predict depth maps from probabilities and discretization values. Experiments on two challenging datasets, KITTI and NYU Depth V2, demonstrate that the proposed CORNet estimates monocular depth maps effectively and captures geometric features better than existing methods.
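
The abstract does not spell out the ordinal weighted inference, so the following is only a minimal sketch of one plausible reading: the depth at a pixel is taken as the probability-weighted average of the discretization values. The log-spaced bins, the softmax normalization, and the function names `sid_bin_centers`, `softmax`, and `weighted_depth` are all assumptions made for illustration and are not taken from the paper.

```python
import math


def sid_bin_centers(d_min, d_max, num_bins):
    """Return log-spaced bin centers between d_min and d_max.

    Assumption: the abstract does not say which discretization CORNet
    uses; log spacing is chosen here only as a common option.
    """
    log_min, log_max = math.log(d_min), math.log(d_max)
    step = (log_max - log_min) / num_bins
    # Center of bin k in log space, mapped back to depth units.
    return [math.exp(log_min + (k + 0.5) * step) for k in range(num_bins)]


def softmax(scores):
    """Turn raw per-bin scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_depth(probs, centers):
    """Probability-weighted average of the bin centers for one pixel."""
    if len(probs) != len(centers):
        raise ValueError("probs and centers must have the same length")
    total = sum(probs)
    return sum(p * c for p, c in zip(probs, centers)) / total


# Hypothetical single-pixel example: 4 bins over a 1 m to 80 m range.
centers = sid_bin_centers(1.0, 80.0, 4)
probs = softmax([0.1, 2.0, 0.3, -1.0])
depth = weighted_depth(probs, centers)
```

Under this reading, a weighted average yields a continuous depth value even though the network predicts over discrete bins, whereas picking the single most probable bin would restrict the output to the bin centers.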

Original language: English
Pages (from-to): 4841-4853
Number of pages: 13
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Issue number: 7
Early online date: 16 Nov 2021
Publication status: Published - 1 Jul 2022


  • context information
  • monocular depth estimation
  • ordinal regression
  • spatial attention

