Un-VDNet: unsupervised network for visual odometry and depth estimation

Xuyang Meng, Chunxiao Fan, Yue Ming, Yuan Shen, Hui Yu

    Research output: Contribution to journal › Article › peer-review


    Abstract

    Monocular visual odometry and depth estimation play an important role in augmented reality and robotics applications. Recently, deep learning technologies have been widely used in these areas. However, most existing works rely on supervised learning, which requires large amounts of labeled data and assumes that the scene is static. In this paper, we propose a novel framework, called Un-VDNet, based on unsupervised convolutional neural networks (CNNs), to predict camera ego-motion and depth maps from image sequences. The framework includes three sub-networks (PoseNet, DepthNet, and FlowNet) and learns temporal motion and spatial association information in an end-to-end network. Specifically, we propose a novel pose consistency loss to penalize the translation and rotation drift of the pose estimated by the PoseNet. Furthermore, a novel geometric consistency loss between the structure flow and the scene flow learned by the FlowNet is proposed to handle dynamic objects in real-world scenes; it is combined with spatial and temporal photometric consistency constraints. Extensive experiments on the KITTI and TUM datasets demonstrate that our proposed Un-VDNet outperforms state-of-the-art methods for visual odometry and depth estimation when dealing with dynamic objects in outdoor and indoor scenes.
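    The abstract describes an overall objective combining photometric consistency, pose consistency, and a geometric (structure-flow vs. scene-flow) consistency term. The sketch below illustrates how such a combined unsupervised loss might be assembled; it is not the authors' implementation, and all function names and loss weights are assumptions for illustration only.

    ```python
    # Illustrative sketch (NOT the paper's code) of a combined unsupervised
    # loss of the kind the abstract describes. Weights and names are hypothetical.
    import numpy as np

    def photometric_loss(target, warped):
        # L1 photometric error between the target frame and the source frame
        # warped into it using predicted depth and ego-motion.
        return np.mean(np.abs(target - warped))

    def pose_consistency_loss(t_fwd, r_fwd, t_bwd, r_bwd):
        # Penalize translation and rotation drift: the forward motion and the
        # backward (inverse) motion between two frames should cancel out.
        return np.linalg.norm(np.asarray(t_fwd) + np.asarray(t_bwd)) + \
               np.linalg.norm(np.asarray(r_fwd) + np.asarray(r_bwd))

    def geometric_consistency_loss(structure_flow, scene_flow):
        # Penalize disagreement between rigid structure flow (induced by depth
        # and ego-motion) and the full scene flow; large residuals indicate
        # dynamic objects that violate the static-scene assumption.
        return np.mean(np.abs(structure_flow - scene_flow))

    def total_loss(target, warped, t_fwd, r_fwd, t_bwd, r_bwd,
                   structure_flow, scene_flow,
                   w_photo=1.0, w_pose=0.1, w_geo=0.5):  # hypothetical weights
        return (w_photo * photometric_loss(target, warped)
                + w_pose * pose_consistency_loss(t_fwd, r_fwd, t_bwd, r_bwd)
                + w_geo * geometric_consistency_loss(structure_flow, scene_flow))
    ```

    In a real training loop, each term would be computed on network outputs (PoseNet poses, DepthNet depths, FlowNet flows) and backpropagated jointly, end to end.
    
    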
    Original language: English
    Article number: 063015
    Number of pages: 11
    Journal: Journal of Electronic Imaging
    Volume: 28
    Issue number: 6
    DOIs
    Publication status: Published - 26 Dec 2019

