A novel hybrid 2.5D map representation method enabling 3D reconstruction of semantic objects in expansive indoor environments

Shuai Guo, Yazhou Hu, Mingliang Xu, Jinguo Liu, Zhaojie Ju

Research output: Contribution to journal › Article › peer-review

Abstract

This paper presents a novel method for creating a hybrid 2.5D semi-semantic map that merges a 2D geometric map with a sparse 3D object map, specifically designed for expansive indoor environments. The primary motivation behind this method is to tackle the temporal and viewpoint discontinuity of RGB-D observations of individual objects in such environments; current RGB-D SLAM research mainly focuses on small-scale scenarios and largely overlooks this issue. To address this challenge, our approach represents each object with a set of object-specific keyframes and optimizes the spatial relationships between these keyframes in a deferred offline mode. Another objective is to alleviate the accumulation of trajectory drift, which can adversely impact object association and reidentification in expansive environments. To achieve this, we leverage a 2D pose-graph SLAM module that indirectly provides initial poses of the RGB-D sensor, thereby facilitating the construction of the sparse 3D object map. To handle the dimensional disparity between the two sub-maps (2D vs. 3D), we implement a joint optimization strategy that refines the hybrid map and keeps the accuracy of the two sub-maps compatible. The effectiveness of the proposed method is validated through experiments conducted in real-world environments, and a time-efficiency analysis demonstrates the potential of the algorithm to operate in real time.

Note to Practitioners: The motivation of this paper is to create 3D object maps for expansive indoor environments. The challenge arises from the fact that most existing RGB-D SLAM algorithms are designed for small-scale indoor scenarios and are not well suited to expansive environments with empty areas. In such environments, the short range of RGB-D sensors increases the risk of failing to continuously observe objects across empty areas, potentially causing the robot to lose its way. This has two consequences: 1) discontinuity in the viewpoints of individual objects, which challenges the online reconstruction of 3D objects; and 2) trajectory drift accumulated while the robot is lost, which complicates the reidentification of an object upon its reappearance. Our approach addresses these issues by concurrently generating a 2D geometric map, effectively mitigating trajectory drift, and by employing a set of object-specific keyframes to represent each reconstructed object, efficiently handling viewpoint discontinuity in object observations. Experimental results demonstrate the applicability of the algorithm to real-world environments beyond room-scale scenarios, closely matching the challenges encountered in practical applications. The analysis also indicates the potential for real-time operation, making the SLAM algorithm advantageous for practical deployment on robotic systems. An additional benefit is that, apart from the 3D object map, our method creates a 2D geometric map that supports most basic navigation tasks for industrial robots. In future research, we will further streamline the time-consuming steps and enhance the efficiency of the algorithm.
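To make the structure of the hybrid map concrete, the following is a minimal sketch of how SE(2) robot poses from a 2D pose-graph module and sparse 3D object landmarks observed in object-specific keyframes could be refined jointly. This is not the authors' implementation: the data layout, the fixed sensor height, the scipy-based solver, and the toy measurements are all illustrative assumptions.

```python
"""Illustrative sketch (not the paper's code) of a hybrid 2.5D map:
SE(2) robot poses from a 2D pose graph plus sparse 3D object landmarks,
jointly refined with scipy's least-squares solver."""
import numpy as np
from scipy.optimize import least_squares

SENSOR_HEIGHT = 0.6  # assumed fixed height (m) of the RGB-D sensor above the 2D map plane

def se2_relative(pose_i, pose_j):
    """Relative SE(2) transform from pose_i to pose_j, each given as (x, y, theta)."""
    xi, yi, ti = pose_i
    xj, yj, tj = pose_j
    c, s = np.cos(ti), np.sin(ti)
    dx, dy = xj - xi, yj - yi
    return np.array([c * dx + s * dy, -s * dx + c * dy,
                     np.arctan2(np.sin(tj - ti), np.cos(tj - ti))])

def object_in_sensor(pose, obj_xyz):
    """Predict the object's 3D position in the sensor frame given an SE(2) robot pose."""
    x, y, t = pose
    c, s = np.cos(t), np.sin(t)
    dx, dy = obj_xyz[0] - x, obj_xyz[1] - y
    return np.array([c * dx + s * dy, -s * dx + c * dy, obj_xyz[2] - SENSOR_HEIGHT])

def residuals(params, n_poses, odom_edges, obj_obs):
    """Stack 2D pose-graph residuals and 3D object-observation residuals."""
    poses = params[:3 * n_poses].reshape(-1, 3)
    objects = params[3 * n_poses:].reshape(-1, 3)
    res = [poses[0]]                                   # prior anchoring the first pose at the origin
    for i, j, meas in odom_edges:                      # 2D pose-graph constraints
        res.append(se2_relative(poses[i], poses[j]) - meas)
    for i, k, meas in obj_obs:                         # object-specific keyframe observations
        res.append(object_in_sensor(poses[i], objects[k]) - meas)
    return np.concatenate(res)

# Toy data: three poses moving along x, one object observed from poses 0 and 2.
init_poses = np.array([[0.0, 0.0, 0.0], [1.0, 0.05, 0.0], [2.0, -0.05, 0.0]])
init_objects = np.array([[2.5, 1.0, 1.0]])
odom_edges = [(0, 1, np.array([1.0, 0.0, 0.0])), (1, 2, np.array([1.0, 0.0, 0.0]))]
obj_obs = [(0, 0, np.array([2.5, 1.0, 0.4])), (2, 0, np.array([0.5, 1.0, 0.4]))]

x0 = np.concatenate([init_poses.ravel(), init_objects.ravel()])
sol = least_squares(residuals, x0, args=(len(init_poses), odom_edges, obj_obs))
print("refined poses:\n", sol.x[:9].reshape(-1, 3))
print("refined object centroid:", sol.x[9:])
```

The sketch illustrates only the coupling idea behind the joint optimization: the 2D sub-map constrains the robot poses in the plane, while the sparse 3D objects are anchored to those poses through per-keyframe observations, so refining both together keeps the accuracy of the two sub-maps compatible.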
Original language: English
Number of pages: 15
Journal: IEEE Transactions on Automation Science and Engineering
Early online date: 16 Dec 2024
DOIs
Publication status: Early online - 16 Dec 2024

Keywords

  • 3D object detection
  • 3D reconstruction
  • Accuracy
  • Indoor environment
  • Neural radiance field
  • Object SLAM
  • Optimization
  • Robots
  • Semantics
  • Sensors
  • Simultaneous localization and mapping
  • Three-dimensional displays
  • Trajectory
  • Hybrid map
