TY - JOUR
T1 - A novel hybrid 2.5D map representation method enabling 3D reconstruction of semantic objects in expansive indoor environments
AU - Guo, Shuai
AU - Hu, Yazhou
AU - Xu, Mingliang
AU - Liu, Jinguo
AU - Ju, Zhaojie
PY - 2024/12/16
Y1 - 2024/12/16
AB - This paper presents a novel method for creating a hybrid 2.5D semi-semantic map, merging a 2D geometric map with a sparse 3D object map, specifically designed for expansive indoor environments. The primary motivation behind this method is to tackle the issue of temporal and viewpoint discontinuity in RGB-D observations of individual objects within such environments. Notably, current RGB-D SLAM research mainly focuses on small-scale scenarios, often overlooking this specific issue. To address this challenge, our approach represents each object with a set of object-specific keyframes and optimizes the spatial relationships between these keyframes in a deferred offline mode. Another objective is to alleviate the accumulation of trajectory drift, which can adversely impact object association/reidentification in expansive environments. To achieve this, we leverage a 2D pose graph SLAM module that indirectly provides initial poses of the RGB-D sensor, thereby facilitating the construction of the sparse 3D object map. To address the dimensional disparity between the two sub-maps (2D vs 3D), we implement a joint optimization strategy that refines the hybrid map so that the two sub-maps reach compatible accuracy. The effectiveness of our proposed method is validated through experiments conducted in real-world environments, and a time efficiency analysis demonstrates the potential of our algorithm to operate in real time. Note to Practitioners: The motivation of this paper is to create 3D object maps for expansive indoor environments. The challenge arises from the fact that most existing RGB-D SLAM algorithms are designed for small-scale indoor scenarios and are poorly suited to expansive environments with empty areas. In real expansive environments, the short sensing range of RGB-D sensors increases the risk of failing to observe objects continuously in empty areas, potentially causing the robot to lose its way. This has two consequences: 1) discontinuity in the viewpoints of individual objects, which challenges the online reconstruction of 3D objects; 2) trajectory drift that accumulates while the robot is lost, which complicates the reidentification of an object upon its reappearance. Our approach addresses these issues by concurrently generating a 2D geometric map, effectively mitigating trajectory drift. Additionally, we represent each object with a set of object-specific keyframes, efficiently handling viewpoint discontinuity in object observations. Experimental results demonstrate the applicability of our algorithm to real-world environments beyond room-scale scenarios, closely aligning with the challenges encountered in practical applications. The analysis also indicates the potential for real-time operation, rendering our SLAM algorithm advantageous for practical deployment in robotic systems. An additional benefit is that, apart from the 3D object map, our method creates a 2D geometric map, supporting most basic navigation tasks for industrial robots. In future research, we will speed up the time-consuming steps and further enhance the efficiency of the algorithm.
KW - 3D object detection
KW - 3D reconstruction
KW - Accuracy
KW - Indoor environment
KW - Neural radiance field
KW - Object SLAM
KW - Optimization
KW - Robots
KW - Semantics
KW - Sensors
KW - Simultaneous localization and mapping
KW - Three-dimensional displays
KW - Trajectory
KW - Hybrid map
UR - https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=webofscienceportsmouth2022&SrcAuth=WosAPI&KeyUT=WOS:001381453800001&DestLinkType=FullRecord&DestApp=WOS_CPL
U2 - 10.1109/TASE.2024.3510420
DO - 10.1109/TASE.2024.3510420
M3 - Article
SN - 1545-5955
JO - IEEE Transactions on Automation Science and Engineering
JF - IEEE Transactions on Automation Science and Engineering
ER -