Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397

Issue No. 9, 2021

DOI: 10.17587/prin.12.459-469
2D-to-3D Projection for Monocular and Multi-View 3D Multi-class Object Detection in Indoor Scenes
D. D. Rukhovich, daniel-rukhovich@yandex.ru, Faculty of Mechanics and Mathematics, Moscow State University, Moscow, 119234, Russian Federation
Corresponding author: Rukhovich Danila D., Postgraduate Student, Faculty of Mechanics and Mathematics, Moscow State University, Moscow, 119234, Russian Federation, E-mail: daniel-rukhovich@yandex.ru
Received on August 24, 2021
Accepted on September 17, 2021

In this paper, we propose a novel method of joint 3D object detection and room layout estimation. The proposed method surpasses all existing methods of 3D object detection from monocular images on the indoor SUN RGB-D dataset and shows competitive results on the ScanNet dataset in multi-view mode. Both datasets were collected in a variety of residential, administrative, educational and industrial spaces, and together they cover almost all possible use cases. We are also the first to formulate and solve the problem of multi-class 3D object detection from multi-view inputs in indoor scenes. The proposed method can be integrated into the control systems of mobile robots. The results of this study can be used for navigation, path planning, grasping and manipulating scene objects, and semantic scene mapping.

Keywords: machine learning, deep learning, 3D object detection
pp. 459–469
For citation:
Rukhovich D. D. 2D-to-3D Projection for Monocular and Multi-View 3D Multi-class Object Detection in Indoor Scenes, Programmnaya Ingeneria, 2021, vol. 12, no. 9, pp. 459–469.