Multi-modal feature fusion for 3D object detection in the production workshop

Published: 01 Jan 2022, Last Modified: 05 Mar 2025. Appl. Soft Comput. 2022. CC BY-SA 4.0
Abstract: Highlights
• A production workshop object dataset (PWOD) with RGB and depth image samples is established.
• A 2D object detection training method based on an improved YOLOv3 and transfer learning improves the detection of multiscale objects.
• RGB-D object saliency detection is used to obtain the object pixels in the RGB image. It locates the object and extracts its pixels more accurately than deep learning semantic segmentation methods, especially under occlusion.
• Multi-modal feature fusion of 2D RGB features with the density distribution features of an object's frustum point cloud generates a 3D bounding box.
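The highlights mention the frustum point cloud of an object but do not say how it is built. A minimal sketch of one common construction follows, assuming a pinhole camera model and a depth image aligned with the RGB image: depth pixels inside a 2D detection box (optionally restricted by a saliency mask) are back-projected into camera coordinates. The function name `frustum_points`, its parameters, and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical and are not taken from the paper.

```python
import numpy as np

def frustum_points(depth, box, fx, fy, cx, cy, mask=None):
    """Back-project depth pixels inside a 2D box into 3D camera coordinates.

    depth : (H, W) array of depth values; 0 marks a missing measurement
    box   : (x1, y1, x2, y2) pixel box, with x2 and y2 exclusive
    mask  : optional (H, W) boolean array of object pixels (assumed input,
            e.g. the output of a saliency detector)
    Returns an (N, 3) array of [x, y, z] points.
    """
    x1, y1, x2, y2 = box
    # Pixel coordinate grids covering the box (rows = v, columns = u).
    vs, us = np.mgrid[y1:y2, x1:x2]
    z = depth[y1:y2, x1:x2].astype(np.float64)

    # Keep only pixels with a valid depth that also lie on the object.
    valid = z > 0
    if mask is not None:
        valid &= mask[y1:y2, x1:x2].astype(bool)

    u, v, z = us[valid], vs[valid], z[valid]
    # Pinhole back-projection: pixel (u, v) at depth z -> camera-frame (x, y, z).
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)
```

Passing the 2D detector's box and the saliency mask together keeps background and occluder pixels out of the frustum, which is the role the highlights assign to the saliency step.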