Abstract: Recently, indoor 3D object detection has shown impressive progress. However, these improvements have come at the cost of increased memory consumption and longer inference times, making it difficult to apply these methods in practical scenarios. To address this issue, knowledge distillation has emerged as a promising technique for model acceleration. In this paper, we propose the VRDistill framework, the first knowledge distillation framework designed for efficient indoor 3D object detection. Our VRDistill framework includes a refinement module and a soft foreground mask operation to enhance the quality of the distillation. The refinement module utilizes trainable layers to improve the quality of the teacher's votes, while the soft foreground mask operation focuses on foreground votes, further enhancing the distillation performance. Comprehensive experiments on the ScanNet and SUN-RGBD datasets demonstrate the effectiveness and generalization ability of our VRDistill framework.
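Illustrative sketch (not the authors' implementation): the abstract does not define the soft foreground mask or the refinement module, so the code below shows only one plausible way a foreground-weighted vote distillation loss could be computed. The function names, the Gaussian form of the mask, the `sigma` parameter, and the use of object centers are all assumptions made for illustration. The teacher votes passed in are assumed to have already gone through whatever refinement the framework applies.

```python
import math


def soft_foreground_mask(votes, centers, sigma=0.5):
    """Hypothetical soft mask: one weight in (0, 1] per vote.

    ASSUMPTION: the weight decays as a Gaussian of the distance from the
    vote to its nearest object center. The paper's actual mask may differ.
    """
    weights = []
    for v in votes:
        # Squared distance from this vote to the closest object center.
        d2 = min(sum((a - b) ** 2 for a, b in zip(v, c)) for c in centers)
        weights.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    return weights


def masked_vote_distillation_loss(student_votes, teacher_votes, mask, eps=1e-8):
    """Mask-weighted mean squared error between student and teacher votes.

    ASSUMPTION: an L2 loss normalized by the total mask weight; this is a
    generic choice, not a formula taken from the paper.
    """
    weighted = 0.0
    for s, t, w in zip(student_votes, teacher_votes, mask):
        weighted += w * sum((a - b) ** 2 for a, b in zip(s, t))
    return weighted / (sum(mask) + eps)


if __name__ == "__main__":
    centers = [[0.0, 0.0, 0.0]]
    teacher = [[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]]  # one foreground, one background vote
    student = [[1.0, 0.0, 0.0], [10.0, 0.0, 0.0]]  # error only on the foreground vote
    mask = soft_foreground_mask(teacher, centers)
    print(masked_vote_distillation_loss(student, teacher, mask))
```

Under these assumptions, an error on a vote near an object center contributes almost its full squared distance to the loss, while the same error on a distant background vote is weighted close to zero.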
Primary Subject Area: [Content] Media Interpretation
Secondary Subject Area: [Experience] Interactions and Quality of Experience
Relevance To Conference: We employ knowledge distillation techniques to process media-related information in the form of 3D point clouds, which can pave the way for novel approaches to interpreting or creating multimedia content. Our proposed knowledge distillation methods aim to advance the understanding of multimedia quality of experience through lightweight modeling, enhancing interactions and overall user satisfaction.
Supplementary Material: zip
Submission Number: 2661