README

This zip file contains two videos that illustrate qualitative results of 3DGraphLLM on the 3D referred object grounding task. The videos feature a flythrough of scene 0435 from the ScanNet dataset, along with the corresponding RGB observations and the reconstructed point cloud, in which the points corresponding to the text query are highlighted in color.

Contents:

Video 1: Demonstrates the performance of our method using ground-truth instance segmentation. The video shows how our approach identifies the referred object within a 3D scene based on natural language descriptions. The baseline approach, Chat3Dv2, cannot distinguish between two trash bins based on descriptions of their spatial relations with other objects.
Video 2: Provides additional qualitative results with Mask3D instance segmentation, showcasing the method's ability to work with noisy segmentation data. 3DGraphLLM correctly selects the lamp corresponding to the user query.

These videos illustrate how our method handles the 3D referred object grounding task with both ground-truth and predicted (Mask3D) instance segmentation.

For more details, please refer to our accompanying paper.