Keywords: 3D Intention Grounding; Causal Reasoning; Chain of Thought
Abstract: Accurately matching human intentions in 3D space is an important goal of artificial intelligence. Recently, 3D Intention Grounding (3D-IG) has been proposed, which aims to localize the target 3D objects that match a given natural-language intent.
Compared with traditional visual grounding, where the target is stated explicitly, an intent is abstract and difficult to interpret, which makes detecting the corresponding 3D objects highly challenging.
The task therefore requires the model to infer the functional attributes of objects from a non-descriptive intent and then precisely align those attributes with object features.
In doing so, existing methods rely on implicit matching, which often suffers from logical gaps. As a result, they fail to establish a clear and interpretable causal link between intention and object, which ultimately lowers the robustness and generalizability of the model.
To tackle these challenges, we propose a new method, Chain-of-Causal Reasoning, which performs intent parsing and grounding along a causal chain.
Specifically, the method decomposes complex intentions step by step into functional requirements, explicitly prioritizing and clarifying latent needs. This forms a causal chain from abstract intentions to object attributes and improves the accuracy of intent understanding.
Based on this causal chain, we construct an explicit causal graph to establish clear logical relationships between functional requirements and object attributes. Finally, a causal–visual feature alignment mechanism is introduced, which aligns causal features with the geometric–semantic features of 3D point clouds, enabling bidirectional verification between semantic reasoning and visual evidence.
Extensive experiments on 3D Intention Grounding and 3D Visual Grounding tasks demonstrate that our method effectively enhances intent understanding and improves object localization.
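
To make the idea of a causal chain from intention to object attributes more concrete, the following toy sketch may help. It is not the authors' implementation: the lookup tables, the attribute vocabulary, the function names `causal_feature` and `ground`, and the hand-written object vectors are all hypothetical stand-ins. In particular, the object vectors take the place of the geometric-semantic point-cloud features described in the abstract, and cosine similarity is only an assumed substitute for the causal-visual alignment mechanism.

```python
import math

# Hypothetical causal chain: intent -> functional requirements -> object attributes.
# Every entry below is illustrative and not taken from the paper.
INTENT_TO_REQUIREMENTS = {
    "i want to rest my legs": ["support body weight", "allow sitting"],
}
REQUIREMENT_TO_ATTRIBUTES = {
    "support body weight": ["rigid", "flat_surface"],
    "allow sitting": ["flat_surface", "knee_height"],
}
ATTRIBUTE_VOCAB = ["rigid", "flat_surface", "knee_height", "emits_light"]


def causal_feature(intent):
    """Walk the chain and return attribute counts ordered as ATTRIBUTE_VOCAB."""
    counts = {attr: 0.0 for attr in ATTRIBUTE_VOCAB}
    for requirement in INTENT_TO_REQUIREMENTS[intent.lower()]:
        for attr in REQUIREMENT_TO_ATTRIBUTES[requirement]:
            counts[attr] += 1.0
    return [counts[attr] for attr in ATTRIBUTE_VOCAB]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def ground(intent, objects):
    """Return the name of the object whose attributes best match the intent."""
    target = causal_feature(intent)
    return max(objects, key=lambda name: cosine(target, objects[name]))


# Hand-written attribute vectors standing in for learned 3D object features.
scene = {
    "chair": [1.0, 1.0, 1.0, 0.0],
    "lamp": [1.0, 0.0, 0.0, 1.0],
}
print(ground("I want to rest my legs", scene))
```

In this sketch the intent never names an object; the chair is selected only because the attributes implied by the inferred requirements overlap with it more than with the lamp.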
Primary Area: causal reasoning
Submission Number: 10264