SCDL: Sketch Causal Disentangled Learning for Sketch-Based 3D Shape Retrieval

Published: 2025, Last Modified: 24 Feb 2026
Venue: IEEE Trans. Circuits Syst. Video Technol. 2025
License: CC BY-SA 4.0
Abstract: Sketch-based 3D shape retrieval (SBSR) has been a challenging task for decades, and it depends crucially on aligning the semantic attributes shared by sketches and 3D shapes. Previous efforts mainly aimed to create a common embedding space that bridges the domain gap. However, the subjective and abstract nature of sketches acts as a confounder and can degrade the learning of sketch-to-shape matching. To address this issue, we propose sketch causal disentangled learning for SBSR, named SCDL, which for the first time introduces causal intervention to explicitly disentangle sketches into an inherent shared semantic part and confounders unrelated to classification (styles, abstraction levels, etc.). Specifically, we construct a structural causal model (SCM) in the sketch branch under a dual variational autoencoder (VAE) architecture to alleviate the negative impact of confounders by learning the semantic attributes in the latent variable space. Next, we adopt a learning strategy on the separated semantic latent variables to construct a shared semantic embedding space that makes cross-modal features of the same class more similar, effectively alleviating cross-modality discrepancies and establishing a new state of the art on three benchmarks. Comprehensive experimental results, ablation studies, and visualizations validate the effectiveness of our approach.
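To make the idea of separating a VAE latent into a semantic part and a confounder part more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It shows only generic VAE building blocks (the reparameterization trick and the KL term against a standard normal prior) plus a hypothetical index-based split of the latent vector. The function names, the `semantic_dim` parameter, and the choice to split by index are all assumptions made for illustration; the abstract does not specify how SCDL performs the split or the causal intervention, and neither is modeled here.

```python
import math
import random


def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), one value per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, logvar))


def split_latent(z, semantic_dim):
    """Hypothetical split (an assumption, not from the paper): the first
    `semantic_dim` entries are treated as the shared semantic part and the
    remainder as confounders such as style or abstraction level."""
    return z[:semantic_dim], z[semantic_dim:]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


if __name__ == "__main__":
    rng = random.Random(0)
    # Placeholder encoder outputs for one sketch and one 3D shape.
    sketch_mu, sketch_logvar = [0.5, -0.2, 0.1, 0.9], [0.0, 0.0, -1.0, -1.0]
    shape_mu, shape_logvar = [0.4, -0.1], [0.0, 0.0]

    z_sketch = reparameterize(sketch_mu, sketch_logvar, rng)
    z_shape = reparameterize(shape_mu, shape_logvar, rng)

    # Only the semantic part of the sketch latent is compared with the shape.
    semantic, confounder = split_latent(z_sketch, semantic_dim=2)
    print("semantic similarity:", cosine_similarity(semantic, z_shape))
    print("sketch KL term:", kl_to_standard_normal(sketch_mu, sketch_logvar))
```

In this toy setup only the semantic slice of the sketch latent enters the cross-modal comparison, which mirrors the abstract's stated goal of matching on shared semantics while keeping style-like factors out of the retrieval space.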