Abstract: In this paper, we address data scarcity in crack segmentation, a key task in structural health monitoring. To tackle this, we introduce CrackGrowDiff, a framework for expanding crack datasets. Using a two-stage controllable generation process that combines a random walk algorithm with semantic diffusion models, our framework reduces the misalignment between synthetic and original data while enhancing data informativeness. We further ensure the quality and informativeness of the synthetic data through feature-space diversity, employing a pre-trained Variational Autoencoder (VAE) to select samples based on Kullback-Leibler (KL) divergence. Comparative experiments demonstrate CrackGrowDiff’s superiority over traditional data augmentation and GAN-based methods in addressing data scarcity in crack segmentation tasks. A demo and related code will be made public: https://huggingface.co/spaces/QinLei086/Two-stage-SDM-for-crack-dataset-expending
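
The abstract does not specify the random walk used in the first stage, so the following is only a minimal sketch of how a random walk might trace a crack-like binary mask that a semantic diffusion model could then be conditioned on. The function name `random_walk_crack_mask`, the left-edge start, the rightward bias, and the step set are all assumptions made for illustration and are not taken from the paper.

```python
import random


def random_walk_crack_mask(height, width, steps, seed=None):
    """Return a height x width binary mask (list of lists) traced by a biased random walk.

    Hypothetical illustration only; not the authors' algorithm.
    """
    rng = random.Random(seed)
    mask = [[0] * width for _ in range(height)]

    # Assumption: start at a random row on the left edge of the image.
    row, col = rng.randrange(height), 0
    mask[row][col] = 1

    # Assumption: moves are biased to the right so the path looks elongated,
    # with occasional vertical jitter.
    moves = [(0, 1), (0, 1), (-1, 1), (1, 1), (-1, 0), (1, 0)]

    for _ in range(steps):
        d_row, d_col = rng.choice(moves)
        # Clamp to the image bounds so the walk never leaves the mask.
        row = min(max(row + d_row, 0), height - 1)
        col = min(max(col + d_col, 0), width - 1)
        mask[row][col] = 1

    return mask


if __name__ == "__main__":
    mask = random_walk_crack_mask(height=16, width=48, steps=80, seed=0)
    # Print the mask as text: '#' marks crack pixels, '.' marks background.
    for line in mask:
        print("".join("#" if pixel else "." for pixel in line))
```

Fixing `seed` makes the mask reproducible, which is convenient when pairing each generated mask with a synthesized image. A real pipeline would likely also vary crack width and branching, which this sketch omits.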