Abstract: With the increasing demands of urban digitalization and the rise of low-altitude economy applications, building façade models are expected to evolve from basic geometric representations to detailed structures enriched with fine-grained semantic and structural information. This study proposes a fine-grained workflow for 3-D modeling of urban building façades, integrating three key components: window detection and reconstruction, floor segmentation, and texture optimization. Specifically, 1) we develop a bidirectional adaptive feature fusion network that robustly detects window positions and geometries by enhancing multiscale feature fusion and semantic representation, effectively handling façade complexity and small target variations; 2) we reconstruct missing windows based on alignment patterns and design a rule-driven algorithm that leverages the vertical distribution of the completed window set to achieve automated and consistent floor segmentation; and 3) we implement a texture optimization module that enhances visual realism by correcting distortions and harmonizing façade colors through segmentation and inpainting techniques. The proposed approach was validated on a large-scale real-world 3-D modeling project in Dangshan County. It enables the rapid generation of semantically enriched and visually realistic façade models at city scale, supporting applications such as drone-based logistics, emergency planning, and smart city operations. Experimental results show that the workflow achieves up to 97.38% F1-score in window detection, delivering superior performance compared with mainstream detectors. The optimized façades demonstrate improved visual consistency, as over 90% of wall regions exhibit clearer and more uniform color distributions. In the real-world project, the final reconstructed models achieve horizontal accuracy better than 0.3 m and vertical accuracy better than 0.2 m, fully meeting the standards for fine-grained urban 3-D modeling.
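The abstract does not spell out the rule-driven floor segmentation step, so the following is only a minimal sketch of one plausible reading: windows are grouped into rows by their vertical centres, and a floor boundary is placed midway between consecutive rows. The function names (`group_rows`, `floor_boundaries`), the pixel-gap tolerance `tol`, and the sample coordinates are all illustrative assumptions, not the authors' algorithm.

```python
from statistics import mean


def group_rows(centers, tol):
    """Group window vertical centres (pixels) into rows.

    Assumption: a new row starts whenever the gap to the previous
    centre exceeds `tol`. Returns the mean centre of each row.
    """
    rows = []
    for c in sorted(centers):
        if rows and c - rows[-1][-1] <= tol:
            rows[-1].append(c)   # close enough: same row
        else:
            rows.append([c])     # large gap: start a new row
    return [mean(r) for r in rows]


def floor_boundaries(row_centers):
    """Place one boundary midway between each pair of consecutive rows."""
    return [(a + b) / 2 for a, b in zip(row_centers, row_centers[1:])]


# Hypothetical vertical centres of detected and completed windows (pixels).
centers = [140, 150, 160, 450, 460, 740, 750, 760]
rows = group_rows(centers, tol=50)
cuts = floor_boundaries(rows)
print(rows, cuts)
```

In this sketch the three clusters of centres yield three window rows and two boundaries between them. A real pipeline would presumably also need to handle ground floors without windows and irregular storey heights, which the abstract does not detail.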
External IDs: doi:10.1109/jstars.2025.3647095