Abstract: Although various 3D datasets with different functions and scales have been proposed
recently, it remains challenging for individuals to complete the whole pipeline of large-scale data collection, sanitization, and annotation. Moreover, the created datasets usually
suffer from extremely imbalanced class distributions or partially low-quality data
samples. Motivated by this, we explore the procedural synthetic 3D data generation
paradigm to equip individuals with the full capability of creating large-scale annotated
photogrammetry point clouds. Specifically, we introduce a synthetic aerial photogrammetry point cloud generation pipeline that takes full advantage of open geospatial data
sources and off-the-shelf commercial packages. Unlike generating synthetic data in virtual games, where the simulated data are usually limited to gaming environments created
by artists, the proposed pipeline simulates the reconstruction process of the real environment by following the same UAV flight pattern on different synthetic terrain shapes
and building densities, which ensures quality, noise patterns, and diversity similar to those of real
data. In addition, precise semantic and instance annotations can be generated fully
automatically, avoiding expensive and time-consuming manual annotation. Based on the proposed pipeline, we present a richly annotated synthetic 3D aerial photogrammetry
point cloud dataset, termed STPLS3D, with more than 16 km² of landscapes and up to
18 fine-grained semantic categories. For verification purposes, we also provide datasets
collected from four areas in the real environment. Extensive experiments conducted on
our datasets demonstrate the effectiveness and quality of the proposed synthetic dataset.