Abstract: End-to-end models that handle multiple subtasks in parallel have become a new trend, presenting significant challenges and opportunities for integrating multiple tasks in 3-D vision. The limitations of 3-D data acquisition have not only restricted the exploration of many innovative research problems but have also caused existing 3-D datasets to focus predominantly on single tasks. As a result, 3-D multitask learning lacks systematic approaches and theoretical frameworks, with most multitask efforts merely serving as auxiliary support for a primary task. In this article, we introduce WHU-Synthetic, a large-scale 3-D synthetic perception dataset designed for multitask learning, covering tasks from initial data augmentation (upsampling and depth completion), through scene understanding (segmentation), to macrolevel tasks (place recognition and 3-D reconstruction). Because all data are collected in the same environmental domain, the subtasks are inherently aligned, so multitask models can be constructed without separate training methods. In addition, we implement several novel settings that are difficult to achieve in real-world scenarios, such as sampling on city-level models, providing point clouds with different densities, and simulating temporal changes; these settings support more adaptive and robust multitask perception. Using our dataset, we conduct several experiments to investigate mutual benefits between subtasks, revealing new observations, challenges, and opportunities for future research. The dataset is accessible at: https://github.com/WHU-USI3DV/WHU-Synthetic.