Transformer-based automated segmentation of recycling materials for semantic understanding in construction
Abstract: Construction sites are incorporating cameras to gather imagery data for project management. While transformer-based deep models show promise in recognizing construction objects and understanding the environment, their
use in construction images is largely unexplored. This paper presents a systematic evaluation of three state-of-the-art transformer-based models for automatic segmentation and recognition of construction images. Further,
a two-stage model ensembling strategy based on model averaging and probability weighting is introduced and
implemented to improve performance. A dataset containing five classes of recycling materials on construction sites is created as a benchmark for comparing the models. The comparison results indicate that the
ensemble model achieves encouraging results, with an mIoU of 82.36% and an mPA of 90.30%,
demonstrating superior segmentation performance on construction images.
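For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how mIoU (mean intersection over union) and mPA (mean pixel accuracy) are commonly computed from a pixel-level confusion matrix. It is not the authors' evaluation code: the function names, the toy labels, and the choice to skip classes absent from the ground truth are assumptions made for illustration, and the paper may use a different averaging convention.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def miou_and_mpa(cm):
    """Return (mIoU, mPA), averaged over classes present in the ground truth."""
    n = len(cm)
    ious, accs = [], []
    for c in range(n):
        tp = cm[c][c]                            # pixels correctly labelled c
        gt = sum(cm[c])                          # pixels whose true class is c
        pred = sum(cm[r][c] for r in range(n))   # pixels predicted as c
        if gt == 0:
            continue  # assumption: ignore classes absent from the ground truth
        ious.append(tp / (gt + pred - tp))       # intersection / union
        accs.append(tp / gt)                     # per-class pixel accuracy
    if not ious:
        raise ValueError("no ground-truth pixels in any class")
    return sum(ious) / len(ious), sum(accs) / len(accs)


# Toy example: flattened label maps with 3 hypothetical classes.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
miou, mpa = miou_and_mpa(confusion_matrix(truth, pred, 3))
print(f"mIoU = {miou:.4f}, mPA = {mpa:.4f}")
```

In practice the label maps would be the flattened ground-truth and predicted segmentation masks accumulated over the whole test set, with one class index per material category.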