Abstract: In the field of road extraction, a dominant network is D-LinkNet, which won first place in the DeepGlobe 2018 challenge. Although D-LinkNet creatively proposed the D-block, with progressively enlarged dilated convolutions, and showed its efficiency, the <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$\mathrm{N}\times \mathrm{N}$</tex> square kernel it uses still has limitations for road extraction. Roads in aerial imagery are usually long and narrow, and their directions are randomly distributed. Therefore, not only a large receptive field but also anisotropic long-range contextual information should be considered. Based on this intuition, we integrate into D-LinkNet a new pooling strategy named strip pooling, which uses a long but narrow kernel, i.e., <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$1\times \mathrm{N}$</tex> or <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$\mathrm{N}\times 1$</tex>. Using the strip pooling module (SPM) and the mixed pooling module (MPM), both of which are built on strip pooling, we make two modifications to D-LinkNet: 1) we insert SPM into the Res-block of the original ResNet34 encoder and name the result Res-SPM-block; 2) inspired by MPM, we connect strip pooling in parallel with the D-block and name the result SPD-block. We name the upgraded D-LinkNet SPD-LinkNet. Experimental results on the DeepGlobe 2018 dataset show that SPD-LinkNet outperforms the original D-LinkNet in accuracy while maintaining nearly the same inference speed.
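As a rough illustration of the strip pooling idea described above, the following Python sketch averages a single-channel feature map along each row (a 1×N kernel spanning the full width) and along each column (an N×1 kernel spanning the full height), then broadcasts the two strips back to the original size and sums them. It is a simplified sketch, not the authors' implementation: the function name `strip_pool` and the toy input are hypothetical, and the learned convolutions and gating that a full strip pooling module would normally include are omitted.

```python
from typing import List

Matrix = List[List[float]]


def strip_pool(feature: Matrix) -> Matrix:
    """Simplified strip pooling on one H x W feature map (illustrative only)."""
    height, width = len(feature), len(feature[0])
    # 1 x N kernel with N = width: average each row, giving an H x 1 strip.
    row_means = [sum(row) / width for row in feature]
    # N x 1 kernel with N = height: average each column, giving a 1 x W strip.
    col_means = [
        sum(feature[i][j] for i in range(height)) / height
        for j in range(width)
    ]
    # Broadcast both strips back to H x W and sum them, so every position
    # sees context from its entire row and its entire column.
    return [
        [row_means[i] + col_means[j] for j in range(width)]
        for i in range(height)
    ]


# Toy example (assumed data): a vertical "road" one pixel wide.
road = [
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
context = strip_pool(road)
```

In this toy case the column strip responds strongly along the whole length of the road while background columns stay low, which is the kind of anisotropic long-range context that a square kernel of the same area would not capture as directly.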