- Keywords: receptive field, dilated CNN, representation learning
- Abstract: Dilated convolution kernels are constrained by their shared dilation, keeping them from being aware of diverse spatial contents at different locations. We address such limitations by formulating the dilation as trainable weights respect to individual positions. We introduce Pixel-wise Adaptive Dilation (PAD), a light-weighted extension that allows convolution kernels to flexibly adjust receptive fields based on different contents at pixel level. By inferring dilation via modeling inter-layer patterns, PAD-Nets also provide a possible way to partially understand the hierarchical representations of CNNs. Our evaluation results indicate PAD-Nets can consistently outperform their conventional counterparts on various visual tasks.
- Original Pdf: pdf