- TL;DR: We propose a new type of end-to-end trainable attention module, which applies global weight balances among layers by utilizing co-propagating RNN with CNN.
- Abstract: Capturing long-range feature relations has been a central issue on convolutional neural networks(CNNs). To tackle this, attempts to integrate end-to-end trainable attention module on CNNs are widespread. Main goal of these works is to adjust feature maps considering spatial-channel correlation inside a convolution layer. In this paper, we focus on modeling relationships among layers and propose a novel structure, 'Recurrent Layer Attention network,' which stores the hierarchy of features into recurrent neural networks(RNNs) that concurrently propagating with CNN and adaptively scales feature volumes of all layers. We further introduce several structural derivatives for demonstrating the compatibility on recent attention modules and the expandability of proposed network. For semantic understanding on learned features, we also visualize intermediate layers and plot the curve of layer scaling coefficients(i.e., layer attention). Recurrent Layer Attention network achieves significant performance enhancement requiring a slight increase on parameters in an image classification task with CIFAR and ImageNet-1K 2012 dataset and an object detection task with Microsoft COCO 2014 dataset.
- Keywords: attention mechanism, recurrent neural network, image recognition, deep learning