Abstract: Bilinear pooling has achieved impressive improvements over classical average and max pooling in many computer vision tasks. Recent studies have found that matrix normalization is vital for improving the performance of bilinear pooling, since it effectively suppresses burstiness. Nevertheless, existing matrix normalization methods such as matrix square-root and matrix logarithm are based on singular value decomposition (SVD), which is poorly supported on GPUs, limiting their efficiency in training and inference. To boost efficiency on GPUs, recent methods rely on Newton-Schulz (NS) iteration, which approximates the matrix square-root through a series of matrix-matrix multiplications. Although NS iteration is well supported on GPUs, its computational complexity is $\mathcal{O}(KD^3)$, where $D$ is the dimension of the local features and $K$ is the number of iterations, which is still costly. Moreover, NS iteration is applicable only to the full bilinear matrix. In contrast, a compact bilinear feature obtained from tensor sketch or random projection no longer has the matrix structure and therefore cannot be normalized by NS iteration. To overcome these limitations, we propose rank-1 update normalization (RUN), which reduces the computational cost from $\mathcal{O}(KD^3)$ to $\mathcal{O}(KDN)$, where $N$ is the number of local features per image. More importantly, it supports normalization of compact bilinear features. In addition, the proposed RUN is differentiable, so it can be plugged into a convolutional neural network as a layer to support end-to-end training. Comprehensive experiments on four public benchmarks show that, for full bilinear pooling, the proposed RUN achieves comparable accuracies with a $330\times$ speedup over NS iteration. For compact bilinear pooling, RUN achieves comparable accuracies with a $5400\times$ speedup over the SVD-based normalization.
Code: https://www.dropbox.com/s/ewpvsosz5vqucrx/ICLR2020clean.tar.gz?dl=0
Keywords: Computer Vision, Bilinear Pooling, Efficient Network, Fine-grained Classification
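For readers unfamiliar with the Newton-Schulz baseline mentioned in the abstract, the following is a minimal NumPy sketch of the standard coupled NS iteration for the matrix square-root. It is an illustration of the baseline only, not the proposed RUN and not the authors' released code; the function name `newton_schulz_sqrt`, the Frobenius-norm pre-scaling, and the default iteration count are assumptions made for this example.

```python
import numpy as np


def newton_schulz_sqrt(A, num_iters=10):
    """Approximate the square root of a symmetric positive-definite matrix A
    with the coupled Newton-Schulz iteration (matrix multiplications only).

    Illustrative sketch; names and defaults are assumptions, not the paper's code.
    """
    A = np.asarray(A, dtype=np.float64)
    D = A.shape[0]
    I = np.eye(D)

    # Pre-scale by the Frobenius norm so that all eigenvalues lie in (0, 1];
    # the iteration converges only when ||I - A_scaled|| < 1.
    norm = np.linalg.norm(A)
    Y = A / norm   # converges to sqrt(A / norm)
    Z = I.copy()   # converges to the inverse of that square root

    for _ in range(num_iters):
        T = 0.5 * (3.0 * I - Z @ Y)
        Y = Y @ T
        Z = T @ Z

    # Undo the pre-scaling: sqrt(A) = sqrt(norm) * sqrt(A / norm).
    return np.sqrt(norm) * Y
```

Each iteration performs three $D \times D$ matrix products, which is where the $\mathcal{O}(KD^3)$ cost quoted in the abstract comes from. The sketch also operates on an explicit $D \times D$ matrix, which is consistent with the abstract's remark that NS iteration cannot be applied to compact bilinear features.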