TL;DR: A new state of the art on the VQA (real images) dataset, using an attention mechanism based on low-rank bilinear pooling
Abstract: Bilinear models provide richer representations than linear models. They have been applied to various visual tasks, such as object recognition, segmentation, and visual question-answering, achieving state-of-the-art performance by exploiting the expanded representations. However, bilinear representations tend to be high-dimensional, limiting their applicability to computationally complex tasks. We propose low-rank bilinear pooling using the Hadamard product as an efficient attention mechanism for multimodal learning. We show that our model outperforms compact bilinear pooling on visual question-answering, achieving state-of-the-art results on the VQA dataset while being more parsimonious.
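The core idea in the abstract can be sketched numerically. A full bilinear form needs one n × m weight matrix per output unit; low-rank bilinear pooling factorizes each matrix through two projections and a Hadamard (elementwise) product of the projected features. The sketch below is a minimal NumPy illustration with hypothetical small dimensions, not the paper's actual architecture or sizes:

```python
import numpy as np

# Hypothetical toy dimensions for illustration (not the paper's settings):
# n, m = input feature sizes (e.g. visual and question features),
# d = shared rank of the factorization, c = number of outputs.
n, m, d, c = 8, 6, 4, 3

rng = np.random.default_rng(0)
x = rng.standard_normal(n)  # first modality (e.g. image feature)
y = rng.standard_normal(m)  # second modality (e.g. question embedding)

# Low-rank factors: each output's bilinear weight matrix W_i is
# implicitly U diag(P[:, i]) V^T, so parameters scale as d*(n+m+c)
# instead of c*n*m for full bilinear pooling.
U = rng.standard_normal((n, d))
V = rng.standard_normal((m, d))
P = rng.standard_normal((d, c))

# The Hadamard product of the two projections replaces the explicit
# bilinear form: f = P^T ((U^T x) ∘ (V^T y)).
f = P.T @ ((U.T @ x) * (V.T @ y))

# Sanity check against the explicit bilinear form for output 0:
f0_explicit = x @ (U * P[:, 0]) @ V.T @ y
```

The saving is in the parameter count: here the factorized form uses d(n + m + c) = 68 parameters versus c·n·m = 144 for full bilinear pooling, and the gap widens rapidly at realistic feature dimensions.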
Keywords: Deep Learning, Supervised Learning, Multi-modal Learning
Conflicts: snu.ac.kr, navercorp.com