ACE-BERT: Adversarial Cross-Modal Enhanced BERT for E-Commerce Retrieval

Published: 01 Jan 2023 · Last Modified: 14 May 2025 · APWeb/WAIM (4) 2023 · License: CC BY-SA 4.0
Abstract: Nowadays on E-commerce platforms, products usually contain multi-modal descriptions. To retrieve relevant products for user-generated textual queries, most previous works learn multi-modal retrieval models from historical query-product interactions. However, tail products with few interactions tend to be neglected in the learning process. Recently, product pre-training methods have been proposed to address this problem. Motivated by these works, we aim to solve two challenges in this area: (1) Existing works share the same network for the user-generated textual query and the product title, which ignores the semantic gap between them; (2) Irrelevant backgrounds in product images disturb cross-modal alignment. In this work, we propose the Adversarial Cross-modal Enhanced BERT (ACE-BERT) for E-commerce multi-modal retrieval. In the pre-training stage, ACE-BERT learns multi-modal Transformers from the product title, the product image, and an additional Hot Query. Specifically, ACE-BERT constructs a Hot Query for each product to capture the correlations between products and queries during pre-training. In addition, ACE-BERT detects the salient objects in the product image, removes irrelevant backgrounds, and takes the detected regions as the image representation. In the fine-tuning stage, ACE-BERT performs semantic matching and adversarial learning tasks to better align the representations of queries and products. Experimental results demonstrate that ACE-BERT outperforms state-of-the-art approaches on both a public dataset and a real-world application. Notably, ACE-BERT has been deployed in the search engine, leading to a \(1.46\%\) increase in revenue.
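To make the adversarial alignment idea concrete, below is a minimal, hypothetical PyTorch sketch of one standard way to implement it: a modality discriminator trained through a gradient-reversal layer so that query and product embeddings become indistinguishable. This is an illustration of the general technique, not the paper's actual implementation; all class and function names (GradientReversal, ModalityDiscriminator, adversarial_alignment_loss) and the embedding dimension are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign on backward,
    so the encoder is pushed to *confuse* the discriminator."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class ModalityDiscriminator(nn.Module):
    """Predicts whether an embedding came from a query or a product.
    Hypothetical 768-dim embeddings, matching a BERT-base hidden size."""
    def __init__(self, dim=768):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 2))

    def forward(self, h, lambd=1.0):
        return self.net(GradientReversal.apply(h, lambd))

def adversarial_alignment_loss(query_emb, product_emb, disc, lambd=1.0):
    # Label 0 = query, 1 = product. The discriminator minimizes this loss;
    # the reversed gradient drives both encoders toward a shared,
    # modality-invariant embedding space.
    h = torch.cat([query_emb, product_emb], dim=0)
    labels = torch.cat([
        torch.zeros(query_emb.size(0), dtype=torch.long),
        torch.ones(product_emb.size(0), dtype=torch.long),
    ])
    return F.cross_entropy(disc(h, lambd), labels)

# Toy usage: random tensors stand in for encoder outputs.
disc = ModalityDiscriminator(dim=768)
q, p = torch.randn(8, 768), torch.randn(8, 768)
loss = adversarial_alignment_loss(q, p, disc)
loss.backward()
```

In a full pipeline, this loss would be combined with the semantic-matching objective during fine-tuning, with the query and product embeddings produced by the pre-trained multi-modal Transformers.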