An efficient framework for learning sentence representations

15 Feb 2018 (modified: 07 Apr 2024) · ICLR 2018 Conference Blind Submission · Readers: Everyone
Abstract: In this work we propose a simple and efficient framework for learning sentence representations from unlabelled data. Drawing inspiration from the distributional hypothesis and recent work on learning sentence representations, we reformulate the problem of predicting the context in which a sentence appears as a classification problem. Given a sentence and the context in which it appears, a classifier distinguishes context sentences from other contrastive sentences based on their vector representations. This allows us to efficiently learn different types of encoding functions, and we show that the model learns high-quality sentence representations. We demonstrate that our sentence representations outperform state-of-the-art unsupervised and supervised representation learning methods on several downstream NLP tasks that involve understanding sentence semantics while achieving an order of magnitude speedup in training time.
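The abstract's core idea is to turn context prediction into a discriminative task: given a sentence's vector, a classifier must pick out its true context sentence from a set of contrastive candidates. Below is a minimal sketch of that objective in PyTorch, using the other sentences in a minibatch as the contrastive candidates; the GRU encoder and all hyperparameters are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch of the contrastive classification objective described in the abstract:
# identify the true context sentence among contrastive candidates using only
# inner products of sentence vectors. Encoder architecture and dimensions are
# assumptions for illustration, not the paper's exact setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, hidden_dim=1000):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) -> sentence vectors: (batch, hidden_dim)
        _, h_n = self.rnn(self.embed(token_ids))
        return h_n.squeeze(0)

def contrastive_context_loss(f_enc, g_enc, sentences, contexts):
    """Classify each sentence's true context among the other contexts in the batch.

    sentences[i] and contexts[i] are adjacent in the corpus; every other
    context vector in the minibatch serves as a contrastive candidate.
    """
    u = f_enc(sentences)                 # (batch, d) input-sentence vectors
    v = g_enc(contexts)                  # (batch, d) candidate context vectors
    scores = u @ v.t()                   # (batch, batch) inner-product logits
    targets = torch.arange(scores.size(0), device=scores.device)  # diagonal = true pairs
    return F.cross_entropy(scores, targets)
```

Because the encoders only need a forward pass per sentence and the scoring is a single matrix product over the batch, this formulation avoids decoding the context word by word, which is the source of the training-time speedup the abstract claims.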
TL;DR: A framework for learning high-quality sentence representations efficiently.
Keywords: sentence, embeddings, unsupervised, representations, learning, efficient
Code: [6 community implementations on Papers with Code](https://paperswithcode.com/paper/?openreview=rJvJXZb0W)
Data: [COCO](https://paperswithcode.com/dataset/coco), [MPQA Opinion Corpus](https://paperswithcode.com/dataset/mpqa-opinion-corpus), [SICK](https://paperswithcode.com/dataset/sick), [SST](https://paperswithcode.com/dataset/sst)
Community Implementations: [6 code implementations on CatalyzeX](https://www.catalyzex.com/paper/arxiv:1803.02893/code)