TL;DR: We investigate inductive biases for learning representations with explicit structure, significantly improving performance while making our model even simpler.
Abstract: We present Banyan, a model that efficiently learns semantic representations by leveraging explicit hierarchical structure. While transformers excel at scale, they struggle in low-resource settings. Conversely, recent structured models have shown promise as efficient learners, but fall short in performance. Banyan bridges this gap with two key innovations: an entangled hierarchical tree structure and diagonalized message passing, enabling it to outperform larger transformer models with just 14 non-embedding parameters. It excels in low-resource settings, offering a viable alternative for under-represented languages and highlighting its potential for efficient, interpretable NLP in resource-constrained environments.
Lay Summary: We present a model called Banyan, which utilises a new architecture to efficiently learn embeddings - useful for comparing how similar two pieces of text are to each other, for example in search or retrieval. For well-resourced languages like English, existing models already do well, but in low-resource settings they fail because they rely on scale to succeed. It is precisely here that Banyan shines, because it can be optimised without much compute or data and still be highly effective.
Primary Area: General Machine Learning->Representation Learning
Keywords: Representation Learning, Structure, Semantics, Syntax, Induction, Composition
Submission Number: 9625