LOLAMEME: LOGIC, LANGUAGE, MEMORY, MECHANISTIC FRAMEWORK

20 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: LOGIC, LANGUAGE, MEMORY, MECHANISTIC, FRAMEWORK, LLM, GENERATIVE, AI
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We designed a framework to test language models on various aspects of language in a controllable way. Based on the results, we designed a new architecture that combines the benefits of attention with those of the Hyena operator.
Abstract: The performance of Large Language Models has achieved superhuman breadth with unprecedented depth. At the same time, language models are mostly black-box models, and the mechanisms underlying their performance have been evaluated using synthetic or mechanistic schemes. We extend current mechanistic schemes to incorporate Logic, Memory, and nuances of Language such as latent structure. The proposed framework is called LOLAMEME, and we provide two instantiations of LOLAMEME: the LoLa and MeMe languages. We then consider two generative language model architectures: transformer-based GPT-2 and convolution-based Hyena. We propose the hybrid architecture THEX and use the LOLAMEME framework to compare the three architectures. THEX outperforms GPT-2 and Hyena on select tasks.
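The submission does not specify THEX's internals. As a rough illustration only, below is a minimal PyTorch sketch, with all class names hypothetical, of one plausible way to interleave GPT-2-style causal self-attention with a Hyena-flavored operator; the Hyena path is simplified here to an explicit depthwise FFT long convolution with multiplicative gating, rather than Hyena's implicitly parameterized filters.

import torch
import torch.nn as nn

class GatedLongConv(nn.Module):
    """Simplified Hyena-flavored operator (illustrative, not the paper's design):
    a depthwise long convolution applied via FFT, followed by a multiplicative gate."""
    def __init__(self, d_model: int, max_len: int):
        super().__init__()
        # Explicit long filter, one per channel; Hyena instead generates
        # such filters implicitly from positional features.
        self.filter = nn.Parameter(torch.randn(d_model, max_len) * 0.02)
        self.in_proj = nn.Linear(d_model, d_model)
        self.gate_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        b, t, d = x.shape
        v = self.in_proj(x).transpose(1, 2)                 # (b, d, t)
        k = self.filter[:, :t]                              # (d, t)
        # Zero-padded FFT convolution; truncating to the first t outputs
        # keeps the operator causal.
        n = 2 * t
        y = torch.fft.irfft(torch.fft.rfft(v, n=n) * torch.fft.rfft(k, n=n), n=n)
        y = y[..., :t].transpose(1, 2)                      # (b, t, d)
        return y * torch.sigmoid(self.gate_proj(x))         # gated output

class HybridBlock(nn.Module):
    """One block interleaving causal self-attention with the gated long conv,
    each behind a pre-norm residual connection, GPT-2 style."""
    def __init__(self, d_model: int, n_heads: int, max_len: int):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.conv = GatedLongConv(d_model, max_len)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        t = x.size(1)
        # Boolean causal mask: True marks positions attention may not see.
        mask = torch.triu(torch.ones(t, t, dtype=torch.bool, device=x.device), 1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out
        x = x + self.conv(self.ln2(x))
        return x

The FFT path keeps the long-convolution cost at O(t log t) in sequence length, which is the usual motivation for mixing Hyena-style operators with attention; how THEX actually arranges or shares the two operator types is defined in the paper itself.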
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2772