Towards Large Language Models at the Edge on Mobile, Augmented Reality, and Virtual Reality Devices with Unity

Published: 03 Nov 2023, Last Modified: 03 Nov 2023, SAGE 2023
Keywords: Edge computing, Large Language Models, Unity Engine, AI Assistant, Deep Learning, Virtual Reality
TL;DR: We integrated and benchmarked a large language model running in a Unity framework for edge deployment of LLMs.
Abstract: This year, there has been a surge in applications powered by Large Language Models (LLMs). Modern LLMs often demand extensive memory and computation for inference. However, ongoing research in quantization and model distillation continues to reduce these processing requirements. Combined with advances in hardware, this is making it feasible to run LLMs on mobile devices and on Augmented Reality (AR) and Virtual Reality (VR) platforms. This paper explores deploying state-of-the-art LLAMA2 models across various devices, including laptops, mobile phones, and AR and VR systems. We introduce a Unity-based project that enables the compilation of the LLAMA2 model for diverse platforms, marking a novel contribution to the field. Furthermore, we benchmark performance, share best practices for optimizing token generation rates, and discuss potential future directions.
Submission Number: 3