Keywords: language models, edge computing, low-resource computing, neuro-symbolic artificial intelligence
TL;DR: Untrained neural language models scaled down to microprocessors learn from data collected by a finite-state machine, allowing a voice-controlled smart lamp to be trained solely from verbal commands and button pushes.
Abstract: Language models (LMs) have achieved significant success in centralized settings, but their utility in localized, real-time applications on edge devices remains constrained. These environments—where direct interaction between users and devices occurs—lack the vast training resources available to general-purpose cloud-based models. The typical development pipeline for LMs involves (1) large-scale unsupervised pretraining to develop generalist behaviors before (2) supervised fine-tuning on small, task-specific datasets. The second step remains a bottleneck for edge deployment, as it requires labeled data, which is rarely available or easily collected in situ. We address this challenge by introducing a neuro-symbolic framework for data collection and learning on edge devices. At the core of our approach is a finite-state machine (FSM), called a Data Collection Automaton (DCA), that supervises an LM through interaction with the environment. This FSM enables automatic labeling of user inputs by tracking conversational and physical interactions, transforming them into usable training data. Our implementation focuses on a voice-controlled smart lamp that learns from its user without external data—only through spoken commands and switch toggles.
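The abstract's core idea—an FSM that pairs each spoken command with the user's next physical action to produce a labeled example—can be illustrated with a minimal sketch. The class, state names, and method signatures below are hypothetical (the paper does not specify them); this only shows how tracking conversational and physical interactions can yield supervised training pairs.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical sketch of a Data Collection Automaton (DCA): a finite-state
# machine that waits for a spoken command, then labels it with the user's
# subsequent switch toggle, producing an (utterance, label) training pair.
@dataclass
class DataCollectionAutomaton:
    state: str = "IDLE"                       # IDLE -> AWAIT_ACTION -> IDLE
    pending_utterance: Optional[str] = None
    dataset: List[Tuple[str, str]] = field(default_factory=list)

    def on_utterance(self, text: str) -> None:
        """A spoken command moves the FSM into the AWAIT_ACTION state."""
        self.pending_utterance = text
        self.state = "AWAIT_ACTION"

    def on_toggle(self, lamp_on: bool) -> None:
        """A switch toggle labels the pending utterance and resets the FSM."""
        if self.state == "AWAIT_ACTION" and self.pending_utterance is not None:
            self.dataset.append((self.pending_utterance, "ON" if lamp_on else "OFF"))
        self.pending_utterance = None
        self.state = "IDLE"

dca = DataCollectionAutomaton()
dca.on_utterance("make it bright in here")
dca.on_toggle(lamp_on=True)
print(dca.dataset)  # one labeled pair, ready for supervised fine-tuning
```

In this reading, the symbolic component (the FSM) supplies the supervision signal, and the accumulated `dataset` is what the scaled-down LM would be fine-tuned on in situ.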
Track: Main Track
Paper Type: Industry Abstract
Resubmission: No
Changes List: Additional details for each concept introduced in the overview of the methods are provided to make the submission self-contained.
Publication Agreement: pdf
Submission Number: 92