Abstract: GRAFT is a distributed edge orchestration system that routes structured tasks across heterogeneous personal devices, enabling off-grid workload completion without cloud services while keeping each worker within an explicit input budget. The system provides a sessioned gRPC control plane with device registration, capability-aware policy routing, inference dispatch to a chosen compute target (CPU, GPU, or NPU), safety-bounded remote execution, and per-request telemetry. We demonstrate GRAFT through a multi-device PDF summarization workflow in which a coordinator partitions a document across four devices, each running a local SLM, and merges the partial results through a staged execution plan with explicit synchronization barriers. Our evaluation on a four-device mesh spanning Apple (laptop), Snapdragon X Elite (laptop), Samsung (mobile), and Arduino Q (embedded) hardware shows that all planned tasks execute on their intended devices, that stage ordering is preserved, and that end-to-end completion time is governed by the slowest parallel worker rather than by total document size. Single-device baselines show that a capable laptop-class model can outperform the mesh on raw wall-clock time, but only the mesh achieves higher aggregate coverage while keeping every worker inside its native budget; a phone-class 1B model either covers substantially less input or degrades badly when pushed beyond its budget. The live demonstration exposes routing decisions, per-task progress, device-level timing, and failure semantics in real time. GRAFT thus contributes both a novel systems architecture for multi-device agent orchestration and a compelling interactive experience for conference attendees.
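To make the staged plan concrete, the following is a minimal sketch of the partition, parallel-summarize, barrier, and merge pattern described above. It is not GRAFT's implementation: the device names, the character budgets in `BUDGETS`, and the functions `partition`, `summarize`, and `run_plan` are all hypothetical, and the SLM call is replaced by a trivial stand-in.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-device input budgets, in characters (illustrative only).
BUDGETS = {"laptop-a": 8, "laptop-b": 8, "phone": 4, "embedded": 2}


def partition(text, budgets):
    """Give each device a consecutive slice no larger than its budget."""
    plan, pos = {}, 0
    for device, budget in budgets.items():
        plan[device] = text[pos:pos + budget]
        pos += budget
    # The second value is whatever input no device had budget to cover.
    return plan, text[pos:]


def summarize(device, chunk):
    """Stand-in for a local SLM call; here it only upper-cases the chunk."""
    return chunk.upper()


def run_plan(text, budgets):
    plan, uncovered = partition(text, budgets)
    # Stage 1: one task per device, run in parallel.
    with ThreadPoolExecutor(max_workers=len(plan)) as pool:
        futures = {d: pool.submit(summarize, d, c) for d, c in plan.items()}
    # Leaving the `with` block waits for every worker, so it acts as the
    # synchronization barrier: stage 2 starts only after stage 1 finishes.
    partials = [futures[d].result() for d in plan]
    # Stage 2: merge partial results in plan order.
    return "".join(partials), uncovered


merged, uncovered = run_plan("abcdefghijklmnopqrstuvwxyz", BUDGETS)
```

Because stage 2 waits on the barrier, the wall-clock time of stage 1 in this sketch is set by the slowest worker, while the `uncovered` remainder reflects how aggregate coverage is bounded by the sum of the per-device budgets.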