Skilled AI Agents for Embedded and IoT Systems Development

Published: 15 May 2026, Last Modified: 15 May 2026 | AgentSkills 2026 Poster | CC BY 4.0
Keywords: Large language models, agentic systems, embedded systems, Internet-of-Things, benchmark, hardware-in-the-loop evaluation
Abstract: Large language models (LLMs) and agentic systems have shown promise for automated software development, but applying them to hardware-in-the-loop (HIL) embedded and Internet-of-Things (IoT) systems remains challenging due to the tight coupling between software logic and physical hardware behavior. Code that compiles successfully may still fail on real devices due to timing constraints, peripheral initialization requirements, or hardware-specific behaviors. We introduce a skills-based agentic framework for HIL embedded development together with IoT-SkillsBench, a benchmark that systematically evaluates AI agents in real embedded programming environments across three platforms, 23 peripherals, and 42 tasks at three difficulty levels. Across 378 hardware-validated experiments under three agent configurations (no-skills, LLM-generated skills, and human-expert skills), we show that concise human-expert skills enable near-perfect success rates across platforms.
Presentation Mode: Yes, at least one author will attend and present in person.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 38