Abstract: In edge computing, optimizing data mobility is crucial for minimizing latency and maximizing bandwidth efficiency, because data is processed closer to its source. This reduces the need to transfer large amounts of data to centralized servers, enabling faster insights and real-time decision-making. However, data mobility in embedded machine learning systems is hampered by inefficient data transfer and by software and algorithmic issues that increase latency and CPU utilization, hindering Internet of Things (IoT) performance and reliability. This paper introduces ADMEdge (Accelerating Data Mobility at the Edge), a novel software/hardware co-design framework that optimizes data movement in FPGA-based System-on-Chip (SoC) edge devices through Direct Memory Access (DMA), buffering, and shared-memory techniques. Testing is conducted on a Kria KV260 IoT SoC, which incorporates a PicoRV32 RISC-V core and a Deep Neural Network accelerator, using representative datasets to simulate real-world scenarios. Incorporating shared memory significantly reduces latency by removing the recompilation step, yielding a 9.5-fold speedup in real-time image processing at higher workloads. Buffering techniques add a 49% performance gain, and data throughput scales efficiently with larger data sizes, with DMA delivering up to 226.98 times faster communication in specific scenarios. Overall, these optimizations yield a 44% improvement in system performance, highlighting the potential of ADMEdge to enhance edge computing applications in the Internet of Things.
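As a concrete illustration of the DMA-based data movement the abstract refers to, the sketch below shows a typical PYNQ-style transfer to an FPGA accelerator, as is common on Kria KV260 boards. This is a minimal sketch, not the paper's implementation: the bitstream name "admedge.bit" and the DMA instance name "axi_dma_0" are hypothetical placeholders for whatever the actual overlay exposes.

```python
# Minimal sketch (not the authors' code): streaming a data block to an
# FPGA accelerator and back via AXI DMA using the PYNQ API. The overlay
# file "admedge.bit" and the instance name "axi_dma_0" are hypothetical.
import numpy as np
from pynq import Overlay, allocate

ol = Overlay("admedge.bit")   # hypothetical bitstream for the design
dma = ol.axi_dma_0            # hypothetical AXI DMA instance name

# Physically contiguous buffers that the DMA engine can access directly,
# so the payload is moved without an intermediate CPU copy.
in_buf = allocate(shape=(4096,), dtype=np.uint32)
out_buf = allocate(shape=(4096,), dtype=np.uint32)
in_buf[:] = np.arange(4096, dtype=np.uint32)

# Kick off both directions; the CPU is free while the DMA moves the data.
dma.sendchannel.transfer(in_buf)
dma.recvchannel.transfer(out_buf)
dma.sendchannel.wait()
dma.recvchannel.wait()

in_buf.freebuffer()
out_buf.freebuffer()
```

Compared with a CPU-driven copy loop, this pattern offloads the byte movement to the DMA engine, which is the mechanism behind the communication speedups the abstract reports.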