GOURD: Tensorizing Streaming Applications to Generate Multi-Instance Compute Platforms

Published: 2024, Last Modified: 15 May 2025IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 2024EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: In this article, we rethink the dataflow processing paradigm to a higher level of abstraction to automate the generation of multi-instance compute and memory platforms with interfaces to I/O devices (sensors and actuators). Since the different compute instances (NPUs, CPUs, DSPs, etc.) and I/O devices do not necessarily have compatible interfaces on a dataflow level, an automated translation is required. However, in multidimensional dataflow scenarios, it becomes inherently difficult to reason about buffer sizes and iteration order without knowing the shape of the data access pattern (DAP) that the dataflow follows. To capture this shape and the platform composition, we define a domain-specific representation (DSR) and devise a toolchain to generate a synthesizable platform, including appropriate streaming buffers for platform-specific tensorization of the data between incompatible interfaces. This allows platforms, such as sensor edge AI devices, to be easily specified by simply focusing on the shape of the data provided by the sensors and transmitted among compute units, giving the ability to evaluate and generate different dataflow design alternatives with significantly reduced design time.
Loading