Architecture and programming model support for efficient heterogeneous computing on tightly-coupled shared-memory clusters

2013 (modified: 09 Nov 2022), DASIP 2013
Abstract: Modern embedded systems for computer vision and image processing exploit hardware acceleration inside scalable parallel architectures, such as tightly-coupled clusters, to meet stringent performance and energy-efficiency targets. Architectural heterogeneity makes software development cumbersome, so shared-memory communication between processors and accelerators is usually preferred, because it simplifies offloading critical computational kernels to HW IPs. However, tightly coupling a large number of accelerators and processors in a shared-memory cluster is challenging, since the complexity of the resulting system quickly becomes too large. We tackle these issues by proposing a template for a heterogeneous shared-memory cluster that scales to a large number of accelerators and achieves up to 40% better performance/area/watt than simply designing larger main interconnects to accommodate several HW IPs. In addition, following the trend towards standardization of acceleration capabilities in future embedded systems, we develop a programming model that simplifies application development for heterogeneous clusters.