Abstract: Processing-in-memory (PIM) has been explored for decades by computer architects, yet it has never seen the light of day in real-world products due to its high design overheads and the lack of a killer application. With the advent of critical memory-intensive workloads, several commercial PIM technologies have been introduced to the market, ranging from domain-specific PIM architectures to more general-purpose ones. In this work, we take a deep dive into UPMEM's commercial PIM technology, a general-purpose, PIM-enabled parallel computing architecture that is highly programmable. Our first key contribution is the development of a flexible simulation framework for PIM. The simulator we developed, uPIMulator, compiles UPMEM-PIM source code into machine-level instructions, which are subsequently consumed by our cycle-level performance simulator. Using uPIMulator, we demystify UPMEM's PIM design through a detailed characterization study. Finally, through our case studies, we identify some key limitations of the current UPMEM-PIM system and present some important architectural features that will become critical for future PIM architectures to support.
External IDs: dblp:conf/hpca/HyunKLR24