MINOTAUR: An Edge Transformer Inference and Training Accelerator with 12 MBytes On-Chip Resistive RAM and Fine-Grained Spatiotemporal Power Gating
Abstract: MINOTAUR is the first energy-efficient edge SoC for inference and training of Transformers (and other networks, e.g., CNNs) with all memory on-chip. MINOTAUR leverages a configurable 8-bit posit-based accelerator, fine-grained spatiotemporal power gating enabled by on-chip resistive RAM (RRAM) for dynamically adjustable bandwidth, and on-chip fine-tuning through full-network low-rank adaptation (LoRA). MINOTAUR achieves average utilization of 93% and 74% and inference energy of 8.1 mJ and 8.2 mJ on ResNet-18 and MobileBERT Tiny, respectively, and demonstrates on-chip fine-tuning accuracy within 1.7% of offline training, without RRAM-induced energy limitations or endurance degradation.
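As context for the fine-tuning claim, LoRA freezes each pretrained weight matrix W and learns only a low-rank update ΔW = (α/r)·B·A, which shrinks both the number of trainable parameters and the volume of weight writes to non-volatile memory. Below is a minimal NumPy sketch of a LoRA forward pass; the dimensions, rank r, and scale α are illustrative assumptions, not MINOTAUR's actual configuration (which also uses 8-bit posit arithmetic rather than float64).

```python
import numpy as np

# Minimal LoRA forward-pass sketch (illustrative only; shapes, rank, and
# scale are assumptions, not MINOTAUR's configuration).
d_out, d_in, r, alpha = 64, 64, 4, 8.0  # rank r << d; alpha scales the update

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))     # frozen pretrained weights
A = rng.standard_normal((r, d_in)) * 0.01  # trainable low-rank factor
B = np.zeros((d_out, r))                   # trainable; zero-init so deltaW = 0 at start

def lora_forward(x: np.ndarray) -> np.ndarray:
    """y = W x + (alpha / r) * B (A x); only A and B are updated in fine-tuning."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
y = lora_forward(x)
print(y.shape)  # (64,)

# Trainable parameters: r * (d_in + d_out) = 512, versus
# d_in * d_out = 4096 for full fine-tuning of this layer.
```

Because only A and B change during training, the large frozen weights can stay in RRAM and be read-only, which is consistent with the abstract's claim of avoiding RRAM write-energy and endurance penalties.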