MoteS: Memory Optimization via Fine-grained Scheduling for DNNs on Tiny Devices

23 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: TinyML, Memory Optimization, Fine-grained Scheduling
TL;DR: We propose a memory optimizer for DNN deployment on tiny devices via fine-grained graph scheduling
Abstract: There has been a growing trend toward deploying deep neural networks (DNNs) on tiny devices. However, doing so is challenging because the large execution memory requirements of many DNNs conflict with the stringent memory constraints of tiny devices. Some previous works incur large latency overhead to save memory and cannot optimize networks with complex structures; others employ coarse-grained scheduling, leading to limited memory footprint reduction. This paper proposes MoteS, which performs fine-grained scheduling via operator partitioning on DNNs to dramatically reduce peak memory usage with little latency overhead. MoteS introduces a graph representation named Axis Connecting Graph (ACG) to perform operator partitioning efficiently at the graph level. MoteS further proposes an algorithm that searches for partitions and schedules guided by memory bottlenecks. We evaluate MoteS on various popular networks and show that it reduces peak memory usage by up to 80\% compared to state-of-the-art works with nearly no latency overhead on tiny devices.
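To make the core idea concrete, below is a minimal NumPy sketch of why fine-grained operator partitioning reduces peak memory. This is not the paper's ACG representation or its search algorithm; the two toy operators (a channel expansion followed by a reduction) and all names (`run_monolithic`, `run_partitioned`, tile counts, tensor shapes) are illustrative assumptions. The sketch only shows that partitioning both ops along one tensor axis keeps just a slice of the large intermediate live at a time, lowering peak memory while producing an identical result.

```python
import numpy as np

H, W, C, E = 64, 64, 16, 4   # E: channel expansion factor of the middle op

def peak_bytes(*tensors):
    # Total bytes of all simultaneously live tensors.
    return sum(t.nbytes for t in tensors)

def run_monolithic(x):
    # Coarse-grained schedule: the full expanded intermediate is materialized.
    inter = np.repeat(x, E, axis=-1) * 0.5           # expand: (H, W, C*E)
    peak = peak_bytes(x, inter)
    out = inter.reshape(H, W, C, E).sum(axis=-1)     # reduce back to (H, W, C)
    return out, max(peak, peak_bytes(inter, out))

def run_partitioned(x, n_tiles=8):
    # Fine-grained schedule: partition both ops along the height axis,
    # so only one slice of the intermediate is ever live.
    out = np.empty((H, W, C), dtype=x.dtype)
    peak = 0
    for rows in np.array_split(np.arange(H), n_tiles):
        inter = np.repeat(x[rows], E, axis=-1) * 0.5
        out[rows] = inter.reshape(len(rows), W, C, E).sum(axis=-1)
        peak = max(peak, peak_bytes(x, out, inter))
    return out, peak

x = np.random.rand(H, W, C).astype(np.float32)
y0, p0 = run_monolithic(x)
y1, p1 = run_partitioned(x)
assert np.allclose(y0, y1)                           # same result, lower peak
print(f"monolithic peak:  {p0 / 1024:.0f} KiB")      # ~1280 KiB
print(f"partitioned peak: {p1 / 1024:.0f} KiB")      # ~640 KiB
```

With these shapes, the monolithic schedule must hold the input plus the 4x-expanded intermediate at once, while the partitioned schedule holds only a one-eighth slice of it, roughly halving peak memory at the cost of a loop over tiles. The paper's contribution, as the abstract describes, is doing this kind of partitioning automatically and at the graph level, guided by where the memory bottlenecks actually are.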
Primary Area: infrastructure, software libraries, hardware, etc.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7457