LUMEN-PRO: Automating Multi-Task Learning on Optical Neural Networks with Weight Sharing and Physical Rotation

Shanglin Zhou; Yingjie Li; Zhijie Shi; Cunxi Yu; Caiwen Ding

20 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.

Supplementary Material: zip

Primary Area: general machine learning (i.e., none of the above)

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Keywords: Diffractive Optical Neural Network, Automating Multi-task Learning, Weight Sharing

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.

TL;DR: LUMEN-PRO is an automated multi-task learning (MTL) framework for diffractive optical neural networks (DONNs) that optimizes MTL DONNs using physical principles.

Abstract: With the demise of Moore's law, the demand for efficient deep neural network accelerators has surged. In addition, the democratization of AI encourages multi-task learning (MTL), which demands more parameters and processing time. To achieve highly energy-efficient MTL, Diffractive Optical Neural Networks (DONNs) have garnered attention due to their extremely low energy consumption and high computation speed. However, implementing MTL on DONNs requires manually reconfiguring and replacing specific layers, which means rebuilding and duplicating the physical systems. To overcome these challenges, we propose LUMEN-PRO, an automated MTL framework. First, we automate MTL given an arbitrary backbone DONN and a set of tasks, producing a high-accuracy multi-task DONN model with a small memory footprint that surpasses existing MTL methods. Second, we leverage the rotatability of the physical system and replace task-specific layers with rotations of the corresponding shared layers. This replacement eliminates the storage requirement of task-specific layers, further reducing the memory footprint. LUMEN-PRO provides flexibility in identifying optimal sharing patterns across diverse datasets, facilitating the search for highly energy-efficient DONNs. Experimental results show that LUMEN-PRO provides up to 49.58% higher accuracy and $4\times$ better cost efficiency than single-task and existing cutting-edge DONN approaches on different datasets. It achieves the memory lower bound of multi-task learning, i.e., the same memory storage as the single-task model. Compared to technologies such as IBM TrueNorth and Nanophotonic, LUMEN-PRO achieves $10^5\times$ and $10\times$ speedup in throughput, and $5{,}969\times$ and $680\times$ energy efficiency gain, respectively.
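
The rotation idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the helper `task_layers`, the sharing pattern, the 4x4 mask size, and the restriction to multiples of 90 degrees are all assumptions made for illustration, and no optical propagation is modeled. It only shows why a rotated copy of a shared layer needs no extra stored parameters.

```python
import numpy as np

def task_layers(shared_masks, rotations):
    """Derive one task's layer stack from the shared phase masks.

    shared_masks: list of (N, N) phase arrays, stored once for all tasks.
    rotations: list of ints k; layer i is used rotated by k * 90 degrees.
    (Hypothetical helper; 90-degree steps are an assumption.)
    """
    return [np.rot90(m, k) for m, k in zip(shared_masks, rotations)]

rng = np.random.default_rng(0)
# Three shared phase masks with values in [0, 2*pi); sizes are illustrative.
shared = [rng.uniform(0.0, 2.0 * np.pi, size=(4, 4)) for _ in range(3)]

# Hypothetical sharing pattern: task_a uses every layer as stored,
# task_b reuses the first two layers and a rotated copy of the last one.
pattern = {"task_a": [0, 0, 0], "task_b": [0, 0, 1]}
stacks = {task: task_layers(shared, ks) for task, ks in pattern.items()}

# Only the shared masks are stored, so the parameter count matches a
# single-task model with the same backbone.
stored_params = sum(m.size for m in shared)
print(stored_params)
```

In this sketch the rotated layer is a view of the shared array rather than a copy, which mirrors the abstract's claim that the multi-task model can have the same memory storage as the single-task model.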

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 2778
