DynaLay: An Introspective Approach to Dynamic Layer Selection for Deep Networks

Mrinal Mathur; Sergey M. Plis

Published: 26 Oct 2023, Last Modified: 13 Dec 2023. Venue: NeurIPS 2023 Workshop Poster

Keywords: Foundational Model, Dynamic Neural Networks, Fixed-Point Iteration, Computational Efficiency, Multi-Task Learning, Model-Agnostic Framework, Implicit Differentiation, Backpropagation, Agent-Based Modeling

TL;DR: DynaLay introduces an introspective, adaptable neural network architecture that uses a reinforcement learning agent to dynamically select layers, optimizing for both computational efficiency and model accuracy.

Abstract: Deep learning models have become increasingly computationally intensive, requiring specialized hardware and significant runtime for both training and inference. In this work, we introduce DynaLay, a versatile and dynamic neural network architecture that employs a reinforcement learning agent to adaptively select which layers to execute for a given input. Our approach introduces an element of introspection into neural network architectures: the model can recompute its results on more difficult inputs during inference, balancing the amount of computation expended against both performance and efficiency. The system comprises a main model constructed with Fixed-Point Iterative (FPI) layers, which can approximate complex functions with high fidelity, and an agent that chooses among these layers or a no-operation (NOP) action. Unique to our approach is a multi-faceted reward function that combines classification accuracy, computational time, and a penalty for redundant layer selection, thereby ensuring a balanced trade-off between performance and cost. Experimental results demonstrate that DynaLay achieves accuracy comparable to conventional deep models while significantly reducing computational overhead. Our approach represents a significant step toward creating more efficient, adaptable, and universally applicable deep learning systems.
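
To make the two ingredients named in the abstract concrete, here is a minimal sketch in plain Python of a scalar fixed-point iteration and a reward that combines correctness, a time cost, and a penalty for re-selecting the same layer. This is not the authors' implementation: the abstract does not give the form of the reward or of the FPI layers, so the function names, the `NOP` encoding, the weights `time_weight` and `repeat_weight`, and the additive form of the reward are all assumptions made for illustration.

```python
import math
from typing import Callable, Sequence, Tuple

NOP = -1  # hypothetical encoding of the no-operation action


def fixed_point_layer(
    f: Callable[[float, float], float],
    x: float,
    z0: float = 0.0,
    tol: float = 1e-6,
    max_iter: int = 100,
) -> Tuple[float, int]:
    """Iterate z <- f(z, x) until two successive iterates differ by less than tol.

    Returns the final iterate and the number of iterations used.
    """
    z = z0
    for steps in range(1, max_iter + 1):
        z_next = f(z, x)
        if abs(z_next - z) < tol:
            return z_next, steps
        z = z_next
    return z, max_iter


def reward(
    correct: bool,
    elapsed: float,
    actions: Sequence[int],
    time_weight: float = 0.1,    # assumed weight on computational time
    repeat_weight: float = 0.5,  # assumed weight on redundant selections
) -> float:
    """Accuracy term minus a time cost and a penalty for repeated layer choices."""
    chosen = [a for a in actions if a != NOP]   # NOP actions are never penalized
    repeats = len(chosen) - len(set(chosen))    # selections beyond the first of each layer
    return float(correct) - time_weight * elapsed - repeat_weight * repeats


# Toy layer map: tanh(0.5 * z + x) is a contraction in z, so the iteration converges.
z_star, n_steps = fixed_point_layer(lambda z, x: math.tanh(0.5 * z + x), x=0.3)
# Layer 1 was selected twice, so one redundant selection is penalized.
r = reward(correct=True, elapsed=2.0, actions=[0, 1, 1, NOP])
```

In this sketch a harder input would simply need more iterations or more selected layers, which raises `elapsed` and lowers the reward; how the agent in the paper actually measures time and redundancy is not stated in the abstract.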

Submission Number: 42
