Keywords: Interpretability, Fixed points, Dynamic routing, Dynamic input processing, Deep Learning Framework
TL;DR: We introduce MIND, a model that dynamically adjusts computation based on input complexity using an Introspection Network. By emulating the brain's resource allocation, it outperforms traditional architectures in both efficiency and performance.
Abstract: While the human brain efficiently handles various computations with a limited number of neurons, traditional deep learning networks require a significant increase in parameters to improve performance.
Yet these parameters are used inefficiently: the networks apply the same amount of computation to all inputs of the same size, regardless of the input's complexity.
We address this inefficiency by introducing self-introspection capabilities into the network, enabling it to adjust the number of parameters used based on its internal representation of the task and to adapt its computation time to the task's complexity.
As a result, the network adaptively reuses parameters across tasks, dynamically matching its computational effort to the complexity of the input.
We demonstrate the effectiveness of this method on language modeling and computer vision tasks.
Notably, our model achieves 96.62\% accuracy on ImageNet with just a three-layer network, surpassing the much larger ResNet-50 and EfficientNet. When applied to a transformer architecture, the approach achieves 95.8\%/88.7\% F1 scores on the SQuAD v1.1/v2.0 datasets at negligible additional parameter cost.
These results showcase the potential of dynamic, reflective computation for building intelligent systems that manage resources efficiently according to the complexity of the input data.
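To make the mechanism concrete, below is a minimal sketch of what introspection-gated adaptive computation could look like. This is an illustrative reconstruction based only on the abstract, not the authors' implementation: the names (IntrospectionNet, MINDBlock), the halting-style gate, and all hyperparameters are assumptions, and the gating follows a standard adaptive-computation-time pattern rather than anything confirmed by the submission.

```python
# Hypothetical sketch of introspection-gated adaptive computation (PyTorch).
# All names and mechanisms here are assumptions inferred from the abstract.
import torch
import torch.nn as nn

class IntrospectionNet(nn.Module):
    """Scores the current hidden state to decide whether to keep computing."""
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # Probability that the representation is "finished" (halting signal).
        return torch.sigmoid(self.score(h))

class MINDBlock(nn.Module):
    """Reuses one shared layer for a variable number of steps per input."""
    def __init__(self, dim: int, max_steps: int = 8, threshold: float = 0.99):
        super().__init__()
        self.layer = nn.Sequential(nn.Linear(dim, dim), nn.GELU())  # shared weights
        self.introspect = IntrospectionNet(dim)
        self.max_steps = max_steps
        self.threshold = threshold

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        halted = torch.zeros(h.size(0), 1, device=h.device)  # accumulated halt mass
        out = torch.zeros_like(h)
        for _ in range(self.max_steps):
            h = self.layer(h)                      # same parameters reused each step
            p = self.introspect(h) * (1 - halted)  # halt mass spent this step
            out = out + p * h                      # weighted mixture of step outputs
            halted = halted + p
            if bool((halted > self.threshold).all()):
                break                              # easy inputs stop early
        return out + (1 - halted) * h              # leftover mass goes to last state

# Usage: a batch of 4 inputs with 64 features each.
block = MINDBlock(dim=64)
y = block(torch.randn(4, 64))
print(y.shape)  # torch.Size([4, 64])
```

Under these assumptions, inputs that the introspection gate judges easy spend their halting mass within a pass or two, while harder inputs iterate the same shared layer up to the step budget, which is one plausible way to obtain input-dependent computation from a fixed parameter count.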
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9112