TL;DR: ELENAS combines elementary mathematical operations into novel building blocks for deep neural networks that achieve strong generalization.
Abstract: Deep neural networks typically rely on a few key building blocks such as feed-forward, convolution, recurrent, long short-term memory, or attention blocks. On an elementary level, these blocks consist of a relatively small number of different mathematical operations. However, as the number of possible combinations of these operations is immense, crafting novel building blocks requires profound expert knowledge and remains far from fully explored. We propose Elementary Neural Architecture Search (ELENAS), a method that learns to combine elementary mathematical operations into new building blocks for deep neural networks. These building blocks are represented as computational graphs, which are processed by graph neural networks as part of a reinforcement learning system. Our approach contrasts with the current research direction of Neural Architecture Search, which mainly focuses on designing neural networks by altering and combining a few already established building blocks. In a set of experiments, we demonstrate that our method yields efficient building blocks that achieve strong generalization and transfer well to real-world data. When stacked together, they approach and even outperform state-of-the-art neural networks on several prediction tasks. Our underlying methodological framework offers high flexibility and broad applicability across domains while incurring relatively low computational cost. Consequently, it has the potential to find novel building blocks that become of general importance for machine learning practitioners beyond specific data or use cases.
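The abstract describes candidate building blocks as computational graphs over elementary mathematical operations. The following is a minimal, hypothetical sketch of that representation, assuming a toy scalar setting: the operation set, the dictionary-based graph encoding, and the `evaluate` helper are illustrative assumptions, not ELENAS's actual implementation or API.

```python
# Hypothetical sketch: a candidate building block represented as a DAG of
# elementary operations, evaluated on a scalar input. In ELENAS such graphs
# would be processed by graph neural networks inside a reinforcement
# learning loop; here we only show the graph representation itself.
import math

# A small, illustrative set of elementary operations (an assumption).
OPS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "tanh": lambda a: math.tanh(a),
    "relu": lambda a: max(0.0, a),
}

def evaluate(graph, x):
    """Evaluate a DAG given as {node: (op_name, [input_nodes])}.

    The node "x" is the graph input; the node "out" is the graph output.
    Intermediate values are memoized so shared subgraphs are computed once.
    """
    values = {"x": x}

    def value(node):
        if node not in values:
            op, inputs = graph[node]
            values[node] = OPS[op](*(value(i) for i in inputs))
        return values[node]

    return value("out")

# Toy candidate block: out = relu(x) * tanh(x), a gated-activation pattern.
block = {
    "a": ("relu", ["x"]),
    "b": ("tanh", ["x"]),
    "out": ("mul", ["a", "b"]),
}
print(evaluate(block, 1.0))
```

A search procedure would then mutate or sample such graphs and score them by the validation performance of networks built from the resulting block.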
Keywords: AutoML, NAS, Neural Architecture Search, ELENAS, Elementary Neural Architecture Search, Elementary Mathematical Operations, Deep Learning
Submission Checklist: Yes
Broader Impact Statement: Yes
Paper Availability And License: Yes
Code Of Conduct: Yes
Steps For Environmental Footprint Reduction During Development: Our method uses synthetic datasets during the search process to minimize the environmental footprint. In addition, our implementation is efficient and includes early-stopping techniques that terminate the search process in the event of undesirable behavior, further reducing computational costs. Moreover, the resulting building blocks are parameter-efficient and thus require fewer computational resources when deployed.
CPU Hours: 400
GPU Hours: 400
TPU Hours: 0
Evaluation Metrics: No