Automatic Design of LSTM Networks with Skip Connections through Evolutionary and Differentiable Architecture Search
Abstract: The long short-term memory (LSTM) network is a popular deep learning model with a wide range of applications. Skip connections are a promising and important architectural innovation that can noticeably improve the performance of LSTM networks on complex machine learning tasks with long-term temporal/spatial dependencies. However, manually designing skip connections in deep LSTM networks is difficult; for example, iteratively analysing candidate skip connections is impractical and scales poorly with network depth. This paper proposes a new approach based on a genetic algorithm (GA) and differentiable architecture search to automatically design LSTM networks with suitable skip connections. To allow an LSTM network and its skip connections to be designed jointly, we relax the search space of skip connections to be continuous. Consequently, LSTM network architectures with appropriate skip connections can be optimized directly through gradient-based network training, and the cost of designing skip connections is kept low. Experimental results obtained on various classification and regression tasks show that the proposed algorithm excels at evolving LSTM networks that significantly outperform several state-of-the-art models.
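The following is a minimal sketch, not the authors' implementation, of the continuous relaxation idea described in the abstract: candidate skip connections between stacked LSTM layers are weighted by learnable architecture parameters (softmax-normalised), so the skip-connection pattern can be optimised jointly with the network weights by gradient descent. All class, parameter, and variable names below are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RelaxedSkipLSTM(nn.Module):
    """Stacked LSTM whose skip connections are relaxed to a continuous mixture."""

    def __init__(self, input_size, hidden_size, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.LSTM(input_size if i == 0 else hidden_size,
                    hidden_size, batch_first=True)
            for i in range(num_layers)
        ])
        # alpha[i, j]: architecture parameter for a candidate skip connection
        # from the output of layer j to the input of layer i (j < i - 1).
        self.alpha = nn.Parameter(1e-3 * torch.randn(num_layers, num_layers))

    def forward(self, x):
        outputs = []
        h = x
        for i, lstm in enumerate(self.layers):
            if i >= 2:
                # Softmax over earlier layers yields a continuous mixture of
                # candidate skip connections feeding layer i, so alpha can be
                # trained by ordinary backpropagation.
                weights = F.softmax(self.alpha[i, : i - 1], dim=0)
                skip = sum(w * outputs[j] for j, w in enumerate(weights))
                h = h + skip
            h, _ = lstm(h)
            outputs.append(h)
        return h


if __name__ == "__main__":
    model = RelaxedSkipLSTM(input_size=8, hidden_size=16, num_layers=4)
    y = model(torch.randn(2, 20, 8))   # (batch, time, features)
    print(y.shape)                     # torch.Size([2, 20, 16])
```

After training, a discrete architecture could be derived by keeping only each layer's strongest skip connection (the argmax over its alpha row); how the paper combines this relaxation with the GA component is not specified in the abstract.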