Activation by Interval-wise Dropout: A Simple Way to Prevent Neural Networks from Plasticity Loss

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: We introduce AID (Activation by Interval-wise Dropout), a novel method inspired by Dropout, designed to address plasticity loss by applying a different dropout probability to each preactivation interval.
Abstract: Plasticity loss, a critical challenge in neural network training, limits a model's ability to adapt to new tasks or shifts in data distribution. While widely used techniques like L2 regularization and Layer Normalization have proven effective in mitigating this issue, Dropout remains notably ineffective. This paper introduces AID (Activation by Interval-wise Dropout), a novel method inspired by Dropout, designed to address plasticity loss. Unlike Dropout, AID generates subnetworks by applying Dropout with different probabilities on each preactivation interval. Theoretical analysis reveals that AID regularizes the network, promoting behavior analogous to that of deep linear networks, which do not suffer from plasticity loss. We validate the effectiveness of AID in maintaining plasticity across various benchmarks, including continual learning tasks on standard image classification datasets such as CIFAR10, CIFAR100, and TinyImageNet. Furthermore, we show that AID enhances reinforcement learning performance in the Arcade Learning Environment benchmark.
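The abstract's core mechanism, applying Dropout with a different probability on each preactivation interval, can be illustrated with a minimal sketch. Note this is an assumption-laden illustration, not the paper's exact formulation: the two-interval split at zero, the probability values, and the function name `aid_activation` are all hypothetical choices made here for clarity.

```python
import numpy as np

def aid_activation(preact, p_neg=0.8, p_pos=0.1, rng=None):
    """Illustrative interval-wise dropout (hypothetical sketch, not the
    paper's exact method): drop negative preactivations with probability
    p_neg and nonnegative ones with probability p_pos."""
    rng = np.random.default_rng() if rng is None else rng
    drop_p = np.where(preact < 0, p_neg, p_pos)  # per-element drop probability
    mask = rng.random(preact.shape) >= drop_p    # keep with probability 1 - drop_p
    return preact * mask / (1.0 - drop_p)        # inverted-dropout rescaling

x = np.array([-1.0, 2.0, -3.0, 4.0])
print(aid_activation(x, rng=np.random.default_rng(0)))
```

Under this sketch, setting `p_neg == p_pos` recovers something like standard Dropout, while pushing `p_neg` toward 1 and `p_pos` toward 0 approaches a ReLU-like activation; the interesting regimes lie in between, where every unit retains a nonzero chance of passing gradient.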
Lay Summary: When we train AI models, we want them to keep learning new things over time. But often, after learning something new, they become less able to adapt—this is called “loss of plasticity”. Our work introduces a simple method called AID (Activation by Interval-wise Dropout), which helps neural networks stay flexible and open to new knowledge, even as they continue to train. AID builds on a popular method called Dropout, which helps prevent overfitting by randomly turning off some parts of the network during training. However, AID improves on this idea by turning off different parts of the network based on certain conditions, rather than purely at random. It encourages the network to behave more like a linear system, which past research shows is naturally good at staying adaptable. We tested AID on standard image recognition problems and on tasks where the data or targets change over time, like classifying images in CIFAR10, CIFAR100, and TinyImageNet. We also tried it in reinforcement learning, where AI agents learn in non-stationary environments. Across these challenges, AID helped models maintain their ability to learn and adjust, showing promise as an easy way to improve the long-term adaptability of AI systems.
Primary Area: Deep Learning
Keywords: loss of plasticity, plasticity, continual learning, reinforcement learning
Submission Number: 14897