TL;DR: We introduce NeuronTune, which mitigates spurious bias in neural networks by adjusting biased neurons without needing external annotations.
Abstract: Deep neural networks often develop spurious bias: a reliance on correlations between non-essential features and classes for predictions. For example, a model may identify objects based on frequently co-occurring backgrounds rather than intrinsic features, resulting in degraded performance on data lacking these correlations. Existing mitigation approaches typically depend on external annotations of spurious correlations, which may be difficult to obtain and may not reflect the spurious bias actually present in a model. In this paper, we take a step towards self-guided mitigation of spurious bias by proposing NeuronTune, a post hoc method that directly intervenes in a model's internal decision process. Our method probes a model's latent embedding space to identify and regulate neurons that lead to spurious prediction behaviors. We theoretically justify our approach and show that it brings the model closer to an unbiased one. Unlike previous methods, NeuronTune operates without requiring spurious correlation annotations, making it a practical and effective tool for improving model robustness. Experiments across different architectures and data modalities demonstrate that our method significantly mitigates spurious bias in a self-guided way.
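To make the abstract's idea of "identify and regulate neurons" concrete, here is a minimal illustrative sketch, not the paper's actual algorithm: it uses a simple hypothetical heuristic (flagging neurons with high within-class activation variance, on the intuition that a neuron encoding a class-irrelevant attribute varies across samples of the same class) and then attenuates those neurons in the embedding before classification. All function names and the selection criterion here are assumptions for illustration; see the linked repository for the real method.

```python
import numpy as np

def identify_spurious_neurons(acts, labels, top_k=1):
    """Flag candidate spurious neurons in an embedding.

    acts:   (n_samples, n_neurons) latent activations
    labels: (n_samples,) class labels
    Heuristic (illustrative only): a neuron whose activation varies
    strongly *within* each class may track an attribute that is not
    essential to the class (e.g., background rather than object).
    """
    within_var = np.zeros(acts.shape[1])
    for c in np.unique(labels):
        within_var += acts[labels == c].var(axis=0)
    # Return indices of the top_k highest within-class-variance neurons.
    return np.argsort(within_var)[-top_k:]

def attenuate(acts, neuron_idx, scale=0.0):
    """Dampen the flagged neurons before the classifier head."""
    out = acts.copy()
    out[:, neuron_idx] *= scale
    return out

# Toy example: neuron 2 fluctuates within each class (a spurious cue),
# while neuron 0 cleanly separates the classes.
acts = np.array([[1.0, 0.0,  5.0],
                 [1.0, 0.0, -5.0],
                 [2.0, 0.0,  5.0],
                 [2.0, 0.0, -5.0]])
labels = np.array([0, 0, 1, 1])
flagged = identify_spurious_neurons(acts, labels, top_k=1)
cleaned = attenuate(acts, flagged)
```

In this toy setup, neuron 2 is flagged and zeroed out, while the class-discriminative neuron 0 is left untouched; a real intervention would likely use a softer scaling and a statistically grounded selection criterion.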
Lay Summary: Modern AI models, like deep neural networks, can sometimes make decisions based on the wrong cues, such as focusing on a background instead of the object itself. This kind of prediction behavior, known as spurious bias, can lead to poor performance when the usual patterns don't appear. Fixing this problem often requires extra information about what the model is doing wrong, which is not always available or reliable. In our work, we introduce NeuronTune, a method that helps models correct these biases on their own. Instead of relying on outside labels, NeuronTune looks inside the model to find the parts responsible for biased decisions and adjusts them directly. We provide theoretical support for our method and show that it makes models fairer and more reliable. Tested across different types of data and models, NeuronTune effectively reduces spurious bias without needing extra human guidance.
Link To Code: https://github.com/gtzheng/NeuronTune
Primary Area: Deep Learning->Robustness
Keywords: spurious correlation, robustness, bias mitigation
Submission Number: 12215