Log RGB Images Provide Invariance to Intensity and Color Balance Variation for Convolutional Networks
Abstract: The interaction of light and matter follows physical rules that have been well-modeled in the vision community. These rules should be available to deep networks when learning vision tasks. However, typical signal processing pipelines, conversion to sRGB, and JPEG compression break the rules and make them unavailable for learning. This, in turn, makes color and intensity unreliable as features and more difficult to use. Using linear or log RGB images that preserve the rules of the physics of reflection should make certain visual tasks simpler to learn and increase robustness to certain types of visual variation.
We demonstrate that using linear RGB or log RGB improves the performance of a deep network on an image classification task when the same network architecture is trained on the same images but in different formats. Furthermore, the linear and log RGB networks are more robust to intensity and color balance variation. In particular, the network trained on log RGB inputs shows invariance to intensity and color balance variation when that variation is not included in the training set, while the network trained on the same images in JPEG format shows severe reductions in performance. We further explore why this difference exists by visualizing low-level features in log RGB, linear RGB, and JPEG data and show that log space preserves certain types of features across intensity and color balance variation.
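One way to see why log space can preserve low-level features under intensity and color balance changes: a per-channel gain on linear RGB becomes an additive offset after taking the logarithm, so differences between neighboring pixels are unchanged. The following is a minimal illustrative sketch, not the paper's pipeline. The synthetic image, the gain values, and the helper names `log_rgb` and `horizontal_diff` are assumptions made for this example, and it ignores sensor noise, clipping, and saturation.

```python
import numpy as np

def log_rgb(linear, eps=1e-6):
    """Map linear RGB (positive floats) to log space; eps guards against log(0)."""
    return np.log(np.maximum(linear, eps))

def horizontal_diff(img):
    """Difference between horizontally adjacent pixels, per channel."""
    return img[:, 1:, :] - img[:, :-1, :]

rng = np.random.default_rng(0)
# Synthetic stand-in for a linear RGB image (height 4, width 5, 3 channels).
linear = rng.uniform(0.05, 1.0, size=(4, 5, 3))

# Hypothetical per-channel gains modelling an intensity and color balance change.
gain = np.array([0.5, 0.8, 1.3])
scaled = linear * gain

# In log space the gain is a constant offset per channel, so it cancels in differences.
log_edges_same = np.allclose(
    horizontal_diff(log_rgb(linear)), horizontal_diff(log_rgb(scaled))
)
# In linear space the same differences are rescaled by the gain.
linear_edges_same = np.allclose(horizontal_diff(linear), horizontal_diff(scaled))

print("log-space differences unchanged:", log_edges_same)
print("linear-space differences unchanged:", linear_edges_same)
```

The simple difference here stands in for any filter whose weights sum to zero within each channel. Whether a trained network actually relies on such filters is an empirical question that this sketch does not address.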