Discriminating feature ratio: Introducing metric for uncovering vulnerabilities in deep convolutional neural networks

Tomasz Szandała, Henryk Maciejewski

Published: 24 Oct 2024 | Last Modified: 20 May 2026 | OpenReview Archive Direct Upload | Everyone | CC BY-NC 4.0

Abstract: Rapid advancement in Machine Learning (ML) and Artificial Intelligence (AI) has brought increased attention to the potential vulnerability and reliability of AI technologies. This paper identifies the threat of a network incorrectly relying on latent features, a flaw that can remain undetected during validation yet cause severe issues in real-life applications and leave the model susceptible to adversarial attacks. Furthermore, we propose a semi-automated method to identify this hazard, relying on the Gradual Extrapolation technique to sharpen saliency maps. The method combines well-known techniques: object detection tools and a saliency map formula, joined with an enhancement that highlights the vital object rather than its surrounding area. The proposed method introduces a new metric, the “Discriminating Feature Ratio” (DFR), to assess the extent to which a model learns only a fraction of an object, such as a specific feature instead of the whole object. By detecting classes with outlying DFRs, the method highlights potential cases where the model’s understanding is limited to certain features, prompting further scrutiny and improved model reliability. The study on ImageNet-trained models demonstrates the effectiveness of the DFR in identifying problematic classes and potential vulnerabilities, even in state-of-the-art networks. Overall, the introduced method provides valuable insights into the decision-making processes of deep neural networks, uncovering classes with spurious or latent feature reliance and potential vulnerabilities. It can serve as a simple validation mechanism, similar to accuracy, precision, and recall, for models trained on large-scale datasets. By enhancing the transparency and interpretability of deep learning models, this research contributes to improving the robustness and reliability of CNNs in real-world applications. Datasets and code are publicly available on GitHub: https://github.com/szandala/Latent-Features-Detection
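
The abstract does not state the DFR formula, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a hypothetical per-image ratio (the share of a detected object's bounding box covered by strongly salient pixels) and uses Tukey's interquartile-range fences as one possible way to flag classes with outlying values. The names `salient_fraction` and `outlying_classes`, the `threshold` parameter, and the fence multiplier `k` are all illustrative assumptions.

```python
import numpy as np


def salient_fraction(saliency, box, threshold=0.5):
    """Hypothetical stand-in for a per-image ratio (NOT the paper's DFR formula).

    Returns the share of pixels inside the bounding box whose saliency is at
    least `threshold` times the map's peak value.
    `box` is (x0, y0, x1, y1) in pixel coordinates, end-exclusive.
    """
    x0, y0, x1, y1 = box
    region = saliency[y0:y1, x0:x1]
    if region.size == 0:
        raise ValueError("empty bounding box")
    peak = saliency.max()
    if peak <= 0:
        return 0.0  # a flat map highlights nothing
    return float((region >= threshold * peak).mean())


def outlying_classes(ratios_by_class, k=1.5):
    """Flag classes whose mean ratio lies outside Tukey's IQR fences.

    The outlier rule is an assumption; the abstract only says that classes
    with outlying DFRs are singled out for further scrutiny.
    """
    names = list(ratios_by_class)
    means = np.array([np.mean(ratios_by_class[name]) for name in names])
    q1, q3 = np.percentile(means, [25, 75])
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [name for name, m in zip(names, means) if m < low or m > high]


# Toy data: a 10x10 saliency map with a small 2x2 hot spot.
toy_map = np.zeros((10, 10))
toy_map[2:4, 2:4] = 1.0
whole_image_ratio = salient_fraction(toy_map, (0, 0, 10, 10))

# Toy per-class ratios; class "e" relies on a much smaller part of the object.
toy_ratios = {"a": [0.50], "b": [0.52], "c": [0.48], "d": [0.51], "e": [0.05]}
flagged = outlying_classes(toy_ratios)
```

In this toy setup, a class whose saliency concentrates on a small part of the object gets a much lower ratio than the others and is flagged for manual inspection, which mirrors the semi-automated workflow the abstract describes.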