Generating Samples to Probe Trained Models

ICLR 2026 Conference Submission 14621 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: Model Understanding, Data Generation

TL;DR: To explain trained models, we pose questions in the form of functions on the data space; the answers are generated data points.

Abstract: There is a growing need to investigate how machine learning models operate. In this work, we aim to understand trained machine learning models by questioning their data preferences. We propose a mathematical framework for probing trained models and identifying their preferred samples in various scenarios, including prediction-risky, parameter-sensitive, and model-contrastive samples. To showcase the framework, we pose these queries to a variety of models trained on classification and regression tasks, and receive answers in the form of generated data.

Primary Area: interpretability and explainable AI

Submission Number: 14621