Keywords: Dataset, Explainable AI, Evaluation
TL;DR: We introduce FunnyNodules, a synthetic vision dataset inspired by medical image interpretation, designed for the systematic evaluation of AI models and explainability methods.
Abstract: Densely annotated medical image datasets that capture not only diagnostic labels but also the reasoning behind these diagnoses are scarce. Such reasoning-related annotations are essential for developing and evaluating explainable AI (xAI) models that reason similarly to radiologists: making correct predictions for the right reasons. To address this gap, we introduce FunnyNodules, a fully parameterized synthetic dataset designed for systematic analysis of attribute-based reasoning in medical AI models. The dataset generates abstract, lung nodule–like shapes with controllable visual attributes such as roundness, margin sharpness, and spiculation. The target class is derived from a predefined combination of attributes, allowing full control over the decision rule that links attributes to the diagnostic class.
We demonstrate how FunnyNodules can be used in model-agnostic evaluations to assess whether models learn correct attribute–target relations, to interpret over- or underperformance in attribute prediction, and to analyze attention alignment with attribute-specific regions of interest.
The framework is fully customizable, supporting variations in dataset complexity, target definitions, class balance, and beyond.
With complete ground truth information, FunnyNodules provides a versatile foundation for developing, benchmarking, and conducting in-depth analyses of explainable AI methods in medical image analysis.
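As an illustration of the attribute-to-target idea described in the abstract, the following is a minimal sketch and not the code from the linked repository. The 1 to 5 ordinal attribute scale, the names `NoduleAttributes`, `target_rule`, `make_dataset`, and `rule_consistency`, and the specific decision rule are all assumptions made for this example; image rendering is omitted.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class NoduleAttributes:
    # Hypothetical ordinal scores in 1..5; the real dataset's scales may differ.
    roundness: int
    margin_sharpness: int
    spiculation: int


def sample_attributes(rng: random.Random) -> NoduleAttributes:
    """Draw each attribute independently and uniformly from 1..5."""
    return NoduleAttributes(
        roundness=rng.randint(1, 5),
        margin_sharpness=rng.randint(1, 5),
        spiculation=rng.randint(1, 5),
    )


def target_rule(a: NoduleAttributes) -> int:
    """Assumed decision rule: class 1 if the shape is strongly spiculated
    with unsharp margins, or if it is clearly non-round; otherwise class 0."""
    return int((a.spiculation >= 4 and a.margin_sharpness <= 2) or a.roundness <= 1)


def make_dataset(n: int, seed: int = 0):
    """Return (attributes, target) pairs; the target follows the rule exactly."""
    rng = random.Random(seed)
    samples = [sample_attributes(rng) for _ in range(n)]
    return [(a, target_rule(a)) for a in samples]


def rule_consistency(pred_attrs, pred_targets) -> float:
    """Fraction of samples whose predicted target equals the rule applied
    to the model's own predicted attributes."""
    agree = sum(target_rule(a) == t for a, t in zip(pred_attrs, pred_targets))
    return agree / len(pred_targets)
```

Because the rule is known, a model's predicted attributes and predicted target can be checked against each other: a score of 1.0 from `rule_consistency` would mean every target prediction agrees with the rule applied to that model's attribute predictions.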
Primary Subject Area: Interpretability and Explainable AI
Secondary Subject Area: Image Synthesis
Registration Requirement: Yes
Reproducibility: https://github.com/XRad-Ulm/FunnyNodules
Visa & Travel: No
Read CFP & Author Instructions: Yes
Originality Policy: Yes
Single-blind & Not Under Review Elsewhere: Yes
LLM Policy: Yes
Submission Number: 16