Analyzing Neural Style Representations for Unsupervised Clustering: Visual Art as a Testbed

ICLR 2026 Conference Submission12709 Authors

18 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Style-based clustering, Neural representations for style, Visual art
TL;DR: How well do neural networks capture artistic style? We present the first comprehensive study of style representations for unsupervised artwork clustering.
Abstract: Neural networks are widely claimed to capture artistic style, but it remains unclear whether their representations can organize artworks in unsupervised settings, or which aspects of style they truly encode. We present the first comprehensive analysis of neural style representations for unsupervised clustering of visual artworks. Our study systematically compares representations derived from classification networks, generative models, diffusion architectures, and vision-language systems, including our novel language-based features. Using both real-world artwork collections and synthetically curated datasets, we evaluate how effectively these representations capture style across multiple definitions. Our results show that specialized style representations consistently outperform generic embeddings, yet no single representation works across all style definitions. This variability reflects the inherent ambiguity of “style” itself, revealing a gap between human perception, art-historical categories, and machine-learned features. Taken together, our findings position visual art as a rigorous testbed for advancing unsupervised representation learning, with broader implications for digital curation, cultural heritage, and style-aware computer vision.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 12709