Keywords: Direct Latent-Space Classification, Variational Autoencoder (VAE), Learned Image Compression, Hyperprior Side-Information, High-Throughput Screening, Representation Learning
TL;DR: This empirical analysis demonstrates that fusing hyperprior side-information into latent-space classifiers decreases accuracy and increases latency. At the studied compression level, primary latents are semantically saturated, making fusion redundant.
Abstract: Executing computer vision tasks directly within the compressed latent space of variational autoencoders (VAEs) offers significant computational advantages by bypassing the decompression bottleneck. In this paper, we investigate the semantic utility of hierarchical hyperpriors—traditionally used for spatial entropy estimation—as a side-information gating mechanism for direct latent-space classification. Utilizing a balanced 100,000-image subset of the AGAR microbial dataset, we demonstrate that a baseline Latent-ResNet operating strictly on primary latents achieves a mean Top-1 accuracy of 96.32\%, closely trailing a pixel-space EfficientNet-B0 (97.13\%). Contrary to theoretical intuition, our proposed Fusion-Gated Hyperprior architecture yields a slight performance degradation (95.57\%) alongside increased total system latency. This empirical ablation study suggests that at the specific compression fidelity of Quality Level 3, primary latent representations are semantically saturated for structural classification tasks, rendering hyperprior variance data redundant and mildly noisy. These findings provide bounded system-design parameters for deploying latency-optimized inference pipelines on pre-compressed data arrays.
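The abstract's gating idea can be illustrated with a minimal sketch. The paper does not specify the fusion architecture, so the function below (`fusion_gate`, with per-channel parameters `w`, `b`) is a hypothetical NumPy rendering of one common design: the hyperprior-derived scale map, upsampled to the latent grid, produces a sigmoid gate that modulates the primary latents before they reach the classifier head.

```python
import numpy as np

def sigmoid(x):
    # Numerically plain logistic; fine for illustration.
    return 1.0 / (1.0 + np.exp(-x))

def fusion_gate(y, z_scale, w, b):
    """Gate primary latents with hyperprior side-information (hypothetical sketch).

    y       : (C, H, W) primary latents from the VAE encoder
    z_scale : (C, H, W) hyperprior scale (std-dev) map, already upsampled
              to the latent grid
    w, b    : (C,) per-channel gate weight and bias (learned in practice)
    """
    gate = sigmoid(w[:, None, None] * z_scale + b[:, None, None])  # values in (0, 1)
    return y * gate  # gated latents fed to the latent-space classifier

# Toy usage with random tensors standing in for real latents.
rng = np.random.default_rng(0)
y = rng.standard_normal((4, 8, 8))
z_scale = np.abs(rng.standard_normal((4, 8, 8)))  # scales are non-negative
fused = fusion_gate(y, z_scale, np.ones(4), np.zeros(4))
```

Because the gate is bounded in (0, 1), fusion can only attenuate latent activations; if the primary latents are already semantically saturated, as the abstract argues, this modulation adds noise rather than signal.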
Submission Number: 100