Scalable Local Intrinsic Dimension Estimation with Diffusion Models

Published: 17 Jun 2024, Last Modified: 12 Jul 2024
Venue: ICML 2024 Workshop GRaM
License: CC BY 4.0
Track: Extended abstract
Keywords: Intrinsic dimension estimation, diffusion models, manifold hypothesis
TL;DR: We leverage the Fokker-Planck equation to construct the first highly scalable estimator of local intrinsic dimension using diffusion models.
Abstract: High-dimensional data commonly lies on low-dimensional submanifolds, and estimating the local intrinsic dimension (LID) of a datum is a longstanding problem. LID can be understood as the number of local factors of variation: the more factors of variation a datum has, the more complex it tends to be. Estimating this quantity has proven useful in contexts ranging from generalization in neural networks to detection of out-of-distribution data, adversarial examples, and AI-generated text. While many estimation techniques exist, all of them are either inaccurate or fail to scale. In this work, we show that the Fokker-Planck equation associated with a diffusion model can provide the first LID estimator that scales to high-dimensional data while outperforming existing baselines on LID estimation benchmarks.
Submission Number: 76
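
Illustration: to make the abstract's notion of LID as "the number of local factors of variation" concrete, the sketch below estimates LID on toy data with local PCA. This is a classical baseline and not the diffusion-based estimator proposed in the paper. The function name `local_pca_lid`, the neighbourhood size `k`, and the tolerance `rel_tol` are assumptions chosen for this example, and the data is noise-free, so counting the non-negligible singular values is enough.

```python
import numpy as np

def local_pca_lid(data, query, k=50, rel_tol=1e-6):
    """Hypothetical helper: estimate LID at `query` with PCA on its k nearest neighbours."""
    # Find the k points closest to the query in the ambient space.
    dists = np.linalg.norm(data - query, axis=1)
    neighbours = data[np.argsort(dists)[:k]]
    # Centre the neighbourhood and compute its singular values.
    centred = neighbours - neighbours.mean(axis=0)
    s = np.linalg.svd(centred, compute_uv=False)
    # Count directions whose spread is non-negligible relative to the largest one.
    return int(np.sum(s > rel_tol * s[0]))

# Toy data: a 2-dimensional plane embedded in 10-dimensional ambient space.
rng = np.random.default_rng(0)
ambient_dim, intrinsic_dim, n = 10, 2, 2000
basis, _ = np.linalg.qr(rng.normal(size=(ambient_dim, intrinsic_dim)))
latent = rng.normal(size=(n, intrinsic_dim))
data = latent @ basis.T  # shape (n, 10); every point lies on the plane

lid = local_pca_lid(data, data[0])
print(lid)
```

On real, noisy data the result depends strongly on `k` and on the threshold, and the neighbour search becomes costly in high dimensions, which is in line with the accuracy and scaling limitations the abstract attributes to existing estimators.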