How Learnable Grids Recover Fine Detail in Low Dimensions: A Neural Tangent Kernel Analysis of Multigrid Parametric Encodings
Keywords: neural tangent kernel, computer graphics, scientific computing, fourier feature encodings, multigrid parametric encodings, encodings
TL;DR: We explore the exceptional performance of multigrid parametric encodings, commonly found in computer graphics, through the lens of the neural tangent kernel (NTK).
Abstract: Neural networks that map between low-dimensional spaces are ubiquitous in
computer graphics and scientific computing; however, in their naive
implementation, they are unable to learn high-frequency information. We present
a comprehensive analysis comparing the two most common techniques for mitigating
this spectral bias: Fourier feature encodings (FFE) and multigrid parametric
encodings (MPE). FFEs are seen as the standard for low-dimensional mappings, but
MPEs often outperform them and learn representations with higher resolution and
finer detail. FFE's roots in the Fourier transform make it susceptible to
aliasing if pushed too far, while MPEs, which use a learned grid structure, have
no such limitation. To understand the difference in performance, we use the
neural tangent kernel (NTK) to evaluate these encodings through the lens of an
analogous kernel regression. By finding a lower bound on the smallest eigenvalue
of the NTK, we prove that MPEs improve a network's performance through the
structure of their grid and not their learnable embedding. This mechanism is
fundamentally different from FFEs, which rely solely on their embedding space to
improve performance. Results are empirically validated on a 2D image regression
task using images taken from 100 synonym sets of ImageNet and 3D implicit
surface regression on objects from the Stanford graphics dataset. We show that
the MPE increases the minimum eigenvalue of the NTK by 8 orders of magnitude over
the baseline and 2 orders of magnitude over the FFE. Measured with peak
signal-to-noise ratio (PSNR) and multiscale structural similarity (MS-SSIM),
which evaluate how well fine details are learned, this increase in the spectrum
corresponds to a 15 dB (PSNR) / 0.65 (MS-SSIM) improvement over the baseline and
a 12 dB (PSNR) / 0.33 (MS-SSIM) improvement over the FFE.
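To make the contrast between the two encodings concrete, the following is a minimal one-dimensional sketch, not the paper's implementation. It assumes a sinusoidal encoding with frequencies 2^k π for the FFE and, for the MPE, a stack of uniform grids on [0, 1] whose scalar node values are linearly interpolated. The function names, the frequency schedule, the scalar node values, and the example grids are all hypothetical choices made for illustration.

```python
import math


def fourier_features(x, num_freqs):
    """Fixed (non-learned) encoding: [sin(2^k*pi*x), cos(2^k*pi*x)] for k < num_freqs.

    The 2^k frequency schedule is an assumption for this sketch.
    """
    feats = []
    for k in range(num_freqs):
        angle = (2.0 ** k) * math.pi * x
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats


def multigrid_features(x, grids):
    """Grid-based encoding: linearly interpolate every grid level at x and concatenate.

    `grids` is a list of levels; each level is a list of node values placed
    uniformly on [0, 1]. In training, these node values would be learnable
    parameters; here they are plain floats (an assumption for brevity).
    """
    feats = []
    for nodes in grids:
        cells = len(nodes) - 1
        pos = min(max(x, 0.0), 1.0) * cells   # position in grid units
        i = min(int(pos), cells - 1)          # index of the left node
        t = pos - i                           # interpolation weight in [0, 1]
        feats.append((1.0 - t) * nodes[i] + t * nodes[i + 1])
    return feats


# Hypothetical two-level grid: a coarse level (3 nodes) and a fine level (5 nodes).
grids = [[0.0, 1.0, 0.0], [0.0, 0.5, 1.0, 0.5, 0.0]]
ffe = fourier_features(0.25, num_freqs=2)
mpe = multigrid_features(0.25, grids)
```

In this sketch the FFE output depends only on the input coordinate, whereas the MPE output depends on both the grid layout (which nodes surround the input) and the stored node values, which is the distinction the NTK analysis separates.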
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 10795