Keywords: Coreset Selection, Topological Data Analysis, Persistent Homology, Architectural Invariance, Data-Efficient Learning, Manifold Learning, Pretrained Models
Abstract: Geometry-based dataset pruning is often compromised by architectural variance in feature embeddings. We propose a solution grounded in topological invariance. It first standardizes the data's global manifold; a differentiable persistence-based optimizer then distills local sample importance from each point's corrective displacement. The resulting framework yields coresets that are robust to the geometric shifts between diverse pretrained models, so the same selection procedure applies across architectures.
Submission Number: 108