Foundation Models on a Budget: Approximating Blocks in Large Vision Models

Published: 10 Jun 2025, Last Modified: 15 Jul 2025, MOSS@ICML2025, CC BY 4.0
Keywords: latent representations, representation learning, neural network similarities, classification, foundation models, large models
TL;DR: We propose a framework to identify and approximate transformer blocks with simple transformations, reducing the overall parameter count while ensuring minimal impact on downstream task performance.
Abstract: Foundation models have shown impressive performance across various tasks and domains, yet they require massive computational resources, raising concerns about accessibility and sustainability. In this paper, we propose Transformer Blocks Approximation (TBA), a novel method that leverages intra-network similarities to identify and approximate transformer blocks in large vision models using only a small amount of training data. TBA replaces these blocks with lightweight, closed-form transformations, without any additional training steps. The proposed method reduces the number of parameters while having minimal impact on downstream task performance.
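The abstract does not spell out the form of the closed-form transformation, but one common way to approximate a block without gradient training is a ridge-regularized linear map fitted by least squares between the block's input and output activations. The sketch below illustrates that idea; the function name and the ridge parameter are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def fit_block_approximation(X, Y, ridge=1e-3):
    """Illustrative closed-form block substitute (not TBA's exact method).

    Fits W such that X @ W ~= Y via ridge-regularized least squares:
        W = (X^T X + ridge * I)^{-1} X^T Y
    X: (n_tokens, d) activations entering a transformer block
    Y: (n_tokens, d) activations leaving that block
    The ridge term keeps the solve stable when few samples are available.
    """
    d = X.shape[1]
    W = np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ Y)
    return W

# Toy demo: recover a known linear relation from a small sample,
# mimicking the "small amount of training data" setting.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 16))
W_true = rng.normal(size=(16, 16))
Y = X @ W_true
W_hat = fit_block_approximation(X, Y)
rel_err = np.linalg.norm(X @ W_hat - Y) / np.linalg.norm(Y)
```

A single d-by-d matrix of this kind has far fewer parameters than a full transformer block (which contains attention and MLP weights), which is the kind of trade-off the abstract describes.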
Code: zip
Submission Number: 59