Tiny Titans: Efficient Large Vision, Language and Multimodal Models Through Pruning

Carolina Tavares, Leandro Giusti Mugnaini, Gustavo Henrique do Nascimento, Ian Pons, Keith Ogawa, Guilherme Stern, Lucas Libanio, Aline Paes, Anna Helena Reali Costa, Artur Jordão

Published: 2025, Last Modified: 26 May 2026, Venue: SIBGRAPI 2025, License: CC BY-SA 4.0
Abstract: Notable progress in solving complex reasoning tasks relies on large models. Unfortunately, developing these models demands substantial computational resources and energy. As a result, industry drives the most significant advances in state-of-the-art models, which has drawn the scientific community's attention to the environmental impact of AI (GreenAI). Pruning emerges as an effective mechanism to address the trade-off between model capacity and computational cost by eliminating structures (weights, neurons or layers) from deep models. This tutorial introduces the theoretical and technical foundations of this promising, active and exciting field. It delves into pruning techniques as a pillar of GreenAI and a foundation for the next wave of efficient large vision, language, and multimodal models. Our tutorial also covers how existing forms of pruning affect efficiency gains, guiding participants to make informed choices for their scenario and infrastructure. Specifically, we equip participants with the basics and key recipes to effectively apply pruning in practical computer vision scenarios. Additional material is available at: github.com/arturjordao/TinyTitans