Keywords: ECG, Biosignals, Foundation Models, Layer-wise Compression, Adaptive Pruning, Quantization, Edge AI
Abstract: Foundation models for biosignals such as the ECG face deployment challenges in resource-constrained settings, for example wearable monitors, due to their high memory and computational demands. We propose an adaptive layer-wise compression framework that combines quantization and pruning to reduce model size while preserving predictive performance. Layer importance, estimated via parameter contribution and weight variance, guides fine-grained assignment of bit-widths and pruning thresholds, balancing efficiency and accuracy across high- and low-sensitivity layers. Experiments on the Chapman and CPSC ECG datasets show that our method consistently outperforms fixed global compression schemes, achieving up to 12.10x compression with no loss in performance. Our architecture-agnostic framework scales from lightweight residual networks to large foundation models, enabling real-time, low-resource ECG monitoring. By efficiently deploying foundation models on edge devices, this work advances scalable, physiology-aware biosignal AI for mobile health and clinical applications.
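A minimal sketch of the kind of layer-wise importance scoring and compression assignment the abstract describes, assuming a PyTorch model. The specific scoring formula (parameter share times weight variance) and the bit-width/pruning-ratio thresholds below are illustrative assumptions, not the authors' exact method.

```python
# Illustrative sketch: layer-wise importance scoring guiding per-layer
# bit-width and pruning-threshold assignment (assumed formula and thresholds).
import torch
import torch.nn as nn


def layer_importance(model: nn.Module) -> dict[str, float]:
    """Score each weight-bearing layer by parameter contribution and weight variance."""
    scores = {}
    total_params = sum(p.numel() for p in model.parameters())
    for name, module in model.named_modules():
        weight = getattr(module, "weight", None)
        if not isinstance(weight, torch.Tensor):
            continue
        param_share = weight.numel() / total_params        # parameter contribution
        variance = weight.detach().float().var().item()    # weight variance
        scores[name] = param_share * variance              # assumed combination
    return scores


def assign_compression(scores: dict[str, float]) -> dict[str, tuple[int, float]]:
    """Map normalized importance to (bit-width, pruning ratio): high-sensitivity
    layers keep more bits and prune less; low-importance layers are compressed harder."""
    max_score = max(scores.values())
    plan = {}
    for name, score in scores.items():
        s = score / max_score
        bits = 8 if s > 0.5 else 6 if s > 0.1 else 4           # assumed thresholds
        prune_ratio = 0.1 if s > 0.5 else 0.3 if s > 0.1 else 0.6
        plan[name] = (bits, prune_ratio)
    return plan


if __name__ == "__main__":
    # Small stand-in for a lightweight residual-style ECG network.
    model = nn.Sequential(
        nn.Conv1d(1, 16, 7), nn.ReLU(),
        nn.Conv1d(16, 32, 5), nn.ReLU(),
        nn.Linear(32, 4),
    )
    plan = assign_compression(layer_importance(model))
    for layer, (bits, prune) in plan.items():
        print(f"layer {layer}: {bits}-bit quantization, prune {prune:.0%} of weights")
```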
Submission Number: 61