Abstract: High-capacity CNN models trained on large datasets with strong data augmentation are known to be more robust to distribution shifts. However, in resource-constrained scenarios, such as embedded devices, deploying such large CNNs is not always feasible. Model compression techniques, such as distillation and pruning, reduce model size, but their robustness trade-offs are not well understood. In this work, we evaluate several distillation and pruning techniques to better understand their influence on out-of-distribution performance. We find that knowledge distillation and pruning, combined with data augmentation, transfer much of the robustness to smaller models.
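To make the combination of knowledge distillation and data augmentation concrete, the sketch below shows a standard soft-target distillation loop with augmented training data. The dataset (CIFAR-10), teacher (ResNet-18), student (SqueezeNet), and all hyperparameters are illustrative assumptions, not the paper's actual setup; a trained teacher checkpoint is assumed to be available.

```python
# Minimal sketch: knowledge distillation with data augmentation.
# Assumed setup (not from the paper): CIFAR-10, ResNet-18 teacher, SqueezeNet student.
import torch
import torch.nn.functional as F
from torch import optim
from torchvision import datasets, transforms, models

# Illustrative augmentation pipeline for the training set.
train_tf = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.4, 0.4, 0.4),
    transforms.ToTensor(),
])
train_set = datasets.CIFAR10("data", train=True, download=True, transform=train_tf)
loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

teacher = models.resnet18(num_classes=10)       # assumed: load a trained checkpoint here
student = models.squeezenet1_1(num_classes=10)  # small model intended for deployment
teacher.eval()

def distill_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.9):
    """KL divergence on temperature-softened teacher outputs plus cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1 - alpha) * hard

opt = optim.SGD(student.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
for x, y in loader:
    with torch.no_grad():
        t_logits = teacher(x)        # teacher sees the same augmented batch
    loss = distill_loss(student(x), t_logits, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the teacher and student see the same augmented inputs, the student is trained to match the teacher's behavior under perturbed views of the data, which is one plausible mechanism for the robustness transfer the abstract reports.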