---
title: "ZeRO stage 1 with reduced communication"
sneak_preview: true
tags: training ZeRO English
excerpt: "Partition-aware ZeRO with up to 2x reduction in communication time!"
---

* Partition-aware approach replaces the initial implementation, which used a global collective (all-reduce)
* Total communication volume reduced from 1.5x to 1x that of standard data parallelism
* Up to 2x reduction in communication time compared to all-reduce
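
The figures above can be checked with a rough back-of-the-envelope model. The sketch below is illustrative only: it assumes the standard ring cost model for collectives, and it assumes the partition-aware path uses a reduce-scatter for gradients followed by an all-gather of updated parameters, which this post does not spell out. All function names are hypothetical and are not part of the DeepSpeed API.

```python
# Rough per-rank communication volume model (in elements sent), assuming
# ring-based collectives. Every name here is hypothetical, for illustration.


def ring_reduce_scatter_volume(num_elements: int, world_size: int) -> float:
    """Elements each rank sends in a ring reduce-scatter."""
    return num_elements * (world_size - 1) / world_size


def ring_all_gather_volume(num_elements: int, world_size: int) -> float:
    """Elements each rank sends in a ring all-gather."""
    return num_elements * (world_size - 1) / world_size


def ring_all_reduce_volume(num_elements: int, world_size: int) -> float:
    """A ring all-reduce is a reduce-scatter followed by an all-gather."""
    return (ring_reduce_scatter_volume(num_elements, world_size)
            + ring_all_gather_volume(num_elements, world_size))


def data_parallel_volume(num_elements: int, world_size: int) -> float:
    """Baseline data parallelism: one all-reduce over all gradients."""
    return ring_all_reduce_volume(num_elements, world_size)


def zero1_all_reduce_volume(num_elements: int, world_size: int) -> float:
    """Assumed initial scheme: all-reduce gradients, then all-gather
    the updated parameter partitions."""
    return (ring_all_reduce_volume(num_elements, world_size)
            + ring_all_gather_volume(num_elements, world_size))


def zero1_partition_aware_volume(num_elements: int, world_size: int) -> float:
    """Assumed partition-aware scheme: reduce-scatter gradients so each rank
    receives only its own partition, then all-gather updated parameters."""
    return (ring_reduce_scatter_volume(num_elements, world_size)
            + ring_all_gather_volume(num_elements, world_size))


if __name__ == "__main__":
    n, w = 1_000_000, 8  # example model size and number of ranks
    baseline = data_parallel_volume(n, w)
    print("initial ZeRO-1 / data parallel:",
          zero1_all_reduce_volume(n, w) / baseline)
    print("partition-aware ZeRO-1 / data parallel:",
          zero1_partition_aware_volume(n, w) / baseline)
    print("all-reduce / reduce-scatter (gradient step):",
          ring_all_reduce_volume(n, w) / ring_reduce_scatter_volume(n, w))
```

Under these assumptions the gradient step moves half as much data as an all-reduce, which is consistent with the "up to 2x" bound on communication time. Measured speedups depend on the interconnect, message sizes, and overlap with computation.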

## Further updates coming soon!
