Abstract: Highlights
• Vision Transformer-based federated learning improves accuracy in heterogeneous settings.
• Multi-head attention alignment enhances fairness for underrepresented clients.
• Weighted averaging boosts performance, especially in highly heterogeneous environments.
• Vision Transformer approach reduces need for complex multi-level optimization in FL.