Revisiting the Activation Function for Federated Image Classification

Published: 01 Feb 2023, Last Modified: 13 Feb 2023, Submitted to ICLR 2023
Keywords: Federated Learning, Activation Function
TL;DR: We empirically observe that off-the-shelf activation functions used in the centralized setting yield a markedly different accuracy ranking in federated learning.
Abstract: Federated learning (FL) has become one of the most popular distributed machine learning paradigms, enabling training on a large corpus of decentralized data that resides on devices. Recent progress in FL research is mainly credited to refinements in training procedures, particularly the development of optimization methods. However, there has been little verification of other technical improvements, especially the activation functions (e.g., ReLU) that are widely used in the conventional centralized approach (i.e., standard data-centric optimization). In this work, we verify the effectiveness of activation functions in various federated settings. We empirically observe that off-the-shelf activation functions used in centralized settings exhibit a totally different performance trend in federated settings. The experimental results demonstrate that HardTanh achieves the best accuracy under severe data heterogeneity or a low participation rate. We provide a thorough analysis of why the representation power of activation functions changes in a federated setting, by measuring similarities in terms of weight parameters and representations. Lastly, we deliver guidelines for selecting activation functions in both the cross-silo setting (i.e., number of clients <= 20) and the cross-device setting (i.e., number of clients >= 100). We believe that our work provides benchmark data and intriguing insights for designing FL models.
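As a concrete illustration (not code from the paper), the comparison the abstract describes can be sketched with a model whose activation is swappable and a FedAvg-style averaging step; `nn.Hardtanh` is PyTorch's implementation of HardTanh. The `linear_cka` helper shows one common way to measure representation similarity (linear CKA); the paper's exact metric, model, and hyperparameters are assumptions here. A minimal sketch:

```python
# Minimal sketch (not the authors' code): sweeping activation functions
# under FedAvg-style aggregation, plus a representation-similarity helper.
# Model size, dataset shape, and the CKA choice are illustrative assumptions.
import copy
import torch
import torch.nn as nn

def make_mlp(activation: nn.Module) -> nn.Sequential:
    """Small image classifier whose activation module is swappable."""
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, 256), copy.deepcopy(activation),
        nn.Linear(256, 10),
    )

@torch.no_grad()
def fedavg(global_model: nn.Module, client_models: list[nn.Module]) -> None:
    """Overwrite the global weights with the uniform average of client weights."""
    avg = {k: torch.stack([m.state_dict()[k].float() for m in client_models]).mean(0)
           for k in global_model.state_dict()}
    global_model.load_state_dict(avg)

def linear_cka(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Linear CKA between two (n_samples, n_features) activation matrices,
    a common similarity metric for comparing learned representations."""
    x = x - x.mean(dim=0)
    y = y - y.mean(dim=0)
    return (y.T @ x).norm() ** 2 / ((x.T @ x).norm() * (y.T @ y).norm())

# A comparison like the paper's would sweep candidate activations:
for act in [nn.ReLU(), nn.Hardtanh(), nn.GELU()]:
    global_model = make_mlp(act)
    # ... broadcast to clients, train locally, then call fedavg(...)
```

Keeping the activation as a constructor argument makes it cheap to sweep candidates across heterogeneity levels and participation rates, and the same held-out batch can be fed to two trained models to compare their penultimate-layer features with `linear_cka`.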
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: General Machine Learning (i.e., none of the above)