Provable Dynamic Regularization Calibration

19 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: calibration, regularization
TL;DR: A simple yet effective dynamic regularization method that improves previous regularization-based calibration approaches by informing the model what it should and should not know.
Abstract: Miscalibration in deep learning refers to a mismatch between a model's confidence and its performance. The problem usually arises from overfitting in deep learning models, which leads to overly confident predictions at test time. Existing methods typically prevent overfitting and mitigate miscalibration by adding a maximum-entropy regularizer to the objective function. The objective of these methods can be understood as seeking a model that not only fits the ground-truth labels by increasing the confidence but also maximizes the entropy of the predicted probabilities by decreasing the confidence. However, previous methods cannot provide clear guidance on when to increase the confidence (known knowns) or decrease it (known unknowns), leading to two conflicting optimization objectives (increasing but also decreasing confidence). In this work, we propose a simple yet effective method called dynamic regularization calibration (drc) that addresses this trade-off by exploring outlier samples within the training set, yielding a reliable model that can admit it knows some things and does not know others. drc effectively fits the labels of in-distribution samples while dynamically applying regularization to potential outliers, thereby obtaining a robustly calibrated model. Both theoretical and empirical analyses demonstrate the superiority of drc over previous methods.
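The objective described in the abstract, fitting labels for in-distribution samples while maximizing predictive entropy for potential outliers, can be sketched as a per-sample interpolation between cross-entropy and a negative-entropy term. This is a minimal illustrative sketch only: the per-sample weights here are assumed to be given, whereas the paper's contribution is precisely how to set them dynamically from outlier exploration, a mechanism not detailed in this abstract.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the last axis
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy_regularized_loss(logits, labels, weights):
    """Per-sample blend of cross-entropy and negative entropy.

    weights[i] near 0 -> fit the ground-truth label (known knowns);
    weights[i] near 1 -> maximize predictive entropy (known unknowns).
    Hypothetical sketch; NOT the paper's exact drc objective.
    """
    probs = softmax(logits)
    n = len(labels)
    ce = -np.log(probs[np.arange(n), labels] + 1e-12)
    # minimizing negative entropy maximizes the entropy of predictions
    neg_entropy = np.sum(probs * np.log(probs + 1e-12), axis=-1)
    return np.mean((1.0 - weights) * ce + weights * neg_entropy)
```

With weight 0 the loss reduces to ordinary cross-entropy; with weight 1 it rewards flat (high-entropy) predictions, which is the "decrease the confidence" half of the trade-off the abstract describes.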
Supplementary Material: pdf
Primary Area: general machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1672