Bias/Variance is not the same as Approximation/Estimation

Published: 05 Mar 2024, Last Modified: 05 Mar 2024. Accepted by TMLR.
Abstract: We study the relation between two classical results: the bias-variance decomposition, and the approximation-estimation decomposition. Both are important conceptual tools in Machine Learning, helping us describe the nature of model fitting. It is commonly stated that they are “closely related”, or “similar in spirit”. However, sometimes it is said they are equivalent. In fact they are different, but have subtle connections cutting across learning theory, classical statistics, and information geometry, that (very surprisingly) have not been previously observed. We present several results for losses expressible as Bregman divergences: a broad family with a known bias-variance decomposition. Discussion and future directions are presented for more general losses, including the 0/1 classification loss.
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Several final changes, as detailed in reply to action editor below. Funding acknowledgements added. One reference added.
Assigned Action Editor: ~Trevor_Campbell1
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Number: 1623