Upper Bound of Bayesian Generalization Error in Partial Concept Bottleneck Model

TMLR Paper 2375 Authors

14 Mar 2024 (modified: 25 Apr 2024). Under review for TMLR. License: CC BY-SA 4.0.
Abstract: The Concept Bottleneck Model (CBM) is a method for explaining neural networks. In a CBM, concepts that correspond to the reasons for the outputs are inserted into the last intermediate layer as observed values. It is expected that the relationship between the outputs and the concepts can be interpreted in a manner similar to linear regression. However, this interpretation requires observing all concepts and increases the generalization error of the neural network. The Partial CBM (PCBM), which uses partially observed concepts, has been devised to resolve these difficulties. Although some numerical experiments suggest that the generalization error of PCBMs is almost as low as that of the original neural networks, the theoretical behavior of its generalization error has not yet been clarified because the PCBM is a singular statistical model. In this paper, we reveal the Bayesian generalization error of the PCBM with a three-layered and linear architecture. The result indicates that the structure of partially observed concepts decreases the Bayesian generalization error compared with that of the CBM (fully observed concepts).
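For orientation, a minimal sketch of the setting (the symbols below are illustrative and not necessarily the paper's notation): a three-layered linear network maps an input $x \in \mathbb{R}^N$ through an intermediate layer $z = Ax \in \mathbb{R}^H$ to an output $y = BAx \in \mathbb{R}^M$; in a PCBM, only some coordinates of $z$ are supervised as observed concepts, whereas a CBM supervises all of them. Under the standard conditions of singular learning theory, the expected Bayesian generalization error behaves asymptotically as
\[
\mathbb{E}[G_n] = \frac{\lambda}{n} + o\!\left(\frac{1}{n}\right),
\]
where $n$ is the sample size and $\lambda$ is the real log canonical threshold (RLCT), so comparing the RLCTs of the PCBM and the CBM compares their Bayesian generalization errors.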
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We briefly describe the changes as follows.
* Added an explanation of the RLCT, in particular how to calculate the RLCT once a normal crossing form is obtained (see the sketch below).
* Added recent references on singular learning theory.
* Fixed some typos and grammatical errors.
In addition, we specify that our research aims to clarify the generalization error (via the RLCT), and there is some evidence that RLCTs may contribute to the interpretability of neural networks. We will respond to each comment individually later.
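On calculating the RLCT from a normal crossing form, the standard fact from singular learning theory (sketched here with illustrative notation, not necessarily the paper's) is the following: if, after resolution of singularities, the averaged Kullback-Leibler divergence and the prior are written locally as
\[
K(w) = w_1^{2k_1} \cdots w_d^{2k_d}, \qquad \varphi(w) = |w_1^{h_1} \cdots w_d^{h_d}|\, b(w), \quad b(w) > 0,
\]
then the RLCT and its multiplicity are
\[
\lambda = \min_{1 \le j \le d} \frac{h_j + 1}{2 k_j}, \qquad m = \#\left\{\, j : \frac{h_j + 1}{2 k_j} = \lambda \right\}.
\]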
Assigned Action Editor: ~Satoshi_Hara1
Submission Number: 2375