GPy-ABCD: A Configurable Automatic Bayesian Covariance Discovery Implementation

Thomas Fletcher; Alan Bundy; Kwabena Nuamah

Published: 14 Jul 2021, Last Modified: 05 May 2023

Venue: AutoML@ICML2021 Poster

Readers: Everyone

Keywords: Model selection, hyper-parameter optimization, model search, automatic feature extraction, ABCD, Gaussian Process Regression

Abstract: Gaussian Processes (GPs) are a very flexible class of nonparametric models frequently used in supervised learning tasks because they can fit data under very few assumptions, namely just the type of correlation (kernel) the data is expected to display. Automatic Bayesian Covariance Discovery (ABCD) is an iterative GP regression framework that aims to remove even this initial assumption about the form of the correlation. The original ABCD implementation is a complex stand-alone system designed to produce long-form text analyses of the provided data. This paper presents a lighter, more functional and configurable implementation of the ABCD idea, which outputs only fit models and short descriptions: the Python package GPy-ABCD, developed as part of an adaptive modelling component for the FRANK query-answering system. It uses a revised model-space search algorithm and removes a search bias that the original system required in order to retain model explainability.
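To make the idea of an iterative search over kernel forms concrete, the following is a minimal standard-library sketch of a greedy compositional kernel search. It is not GPy-ABCD's algorithm or API: every name in it (`BASE`, `NOISE`, `PERIOD`, `combine`, `bic`, `search`) is hypothetical, the kernel hyperparameters are fixed rather than optimised, and the number of base kernels in an expression stands in for the parameter count in the BIC penalty.

```python
import math

NOISE = 0.1   # assumed fixed noise variance (a real system would optimise it)
PERIOD = 1.0  # assumed fixed period for the periodic kernel

# Base kernels with fixed hyperparameters; all are positive semi-definite.
BASE = {
    "LIN": lambda a, b: a * b,
    "SE": lambda a, b: math.exp(-0.5 * (a - b) ** 2),
    "PER": lambda a, b: math.exp(-2.0 * math.sin(math.pi * (a - b) / PERIOD) ** 2),
}

def cholesky(A):
    """Lower-triangular L with L L^T = A, for a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def log_marginal_likelihood(k, X, Y):
    """Zero-mean GP log marginal likelihood of Y under kernel k plus noise."""
    n = len(X)
    K = [[k(X[i], X[j]) + (NOISE if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    # Forward substitution: solve L z = Y, so that Y^T K^-1 Y = z^T z.
    z = []
    for i in range(n):
        z.append((Y[i] - sum(L[i][j] * z[j] for j in range(i))) / L[i][i])
    quad = sum(v * v for v in z)
    half_log_det = sum(math.log(L[i][i]) for i in range(n))
    return -0.5 * quad - half_log_det - 0.5 * n * math.log(2.0 * math.pi)

def bic(model, X, Y):
    """BIC (lower is better); expression size is a stand-in parameter count."""
    _, k, size = model
    return -2.0 * log_marginal_likelihood(k, X, Y) + size * math.log(len(X))

def combine(model, op, base_name):
    """Extend a kernel expression by adding or multiplying in a base kernel."""
    name, k, size = model
    kb = BASE[base_name]
    if op == "+":
        return (f"({name} + {base_name})", lambda a, b: k(a, b) + kb(a, b), size + 1)
    return (f"({name} * {base_name})", lambda a, b: k(a, b) * kb(a, b), size + 1)

def search(X, Y, rounds=2):
    """Greedy search: keep expanding the best model while its BIC improves."""
    best = min(((n, k, 1) for n, k in BASE.items()), key=lambda m: bic(m, X, Y))
    for _ in range(rounds):
        candidates = [combine(best, op, b) for op in "+*" for b in BASE]
        challenger = min(candidates, key=lambda m: bic(m, X, Y))
        if bic(challenger, X, Y) >= bic(best, X, Y):
            break
        best = challenger
    return best

# Toy data: a linear trend plus a periodic component.
X = [0.15 * i for i in range(20)]
Y = [0.5 * x + math.sin(2.0 * math.pi * x) for x in X]
name, _, _ = search(X, Y)
print("Selected kernel expression:", name)
```

The sketch keeps only the single best expression per round. A beam of several candidates, hyperparameter optimisation with restarts, and a richer set of expansion rules are the kind of knobs a configurable implementation would expose.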

Ethics Statement: This paper presents a library implementing an improved version of an existing framework for automatically selecting a Gaussian Process model of interpretable shape for input data. Given the very basic nature of the output (i.e. a list of statistical models with short descriptions), the areas of application are the same as for, say, linear regression or time series analysis, but the real utility lies in extracting the functional shape features of the fit models. In constructing larger systems that rely on these identified data features (and their text descriptions), the typical risks of introducing model explainability apply, such as overreliance on the descriptions (over the models themselves) or their misuse by non-domain-experts. At the same time, these models and simple descriptions may provide greater transparency to those affected by decisions made using them and allow a broader non-expert audience to make more informed decisions of their own.
