Keywords: Calibration, Zero-shot Cross-lingual Transfer, Uncertainty
Abstract: We investigate model calibration in zero-shot cross-lingual transfer with large-scale pre-trained language models. Calibration is an important metric for evaluating the trustworthiness of predictive models, and well-calibrated confidence estimates are essential when natural language models are deployed in critical tasks. We study different post-training calibration methods on structured and unstructured prediction tasks. We find that models trained on data from the source language become less calibrated when applied to the target language, and that calibration error increases with intrinsic task difficulty and the relative sparsity of training data. Moreover, we observe a potential connection between the level of calibration error and a previously proposed measure of the distance from English to other languages. Finally, our comparison demonstrates that, among the evaluated methods, Temperature Scaling (TS) and Gaussian Process Calibration (GPcalib) generalize well to distant languages, although TS fails to calibrate the more complex confidence estimates required in structured prediction.
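As background for the abstract above, the following is a minimal sketch of post-training Temperature Scaling, one of the calibration methods compared. It is not the paper's implementation: it fits a single scalar temperature T on held-out validation logits by grid search over the negative log-likelihood (implementations often use L-BFGS instead), then divides test-time logits by T before the softmax. All function names here are illustrative.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def nll(logits, labels, T):
    # Negative log-likelihood of the temperature-scaled probabilities.
    p = softmax(logits / T)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

def fit_temperature(val_logits, val_labels, grid=None):
    # Pick the scalar T > 0 that minimizes validation NLL.
    # A simple grid search stands in for the usual L-BFGS optimization.
    if grid is None:
        grid = np.linspace(0.5, 5.0, 91)
    losses = [nll(val_logits, val_labels, T) for T in grid]
    return float(grid[int(np.argmin(losses))])

# Synthetic demo: well-calibrated logits z, but the model outputs 3 * z,
# i.e. it is overconfident; the fitted temperature should recover T ~ 3.
rng = np.random.default_rng(0)
z = rng.normal(size=(2000, 4))
labels = np.array([rng.choice(4, p=p) for p in softmax(z)])
overconfident_logits = 3.0 * z
T = fit_temperature(overconfident_logits, labels)
```

Because T is a single parameter shared across all classes, TS rescales confidence without changing the argmax prediction, which is also why it cannot reshape the richer confidence estimates produced in structured prediction.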
Paper Type: long