Reassessing EMNLP 2024’s Best Paper: Does Divergence-Based Calibration for MIAs Hold Up?

Published: 23 Jan 2025, Last Modified: 26 Feb 2025
Venue: ICLR 2025 Blogpost Track
License: CC BY 4.0
Blogpost Url: https://d2jud02ci9yv69.cloudfront.net/2025-04-28-calibrated-mia-100/blog/calibrated-mia/
Abstract: At EMNLP 2024, the [Best Paper Award](https://x.com/emnlpmeeting/status/1857176180128198695/photo/1) was given to **"Pretraining Data Detection for Large Language Models: A Divergence-based Calibration Method"**. The paper addresses Membership Inference Attacks (MIAs), a key privacy concern in machine learning. The authors propose a new calibration method and introduce **PatentMIA**, a benchmark of temporally shifted patent data, to validate their approach. The method initially seems promising: it recalibrates model probabilities using a divergence metric between the outputs of a target model and a token-frequency map derived from auxiliary data, claiming improved separation of member and non-member samples. On closer examination, however, we identified significant shortcomings in both the experimental design and the evaluation methodology. In this post, we critically analyze the paper and its broader implications.
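To make the calibration idea concrete, the following is a minimal sketch of divergence-based scoring as we understand it from the abstract; it is not the authors' exact implementation. The function names, the smoothing constant `alpha`, and the simple mean aggregation are all our own assumptions for illustration.

```python
import math

def divergence_calibrated_score(token_logprobs, tokens, aux_token_freq, alpha=1e-9):
    """Hypothetical sketch: adjust each target-model token log-probability
    by the token's frequency in an auxiliary reference corpus, then average
    into a single membership score. Higher scores suggest the sample is
    more likely to have appeared in the pretraining data."""
    calibrated = []
    for tok, lp in zip(tokens, token_logprobs):
        # Frequency of the token in the auxiliary token-frequency map;
        # alpha smooths tokens that are absent from the map.
        freq = aux_token_freq.get(tok, 0.0) + alpha
        # Calibration: model log-probability minus log reference frequency,
        # so tokens the model rates unusually high relative to the
        # reference corpus push the score upward.
        calibrated.append(lp - math.log(freq))
    return sum(calibrated) / len(calibrated)

def is_member(score, threshold):
    # A sample is flagged as a pretraining member if its calibrated
    # score exceeds a tuned threshold.
    return score > threshold
```

In this toy framing, the attack reduces to thresholding the calibrated score; the paper's contribution lies in how the calibration term is constructed, and the critique below concerns how that threshold-based detection is evaluated.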
Conflict Of Interest: We have no conflict of interest with the paper that is analyzed.
Submission Number: 63