MixCE: Training Autoregressive Language Models by Mixing Forward and Reverse Cross-Entropies

Published: 01 Jan 2023, Last Modified: 27 Feb 2024. Venue: ACL 2023.
Authors: Shiyue Zhang, Shijie Wu, Ozan Irsoy, Steven Lu, Mohit Bansal, Mark Dredze, David Rosenberg. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023.
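The listing carries only the title, but the named objective suggests a simple form: interpolate the standard (forward) cross-entropy of maximum-likelihood training with a reverse cross-entropy term. Below is a minimal PyTorch sketch of that idea. The function name mixce_loss, the mixing weight eta, and the self-weighted surrogate for the intractable reverse term are illustrative assumptions, not the paper's verified implementation.

```python
import torch
import torch.nn.functional as F

def mixce_loss(logits, targets, eta=0.5):
    """Token-level mixture of forward and reverse cross-entropies (sketch).

    logits:  (batch, seq_len, vocab) unnormalized model scores
    targets: (batch, seq_len) gold token ids
    eta:     mixing weight; eta=1.0 recovers plain MLE (forward CE only)
    """
    log_probs = F.log_softmax(logits, dim=-1)
    # log q(x_t | x_<t) for the gold tokens
    gold_logp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    gold_p = gold_logp.exp()

    # Forward CE: the usual negative log-likelihood of the data.
    forward_ce = -gold_logp

    # The exact reverse CE is intractable (it scores model samples under the
    # unknown data distribution). As one simple surrogate, weight the
    # gold-token log-likelihood by the model's own probability, giving a
    # self-reinforcing, mode-seeking term. This surrogate is an assumption
    # made for illustration; the weight is detached so it acts as a constant.
    reverse_ce_approx = -gold_p.detach() * gold_logp

    loss = eta * forward_ce + (1.0 - eta) * reverse_ce_approx
    return loss.mean()

if __name__ == "__main__":
    # Smoke test on random data.
    torch.manual_seed(0)
    logits = torch.randn(2, 8, 100, requires_grad=True)
    targets = torch.randint(0, 100, (2, 8))
    loss = mixce_loss(logits, targets, eta=0.5)
    loss.backward()
    print(f"MixCE loss: {loss.item():.4f}")
```

Note that the combined loss collapses to -(eta + (1 - eta) * q) * log q per token, so the two terms can also be fused into a single weighted negative log-likelihood.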