Lost in Translation? Translation Errors and Challenges for Fair Assessment of Text-to-Image Models on Multilingual Concepts

Anonymous

16 Dec 2023 · ACL ARR 2023 December Blind Submission · Readers: Everyone
Abstract: Benchmarks of the multilingual capabilities of text-to-image (T2I) models generate images conditioned on prompts in a test language and then compare the results with the expected image distribution. One such benchmark, ``Conceptual Coverage Across Languages'' (CoCo-CroLa), assesses the tangible noun inventory of T2I models by prompting them to generate pictures of these nouns in seven input languages and comparing the resulting image populations. Unfortunately, we find that this benchmark contains translation errors of varying severity in Spanish, Japanese, and Chinese. We provide corrections for these errors and analyze how they impact the utility and validity of CoCo-CroLa as a benchmark. We reassess multiple baseline T2I models with the revisions, compare the outputs elicited under the new translations to those conditioned on the old, and show that a correction's impact on the image-domain benchmark results can be predicted in the text domain using similarity metrics. Our findings will guide the future development of T2I multilinguality metrics by providing analytical tools for making practical translation decisions.
Paper Type: short
Research Area: Resources and Evaluation
Contribution Types: Model analysis & interpretability, Reproduction study, Data resources, Data analysis, Position papers
Languages Studied: English, Spanish, Chinese, Japanese