Abstract: This paper discusses the problem of multilingual evaluation. Using simple statistics, such as averaging performance across languages, may inject linguistic biases in favor of dominant language families into the evaluation methodology. We show that this bias can be found in published works, and we demonstrate that linguistically motivated result visualization can detect it.
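The central claim, that a plain average over an unbalanced language sample favors dominant language families, can be illustrated with a minimal sketch. The scores, language selection, and family groupings below are invented for illustration only and are not taken from the paper.

```python
# Hypothetical illustration (not from the paper): a plain average over languages
# can hide a gap that only shows up when scores are grouped by language family.
from statistics import mean

# Made-up accuracy scores; the sample is skewed toward one language family.
scores = {
    "English": 0.91, "German": 0.89, "French": 0.90, "Spanish": 0.88,  # Indo-European
    "Swahili": 0.62, "Yoruba": 0.58,                                   # Niger-Congo
}
families = {
    "English": "Indo-European", "German": "Indo-European",
    "French": "Indo-European", "Spanish": "Indo-European",
    "Swahili": "Niger-Congo", "Yoruba": "Niger-Congo",
}

# Overall macro-average: dominated by the four Indo-European languages.
print(f"macro average: {mean(scores.values()):.3f}")

# Per-family averages reveal the disparity the plain average smooths over.
by_family = {}
for lang, score in scores.items():
    by_family.setdefault(families[lang], []).append(score)
for family, vals in sorted(by_family.items()):
    print(f"{family}: {mean(vals):.3f}")
```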
Paper Type: short