Abstract: Shared Task Evaluation Challenges (STECs) have only recently begun in the field of NLG. The TUNA STECs, which focused on Referring Expression Generation (REG), have been part of this development since its inception. This chapter looks back on the experience of organising the three TUNA Challenges, which came to an end in 2009. While we discuss the role of the STECs in yielding a substantial body of research on the REG problem, which has opened new avenues for future research, our main focus is on the role of different evaluation methods in assessing the output quality of REG algorithms, and on the relationship between such methods.