Human evaluation of a German surface realisation ranker
In this paper we present a human-based evaluation of surface realisation alternatives.
We examine the relative rankings of naturally occurring corpus sentences and automatically generated strings chosen by statistical models (a language model and a log-linear model), as well as the naturalness of the strings chosen by the log-linear model. We also investigate to what extent preceding context affects the choice. We show that native speakers accept considerable variation in word order, but that there are clearly also factors that make certain realisation alternatives more natural.
Cahill, A.; Forst, M. Human evaluation of a German surface realisation ranker. 12th Conference of the European Chapter of the Association for Computational Linguistics. 2009 March 30 - April 3; Athens, Greece.