Author: Andreas C. Müller, Sarah Guido
year: 2012
page: 392
Format: pdf
Introduction to Machine Learning with Python: A Guide for Data Scientists
Abstract. This article defines a Framework for Machine Translation Evaluation (FEMTI) which
relates the quality model used to evaluate a machine translation system to the purpose and context
of the system. Our proposal brings together, in a coherent picture, previous attempts to
structure a domain characterised by overall complexity and local difficulties. In this article, we
first summarise these attempts, then present an overview of the ISO/IEC guidelines for software
evaluation (ISO/IEC 9126 and ISO/IEC 14598). As an application of these guidelines to machine
translation software, we introduce FEMTI, a framework made up of two interrelated classifications,
or taxonomies. The first classification enables evaluators to define an intended context of use, while
the links to the second classification generate a relevant quality model (quality characteristics and
metrics) for the respective context. The second classification provides definitions of various metrics
used by the community. Further on, as part of ongoing, long-term research, we explain how metrics
are analysed, first from the general point of view of “meta-evaluation”, then focusing on examples.
Finally, we show how consensus on the present framework is sought, and how feedback
from the community is taken into account in the FEMTI life-cycle.