Shared Task: Machine Translation of IT domain

This task focuses on domain adaptation of MT to the IT domain, specifically the translation of answers in a cross-lingual help-desk service, where hardware and software troubleshooting answers are translated from English into the users' languages.

You may participate in any or all of the following language pairs:


out-of-domain training data
any parallel and monolingual sets from the WMT16 News task or previous years, including
in-domain training data
in-domain test data
Batch3 (Batch3a_en.txt contains the 1000 answers published on April 11 as the test set to be translated by shared task participants; the other files are the reference translations)

If you use additional training data (or existing translation systems that use additional training data), you must flag that your system uses additional data. We will distinguish system submissions that used the provided training data (constrained) from submissions that used significant additional data resources. Linguistic tools such as morphological analyzers, taggers, parsers, word-sense disambiguation or named entity recognizers are allowed in the constrained condition.


Unlike in the News translation task, punctuation in the official test sets will not be altered.

To submit your results, please first convert them into the SGML format required by the NIST BLEU scorer, and then upload them to the website. For the conversion of plain-text (one sentence per line) translations into SGML format you can use (or the "old" way: download it-test2016-src.en.sgm and follow News task guidelines).
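As a rough illustration of the plain-text-to-SGML step, the following Python sketch wraps one-sentence-per-line translations in a tstset document of the kind the NIST BLEU scorer reads. It is not the official conversion tool. The function name to_sgml and all attribute values passed to it (set ID, document ID, system ID) are assumptions for illustration; the real values should be copied from it-test2016-src.en.sgm so that the segments line up with the source file.

```python
from xml.sax.saxutils import escape, quoteattr


def to_sgml(lines, setid, docid, srclang, trglang, sysid):
    """Wrap one-sentence-per-line translations in a tstset SGML document.

    Hypothetical helper: it emits a single <doc> element, which is an
    assumption. If the source SGML file has several documents, the
    segments must be split the same way.
    """
    out = [
        f"<tstset setid={quoteattr(setid)} "
        f"srclang={quoteattr(srclang)} trglang={quoteattr(trglang)}>",
        f"<doc docid={quoteattr(docid)} sysid={quoteattr(sysid)}>",
    ]
    for i, line in enumerate(lines, start=1):
        # Escape &, < and > so the segment text stays valid markup.
        out.append(f'<seg id="{i}">{escape(line.strip())}</seg>')
    out.append("</doc>")
    out.append("</tstset>")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Placeholder IDs; replace them with the values from the source file.
    demo = ["Click Save & Close.", "Restart the computer."]
    print(to_sgml(demo, "example-set", "example-doc", "en", "cs", "my-system"), end="")
```

Segment IDs are numbered from 1 in input order, so the translations must be kept in the same order as the source sentences.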

Translation quality will be measured by manual evaluation and by various automatic evaluation metrics. Participants agree to contribute about four hours of work to the manual evaluation for each submitted system.



The in-domain data were created within the QTLeap project, which sponsors this task. If you have any questions, contact Martin Popel or the WMT mailing list.