Translating documents from one language into another by computer is one of the oldest goals in computational linguistics. Now, armed with vast amounts of translated text and powerful computers, we are witnessing significant progress toward that goal.
Statistical methods allow the analysis of parallel corpora and the automatic construction of machine translation systems. For some language pairs, such as Chinese-English and Arabic-English, statistical machine translation (SMT) systems built at research labs currently outperform commercial systems.
This workshop focuses on statistical and hybrid methods for machine translation and features a shared translation task. The evaluation of machine translation systems is itself a growing field, and the workshop will also focus on determining the best methodology for evaluating translation quality, both with automatic metrics and through subjective human evaluation.
This workshop builds on the success of the 2005 ACL Workshop on Parallel Text and the 2006 NAACL Workshop on Statistical Machine Translation.
Topics of interest include, but are not limited to:
In addition to soliciting research papers on the topics listed above, the workshop will also feature a shared translation task. The workshop organizers will provide common test sets for translation between four language pairs in both directions:
All participants who submit entries will have their translations evaluated. In addition to automatic scoring, we will also evaluate translation performance by human judgment. To facilitate the human evaluation we will require participants in the shared task to manually judge some of the submitted translations.
A more detailed description of the shared task (including information about the test and training corpora, a freely available MT system, and a number of other resources) is available online. We also provide a baseline machine translation system whose performance matches the best systems from last year's shared task.
We encourage individuals who are submitting research papers to evaluate their approaches using the training resources provided by this workshop, so that their experiments can be repeated by others using these publicly available corpora.
Because the paper submission timeframe overlaps with that of EMNLP 2007, we accept papers that are also submitted to the EMNLP conference, but we would like to know as soon as possible after notification whether an accepted paper will be withdrawn.
| Milestone | Date |
|---|---|
| Regular paper submissions | April 2 |
| Results submissions (shared task) | April 6 |
| Short paper submissions (shared task) | April 13 |
| Notification | April 23 |
| Camera-ready papers | May 9 |
Supported by the EuroMatrix project, P6-IST-5-034291-STP, funded by the European Commission under Framework Programme 6.