ACL 2010
JOINT
FIFTH WORKSHOP ON
STATISTICAL MACHINE TRANSLATION
AND
METRICS MATR
July 15 and 16, 2010
Uppsala, Sweden
This workshop builds on four previous workshops on statistical machine
translation.
This year's workshop will feature three shared tasks: a shared
translation task for 8 pairs of European languages, a shared evaluation task
to test automatic evaluation metrics, and a system combination task
combining the output of all the systems entered into the shared translation
task. The evaluation task will be conducted in the context of the NIST MetricsMATR evaluation, with results reported on the final day of the workshop.
The workshop will also feature scientific papers on topics related to MT. Topics of interest include, but are not limited to:
- word-based, phrase-based, syntax-based SMT
- using comparable corpora for SMT
- incorporating linguistic information into SMT
- decoding
- system combination
- error analysis
- manual and automatic methods for evaluating MT
- scaling MT to very large data sets
We encourage authors to evaluate their approaches to the above topics using the common data sets created for the shared tasks.
TRANSLATION TASK
The first shared task will examine translation between the following language pairs:
- English-German and German-English
- English-French and French-English
- English-Spanish and Spanish-English
- English-Czech and Czech-English
Participants may submit translations for any or all of the language directions. In addition to the common test sets, the workshop organizers will provide optional training resources, including a newly expanded release of the Europarl corpus and out-of-domain corpora.
All participants who submit entries will have their translations evaluated. We will evaluate translation performance by human judgment. To facilitate the human evaluation, we will require participants in the shared tasks to manually judge some of the submitted translations.
A more detailed description of the shared translation task (including
information about the test and training corpora, a freely available MT system,
and a number of other resources) will be available by December 1st at the latest. We also provide a
baseline machine translation system, whose performance is comparable to the
best systems from last year's shared task.
SYSTEM COMBINATION TASK
Participants in the system combination task will be provided with the 1-best translations from each of the systems entered in the shared translation task. We will endeavor to provide a held-out development set for system combination, which will include translations from each of the systems and a reference translation. Any system combination strategy is acceptable, whether it selects the best translation on a per-sentence basis or creates novel translations by combining the systems' translations. The quality of the system combinations will be judged alongside the individual systems during the manual evaluation, as well as scored with automatic evaluation metrics.
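As an illustration only (the task does not prescribe any particular strategy), a minimal per-sentence selection combiner can pick, for each sentence, the hypothesis most similar to the other systems' outputs. The function names and the n-gram overlap score below are invented for this sketch:

```python
from collections import Counter

def ngrams(tokens, n):
    # Multiset of n-grams in a token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def overlap(hyp, other):
    # Illustrative unigram+bigram overlap score (F-measure-like), not a
    # standard metric; real combiners typically use TER or BLEU-style scores.
    score = 0.0
    for n in (1, 2):
        h, o = ngrams(hyp, n), ngrams(other, n)
        match = sum((h & o).values())
        total = max(sum(h.values()) + sum(o.values()), 1)
        score += 2.0 * match / total
    return score

def select_consensus(hypotheses):
    # Per sentence: return the hypothesis with the highest total similarity
    # to all the other systems' hypotheses (a consensus selection).
    best, best_score = None, float("-inf")
    for i, hyp in enumerate(hypotheses):
        s = sum(overlap(hyp.split(), other.split())
                for j, other in enumerate(hypotheses) if j != i)
        if s > best_score:
            best, best_score = hyp, s
    return best
```

Running this over each test sentence's set of system outputs yields a combined 1-best output without generating any novel translations.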
EVALUATION TASK
The third task is the shared evaluation task. Developers of automatic machine
translation metrics are invited to submit their software to NIST for evaluation in the second NIST MetricsMATR evaluation. Metrics will be assessed on the MetricsMATR test set. They will also be assessed on output from the WMT shared translation task, for their ability to:
- Rank systems on their overall performance on the test set
- Rank systems on a sentence-by-sentence basis
Participants in the shared evaluation task will use their automatic evaluation metrics to score the output from the translation task and the system combination task. They will be provided with the output from the other two shared tasks along with reference translations. We will measure the correlation of automatic evaluation metrics with the human judgments.
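To sketch what "correlation with human judgments" means here (assuming, for illustration, segment-level scores paired with human scores), Spearman's rank correlation can be computed as the Pearson correlation of the ranks:

```python
def rank(values):
    # Ranks starting at 1, with ties assigned the average rank.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(metric_scores, human_scores):
    # Spearman's rho = Pearson correlation computed on the ranks.
    rx, ry = rank(metric_scores), rank(human_scores)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

A rho of 1.0 means the metric ranks the outputs exactly as the human judges do; the actual evaluation protocol and correlation statistic used are defined by the MetricsMATR organizers.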
PAPER SUBMISSION INFORMATION
Submissions will consist of regular full papers of 6-10 pages, plus
additional pages for references, formatted following the
ACL 2010 guidelines.
In addition, shared task participants will be invited
to submit short papers (4-6 pages) describing their systems or their evaluation metrics.
Both submission and review processes will be handled electronically.
We encourage individuals who are submitting research papers to evaluate their approaches using the training resources provided by this workshop and past workshops, so that their experiments can be repeated by others using these publicly available corpora.
IMPORTANT DATES
Release of training data | December 1, 2009
Test set distributed for translation task | March 1, 2010
Results submissions for translation task | March 5, 2010

Translations released for system combination | March 15, 2010
System combinations due | March 26, 2010

Start of manual evaluation period | March 31, 2010
End of manual evaluation | April 30, 2010

Metric evaluation: metric installation | March 26 - May 14, 2010

Paper submissions (online) | April 23, 2010
Notification of acceptance | May 14, 2010
Camera-ready deadline | June 4, 2010
ANNOUNCEMENTS
Subscribe to the announcement list for WMT10. This list will be used to announce when the test sets are released, to indicate any corrections to the training sets, and to amend the deadlines as needed.
ORGANIZERS
Chris Callison-Burch (Johns Hopkins University)
Philipp Koehn (University of Edinburgh)
Christof Monz (University of Amsterdam)
Kay Peterson (NIST)
Omar Zaidan (Johns Hopkins University)
INVITED TALK
Hermann Ney (RWTH Aachen)
PROGRAM COMMITTEE
- Steve Abney (University of Michigan)
- Lars Ahrenberg (Linköping University)
- Yaser Al-Onaizan (IBM Research)
- Fazil Ayan (SRI)
- Graeme Blackwood (University of Cambridge)
- Phil Blunsom (University of Oxford)
- Thorsten Brants (Google)
- Chris Brockett (Microsoft Research)
- Bill Byrne (University of Cambridge)
- Michael Carl (University Saarbrücken)
- Marine Carpuat (Columbia University)
- Francisco Casacuberta (University of Valencia)
- David Chiang (ISI/University of Southern California)
- Steve DeNeefe (ISI)
- John DeNero (University of California at Berkeley)
- Kevin Duh (NTT)
- Andreas Eisele (University Saarbrücken)
- Marcello Federico (Fondazione Bruno Kessler)
- George Foster (Canada National Research Council)
- Alex Fraser (University of Stuttgart)
- Michel Galley (Stanford University)
- Daniel Gildea (University of Rochester)
- Jesus Gimenez (Technical University of Catalonia)
- Kevin Gimpel (Carnegie Mellon University)
- Nizar Habash (Columbia University)
- Keith Hall (Google)
- John Henderson (MITRE)
- Abe Ittycheriah (IBM Research)
- Howard Johnson (National Research Council Canada)
- Doug Jones (Lincoln Labs MIT)
- Damianos Karakos (Johns Hopkins University)
- Katrin Kirchhoff (University of Washington)
- Kevin Knight (ISI/University of Southern California)
- Greg Kondrak (University of Alberta)
- Roland Kuhn (National Research Council Canada)
- Shankar Kumar (Google)
- Philippe Langlais (University of Montreal)
- Alon Lavie (Carnegie Mellon University)
- Adam Lopez (University of Edinburgh)
- Wolfgang Macherey (Google)
- Daniel Marcu (ISI/University of Southern California)
- Yuval Marton (Columbia University)
- Evgeny Matusov (Apptek)
- Arne Mauser (Aachen University of Technology)
- Arul Menezes (Microsoft Research)
- Bob Moore (Microsoft Research)
- Smaranda Muresan (Rutgers University)
- Patrick Nguyen (Microsoft Research)
- Chris Quirk (Microsoft Research)
- Stefan Riezler (University of Stuttgart)
- Antti-Veikko Rosti (BBN Technologies)
- Jean Senellart (Systran)
- Libin Shen (BBN Technologies)
- Wade Shen (Lincoln Labs MIT)
- Khalil Simaan (University of Amsterdam)
- Michel Simard (National Research Council Canada)
- David Talbot (Google)
- Joerg Tiedemann (Uppsala University)
- Christoph Tillmann (IBM Research)
- Roy Tromble (Google)
- David Vilar (Aachen University of Technology)
- Clare Voss (Army Research Labs)
- Taro Watanabe (NICT)
- Andy Way (Dublin City University)
- Jinxi Xu (BBN Technologies)
- Richard Zens (Google)
- Bing Zhao (IBM Research)
- Andreas Zollmann (Carnegie Mellon University)
- Adria de Gispert (University of Cambridge)
CONTACT
For questions, comments, etc. please send email to
pkoehn@inf.ed.ac.uk.
supported by the EuroMatrixPlus project
P7-IST-231720-STP
funded by the European Commission
under Framework Programme 7