NAACL 2012
SEVENTH WORKSHOP ON
STATISTICAL MACHINE TRANSLATION
June 7-8, 2012
Montreal, Quebec, Canada
This workshop builds on six previous workshops on statistical machine
translation.
The workshop is sponsored by the ACL's Special Interest Group on Machine
Translation (SIGMT).
IMPORTANT DATES
Release of training data | December 9, 2011
Test set distributed for translation task | February 27, 2012
Submission deadline for translation task | March 2, 2012
System outputs distributed for metrics task | March 9, 2012
Submission deadline for metrics task | March 30, 2012
Paper submission deadline | April 6, 2012
Start of manual evaluation period | April 6, 2012
Notification of acceptance | April 24, 2012
End of manual evaluation | May 1, 2012
Camera-ready deadline | May 7, 2012
Papers available online | June 1, 2012
Workshop in Montreal following NAACL | June 7-8, 2012
OVERVIEW
This year's workshop will feature three shared tasks:
- a translation task,
- a quality estimation task, and
- a task to test automatic evaluation metrics.
In addition to the shared tasks, the workshop will also feature scientific papers on topics related to MT.
Topics of interest include, but are not limited to:
- word-based, phrase-based, syntax-based, semantics-based SMT
- using comparable corpora for SMT
- incorporating linguistic information into SMT
- decoding
- system combination
- error analysis
- manual and automatic methods for evaluating MT
- scaling MT to very large data sets
We encourage authors to evaluate their approaches to the above topics
using the common data sets created for the shared tasks.
TRANSLATION TASK
The first shared task will examine translation between the
following language pairs:
- English-German and German-English
- English-French and French-English
- English-Spanish and Spanish-English
- English-Czech and Czech-English
Participants may submit translations for any or all of the language
directions. In addition to the common test sets, the workshop organizers
will provide optional training resources, including a newly expanded
release of the Europarl corpora and out-of-domain corpora.
All participants who submit entries will have their translations
evaluated. We will evaluate translation performance by human judgment. To
facilitate the human evaluation we will require participants in the
shared tasks to manually judge some of the submitted translations.
We also provide baseline machine translation systems, with performance
comparable to the best systems from last year's shared task.
QUALITY ESTIMATION TASK
A topic of increasing interest in MT is that of estimating the quality of translated texts. Unlike MT evaluation, quality estimation (QE) does not rely on reference translations; instead, QE systems predict the quality of an unseen translated text (document, sentence, phrase) at system run-time. This topic is particularly relevant from a user perspective: among other applications, it can (i) help decide whether a given translation is good enough for publishing as is (Soricut and Echihabi, 2010); (ii) filter out sentences that are not good enough for post-editing (Specia, 2011); (iii) select the best translation among options from multiple MT and/or translation memory systems (He et al., 2010); and (iv) inform readers of the target language whether or not they can rely on a translation (Specia et al., 2011).
Although this line of research is still very recent, it has shown promising results in the last couple of years. However, efforts are scattered across several groups and, as a consequence, comparing different systems is difficult: there are neither well-established baselines nor standard evaluation metrics. In the quality estimation track of this workshop and shared task, we will provide training and test sets, along with evaluation metrics and a baseline system. By providing a common ground for development and comparison, we expect to foster research on the topic and to attract newcomers to the subject, who can build and evaluate new solutions using the provided resources.
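As a rough illustration of the kind of system this task calls for, the short Python sketch below predicts a sentence-level quality score from a handful of simple surface features of the source sentence and its translation, with no reference translation involved. The feature set and the use of scikit-learn's SVR are assumptions made for the example only; they are not the baseline system provided for the shared task.

# Minimal sketch of a sentence-level QE system: predict a quality score for a
# (source, translation) pair from simple surface features, without consulting
# any reference translation. Features and regressor are illustrative only.
from sklearn.svm import SVR

def qe_features(source, translation):
    # Toy "black-box" features computed from the sentence pair alone.
    src, tgt = source.split(), translation.split()
    return [
        len(src),                               # source length in tokens
        len(tgt),                               # target length in tokens
        len(tgt) / max(len(src), 1),            # length ratio
        sum(map(len, tgt)) / max(len(tgt), 1),  # average target word length
    ]

def train_qe(pairs, human_scores):
    # Fit a regressor mapping feature vectors to human quality scores.
    model = SVR(kernel="rbf")
    model.fit([qe_features(s, t) for s, t in pairs], human_scores)
    return model

def predict_quality(model, source, translation):
    # Estimate the quality of an unseen translation at run-time.
    return model.predict([qe_features(source, translation)])[0]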
EVALUATION TASK
The evaluation task will assess automatic evaluation metrics' ability to:
- Rank systems on their overall performance on the test set
- Rank systems on a sentence-by-sentence level
Participants in the shared evaluation task will use their automatic evaluation metrics to score the system outputs from the translation task. They will be provided with these outputs along with the corresponding reference translations. We will measure the correlation of the automatic evaluation metrics with the human judgments.
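As a concrete but purely illustrative example of how such a correlation can be computed, the Python sketch below compares hypothetical system-level metric scores against hypothetical human scores using Spearman's rank correlation; the per-system numbers are invented for demonstration, and the actual scoring protocol and data are defined by the workshop organizers.

# Illustrative sketch (not the official protocol) of comparing a metric's
# system-level scores against human judgments with Spearman's rank correlation.
from scipy.stats import spearmanr

metric_scores = {"sysA": 0.31, "sysB": 0.27, "sysC": 0.35}  # hypothetical metric scores
human_scores  = {"sysA": 0.62, "sysB": 0.55, "sysC": 0.70}  # hypothetical human judgments

systems = sorted(metric_scores)
rho, _ = spearmanr([metric_scores[s] for s in systems],
                   [human_scores[s] for s in systems])
print("System-level Spearman correlation: %.3f" % rho)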
PAPER SUBMISSION INFORMATION
Submissions will consist of regular full papers of 6-10 pages, plus
additional pages for references, formatted following the
NAACL 2012
guidelines. In addition, shared task participants will be invited to
submit short papers (4-6 pages) describing their systems or their
evaluation metrics. Both submission and review processes will be handled
electronically.
We encourage individuals who are submitting research papers to evaluate
their approaches using the training resources provided by this workshop
and past workshops, so that their experiments can be repeated by others
using these publicly available corpora.
ANNOUNCEMENTS
Subscribe to the announcement list for WMT12. This list will be used to announce when the test sets are released, to indicate any corrections to the training sets, and to amend the deadlines as needed.
You can read past announcements on the Google Groups page for WMT12, which also includes an archive of announcements from WMT10 and WMT11.
ORGANIZERS
Chris Callison-Burch (Johns Hopkins University)
Philipp Koehn (University of Edinburgh)
Christof Monz (University of Amsterdam)
Matt Post (Johns Hopkins University)
Radu Soricut (SDL Language Weaver)
Lucia Specia (University of Sheffield)
PROGRAM COMMITTEE
- Steve Abney (University of Michigan)
- Lars Ahrenberg (Linkoeping University)
- Fazil Ayan (SRI International)
- Oliver Bender (RWTH Aachen)
- Nicola Bertoldi (FBK)
- Alexandra Birch (University of Edinburgh)
- Arianna Bisazza (FBK)
- Graeme Blackwood (IBM)
- Ondrej Bojar (Charles University)
- Chris Brockett (Microsoft)
- Michael Carl (Saarland University)
- Marine Carpuat (Columbia University)
- Francisco Casacuberta (University of Valencia)
- Daniel Cer (Stanford University)
- Mauro Cettolo (FBK)
- Boxing Chen (National Research Council Canada)
- Colin Cherry (National Research Council Canada)
- David Chiang (ISI)
- Steve DeNeefe (SDL Language Weaver)
- Michael Denkowski (Carnegie Mellon University)
- Markus Dreyer (SDL Language Weaver)
- Kevin Duh (NAIST)
- Chris Dyer (CMU)
- Yang Feng (Sheffield University)
- Andrew Finch (NICT)
- Jose Fonollosa (University of Catalonia)
- George Foster (National Research Council Canada)
- Alex Fraser (University of Stuttgart)
- Michel Galley (Microsoft)
- Niyu Ge (IBM)
- Ulrich Germann (University of Toronto)
- Daniel Gildea (University of Rochester)
- Kevin Gimpel (CMU)
- Cyril Goutte (National Research Council Canada)
- Barry Haddow (University of Edinburgh)
- Keith Hall (Google)
- Greg Hanneman (Carnegie Mellon University)
- Christian Hardmeier (Uppsala University)
- Xiaodong He (Microsoft)
- Yifan He (Dublin City University)
- Kenneth Heafield (Carnegie Mellon University)
- John Henderson (MITRE)
- Hieu Hoang (University of Edinburgh)
- Young-Sook Hwang (SK Telecom)
- Gonzalo Iglesias (University of Cambridge)
- Pierre Isabelle (National Research Council Canada)
- Abe Ittycheriah (IBM)
- Howard Johnson (National Research Council Canada)
- Doug Jones (Lincoln Labs)
- Damianos Karakos (Johns Hopkins University)
- Maxim Khalilov (TAUS)
- Kevin Knight (ISI)
- Greg Kondrak (University of Alberta)
- Roland Kuhn (National Research Council Canada)
- Shankar Kumar (Google)
- Philippe Langlais (University of Montreal)
- Gregor Leusch (SAIC)
- Zhifei Li (Google)
- Qun Liu (Chinese Academy of Sciences)
- Shujie Liu (Harbin Institute of Technology)
- Zhanyi Liu (Harbin Institute of Technology)
- Klaus Macherey (Google)
- Wolfgang Macherey (Google)
- Daniel Marcu (ISI)
- Jose Marino (University of Catalonia)
- Lambert Mathias (JHU)
- Spyros Matsoukas (Raytheon BBN Technologies)
- Arne Mauser (RWTH Aachen)
- Yashar Mehdad (FBK)
- Arul Menezes (Microsoft)
- Shachar Mirkin (Xerox)
- Bob Moore (Google)
- Dragos Munteanu (SDL Language Weaver)
- Markos Mylonakis (Xerox)
- Preslav Nakov (National University of Singapore)
- Vassilina Nikoulina (Xerox)
- Kemal Oflazer (CMU)
- Sergio Penkale (Dublin City University)
- Kay Peterson (NIST)
- Daniele Pighin (University of Catalonia)
- Maja Popovic (DFKI)
- Chris Quirk (Microsoft)
- Stefan Riezler (University of Heidelberg)
- Marta Ruiz Costa-Jussa (University of Catalonia)
- Felipe Sanchez-Martinez (University of Alicante)
- Anoop Sarkar (Simon Fraser University)
- Wade Shen (Lincoln Labs)
- Joerg Tiedemann (Uppsala University)
- Christoph Tillmann (IBM)
- Roy Tromble (Google)
- Dan Tufis (Romanian Academy)
- Jakob Uszkoreit (Google)
- Masao Utiyama (NICT)
- David Vilar (RWTH Aachen)
- Martin Volk (University of Zurich)
- Clare Voss (Army Research Labs)
- Haifeng Wang (Baidu)
- Taro Watanabe (NICT)
- Ralph Weischedel (Raytheon BBN Technologies)
- Hua Wu (Baidu)
- Ning Xi (Nanjing University)
- Peng Xu (Google)
- Francois Yvon (LIMSI)
- Daniel Zeman (Charles University)
- Richard Zens (Google)
- Bing Zhang (Raytheon BBN Technologies)
- Hao Zhang (Google)
- Joy Zhang (CMU)
- Josef van Genabith (Dublin City University)
CONTACT
For questions, comments, etc. please send email
to pkoehn@inf.ed.ac.uk.
Supported by the EuroMatrixPlus project (FP7-IST-231720-STP), funded by the European Commission under Framework Programme 7.