NAACL 2006 WORKSHOP
ON STATISTICAL MACHINE TRANSLATION

Shared Task: Exploiting Parallel Texts for Statistical Machine Translation

Results


You can find a detailed analysis and summary of the official entries to the shared task in the proceedings. If you refer to these results or use this data, please cite:
@InProceedings{koehn-monz:2006:WMT,
  author    = {Koehn, Philipp  and  Monz, Christof},
  title     = {Manual and Automatic Evaluation of Machine Translation between European Languages},
  booktitle = {Proceedings on the Workshop on Statistical Machine Translation},
  month     = {June},
  year      = {2006},
  address   = {New York City},
  publisher = {Association for Computational Linguistics},
  pages     = {102--121},
}
This page contains automatic scores for the extended submissions and a link to the judgment data produced by the human annotators.

Extended submissions

ID           Participant
cmu          Carnegie Mellon University, USA (report)
lcc          Language Computer Corporation, USA (report)
ms           Microsoft, USA (report)
nrc          National Research Council, Canada (report)
ntt          Nippon Telegraph and Telephone, Japan (report)
rali         RALI, University of Montreal, Canada (report)
systran      Systran, France
uedin-birch  University of Edinburgh, UK --- Alexandra Birch (report)
uedin-phi    University of Edinburgh, UK --- Philipp Koehn (report)
upc-jg       University of Catalonia, Spain --- Jesús Giménez (report)
upc-jmc      University of Catalonia, Spain --- Josep Maria Crego (report)
upc-mr       University of Catalonia, Spain --- Marta Ruiz Costa-jussà (report)
upv          University of Valencia, Spain (report)
utd          University of Texas at Dallas, USA (report)

In the tables below, unofficial submissions are listed without ranks or manual judgments. All submissions are available for download.
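For readers who want to process the scores programmatically, the following is a minimal Python sketch for reading a single table cell such as `+0.19±0.08 (1-7)`, `29.62±0.84 (8)`, or an unranked `29.91±0.85`. The names `Cell`, `CELL_RE`, and `parse_cell` are hypothetical and not part of any released tool; the sketch also assumes that a lone `-` marks a missing value, and it stores the number after `±` as `margin` without interpreting it.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Hypothetical pattern: value, "±", margin, and an optional rank or rank range.
CELL_RE = re.compile(
    r"^([+-]?\d+\.\d+)±(\d+\.\d+)(?:\s*\((\d+)(?:-(\d+))?\))?$"
)


class Cell(NamedTuple):
    value: float                      # number before "±"
    margin: float                     # number after "±" (left uninterpreted)
    rank: Optional[Tuple[int, int]]   # (best, worst) rank, or None if unranked


def parse_cell(text: str) -> Optional[Cell]:
    """Parse one table cell; return None for an empty or '-' cell."""
    text = text.strip()
    if text in ("", "-"):
        return None
    match = CELL_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized cell: {text!r}")
    value, margin, low, high = match.groups()
    rank = None
    if low is not None:
        # A single rank such as "(8)" is stored as the range (8, 8).
        rank = (int(low), int(high) if high is not None else int(low))
    return Cell(float(value), float(margin), rank)


print(parse_cell("+0.19±0.08 (1-7)"))
print(parse_cell("29.91±0.85"))
```

Storing a single rank as a one-element range keeps official rows uniform, while unofficial rows are distinguishable by `rank is None`.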

French-English

Submission   In-domain                                                Out-of-domain
             Adequacy (rank)    Fluency (rank)     BLEU (rank)        Adequacy (rank)    Fluency (rank)     BLEU (rank)
upc-jmc      +0.19±0.08 (1-7)   +0.09±0.08 (1-8)   30.42±0.86 (1-6)   +0.23±0.09 (1-5)   +0.13±0.11 (1-8)   21.79±0.92 (1-4)
lcc          +0.14±0.07 (1-6)   +0.13±0.06 (1-7)   30.81±0.85 (1-4)   +0.13±0.12 (1-9)   +0.11±0.11 (1-9)   21.77±0.88 (1-5)
utd          +0.13±0.08 (1-7)   +0.14±0.07 (1-6)   30.53±0.87 (2-7)   +0.04±0.10 (1-9)   +0.01±0.10 (1-8)   21.39±0.94 (3-7)
upc-mr       +0.13±0.08 (1-8)   +0.13±0.07 (1-6)   30.33±0.88 (1-7)   +0.12±0.12 (2-8)   +0.11±0.10 (1-7)   21.95±0.94 (1-3)
nrc          +0.12±0.10 (1-7)   +0.06±0.11 (2-6)   29.62±0.84 (8)     -0.03±0.14 (3-8)   +0.00±0.11 (3-9)   21.15±0.86 (3-7)
nrc2         -                  -                  29.91±0.85         -                  -                  20.25±0.83
nrc3         -                  -                  30.21±0.88         -                  -                  20.76±0.82
ntt          +0.11±0.08 (1-8)   +0.14±0.08 (2-8)   30.72±0.87 (1-7)   -0.02±0.12 (3-9)   +0.08±0.11 (1-9)   21.34±0.85 (3-7)
ntt2         -                  -                  30.03±0.85         -                  -                  20.55±0.86
cmu          +0.10±0.08 (3-7)   +0.05±0.07 (4-8)   30.18±0.80 (2-7)   +0.22±0.11 (1-8)   +0.13±0.09 (1-9)   21.15±0.86 (4-7)
rali         -0.02±0.08 (5-8)   +0.00±0.08 (3-9)   30.39±0.91 (3-7)   -0.09±0.12 (4-9)   -0.10±0.11 (5-9)   20.17±0.85 (8)
systran      -0.08±0.09 (9)     -0.17±0.09 (8-9)   21.44±0.65 (10)    +0.19±0.15 (1-8)   +0.15±0.14 (1-7)   19.42±0.82 (9)
upv          -0.76±0.09 (10)    -0.52±0.09 (10)    24.10±0.89 (9)     -0.76±0.16 (10)    -0.58±0.14 (10)    15.55±0.79 (10)
uedin-phi    -                  -                  31.94±0.86         -                  -                  22.50±0.92

Spanish-English

Submission   In-domain                                                Out-of-domain
             Adequacy (rank)    Fluency (rank)     BLEU (rank)        Adequacy (rank)    Fluency (rank)     BLEU (rank)
upc-jmc      +0.15±0.08 (1-7)   +0.18±0.08 (1-6)   31.01±0.97 (1-5)   +0.28±0.10 (1-2)   +0.17±0.10 (1-6)   27.92±0.94 (1-3)
ntt          +0.10±0.08 (1-7)   +0.10±0.08 (1-8)   31.29±0.88 (1-5)   +0.11±0.10 (2-7)   +0.17±0.10 (2-6)   26.85±0.89 (3-4)
ntt2         -                  -                  30.33±0.83         -                  -                  25.07±0.99
lcc          +0.08±0.07 (1-8)   +0.04±0.06 (2-8)   31.46±0.87 (1-4)   +0.04±0.10 (4-9)   +0.07±0.11 (3-7)   27.18±0.92 (1-4)
utd          +0.08±0.06 (1-8)   +0.08±0.07 (2-7)   31.10±0.89 (1-5)   +0.03±0.11 (2-9)   +0.03±0.10 (2-8)   27.41±0.96 (1-3)
nrc          +0.06±0.10 (2-8)   +0.08±0.07 (1-9)   30.04±0.79 (6)     +0.18±0.16 (2-8)   +0.09±0.09 (1-8)   25.40±0.94 (5-7)
nrc3         -                  -                  30.39±0.84         -                  -                  25.89±0.85
upc-mr       +0.06±0.07 (1-8)   +0.08±0.07 (1-6)   29.43±0.83 (7)     +0.08±0.11 (2-8)   +0.10±0.10 (1-7)   25.62±0.87 (5-8)
upc-mr2      -                  -                  30.62±0.86         -                  -                  28.25±0.93
uedin-birch  +0.03±0.11 (1-8)   -0.07±0.15 (2-10)  29.01±0.81 (8)     +0.25±0.16 (1-7)   +0.18±0.19 (1-6)   25.20±0.91 (5-8)
rali         +0.00±0.07 (3-9)   -0.02±0.07 (3-9)   30.80±0.87 (2-5)   -0.09±0.11 (4-9)   -0.15±0.11 (6-9)   25.03±0.91 (6-8)
upc-jg       -0.10±0.07 (7-9)   -0.11±0.07 (6-9)   28.03±0.83 (9)     -0.09±0.11 (4-9)   -0.09±0.09 (7-9)   23.42±0.87 (9)
upv          -0.45±0.10 (10)    -0.41±0.10 (9-10)  23.91±0.83 (10)    -0.63±0.14 (10)    -0.47±0.11 (10)    19.17±0.78 (10)
uedin-phi    -                  -                  32.37±0.88         -                  -                  28.35±0.93

German-English

Submission   In-domain                                                Out-of-domain
             Adequacy (rank)    Fluency (rank)     BLEU (rank)        Adequacy (rank)    Fluency (rank)     BLEU (rank)
uedin-phi    +0.30±0.09 (1-2)   +0.33±0.08 (1)     27.30±0.86 (1)     +0.22±0.09 (1-6)   +0.21±0.10 (1-7)   18.87±0.84 (1)
lcc          +0.15±0.07 (2-7)   +0.12±0.07 (2-7)   25.97±0.81 (2)     +0.18±0.10 (1-6)   +0.20±0.10 (1-7)   17.96±0.79 (2-3)
nrc          +0.12±0.07 (2-7)   +0.14±0.07 (2-6)   24.54±0.80 (5-7)   +0.04±0.10 (3-8)   +0.04±0.09 (2-8)   15.93±0.76 (7-8)
nrc3         -                  -                  24.41±0.77         -                  -                  16.28±0.74
utd          +0.08±0.07 (3-7)   +0.01±0.08 (2-8)   25.44±0.85 (3-4)   +0.08±0.09 (2-7)   +0.07±0.08 (2-6)   16.97±0.76 (4-6)
ntt          +0.07±0.08 (2-9)   +0.06±0.09 (2-8)   25.64±0.83 (3-4)   +0.07±0.12 (1-9)   +0.21±0.13 (1-7)   17.37±0.76 (3-5)
ntt2         -                  -                  25.01±0.79         -                  -                  17.25±0.76
upc-mr       +0.00±0.09 (3-9)   -0.21±0.09 (6-9)   23.68±0.79 (8)     +0.02±0.10 (4-8)   -0.11±0.09 (6-8)   16.89±0.79 (4-6)
rali         -0.01±0.06 (4-9)   +0.00±0.07 (3-9)   24.60±0.80 (5-7)   -0.14±0.08 (8-9)   -0.14±0.08 (8-9)   15.22±0.69 (8-9)
upc-jmc      -0.02±0.09 (2-9)   -0.04±0.09 (3-9)   24.43±0.86 (5-7)   -0.01±0.10 (4-8)   -0.04±0.11 (3-9)   17.57±0.80 (2-5)
systran      -0.05±0.10 (3-9)   -0.05±0.09 (3-9)   15.86±0.59 (10)    +0.30±0.12 (1-4)   +0.21±0.12 (1-4)   15.56±0.71 (7-9)
upv          -0.55±0.09 (10)    -0.38±0.08 (10)    18.08±0.77 (9)     -0.64±0.11 (10)    -0.54±0.09 (10)    11.78±0.71 (10)

English-French

Submission   In-domain                                                Out-of-domain
             Adequacy (rank)    Fluency (rank)     BLEU (rank)        Adequacy (rank)    Fluency (rank)     BLEU (rank)
nrc          +0.08±0.09 (1-5)   +0.09±0.09 (1-5)   31.75±0.83 (1-6)   -0.13±0.13 (4-7)   -0.16±0.10 (4-7)   23.66±0.91 (2-5)
nrc2         -                  -                  30.81±0.82         -                  -                  23.27±0.82
nrc3         -                  -                  31.03±0.85         -                  -                  23.34±0.86
upc-mr       +0.08±0.08 (1-4)   +0.04±0.07 (1-5)   31.50±0.76 (1-6)   +0.09±0.11 (2-4)   +0.04±0.09 (2-4)   23.21±0.75 (2-6)
upc-jmc      +0.03±0.09 (1-6)   +0.02±0.08 (1-6)   31.75±0.78 (1-5)   +0.09±0.11 (2-5)   +0.09±0.11 (2-4)   23.30±0.75 (2-6)
systran      -0.01±0.12 (2-7)   +0.06±0.12 (1-6)   25.07±0.71 (7)     +0.50±0.20 (1)     +0.41±0.18 (1)     25.31±0.88 (1)
utd          -0.03±0.07 (3-7)   -0.05±0.07 (3-7)   31.42±0.85 (3-6)   -0.02±0.11 (2-6)   -0.05±0.09 (2-6)   22.79±0.86 (7)
rali         -0.08±0.09 (1-7)   -0.09±0.09 (2-7)   31.79±0.85 (1-6)   -0.12±0.12 (4-7)   -0.17±0.12 (5-7)   23.34±0.89 (2-6)
ntt          -0.09±0.09 (4-7)   -0.06±0.08 (4-7)   31.92±0.84 (1-5)   -0.23±0.12 (4-7)   -0.06±0.10 (4-7)   22.99±0.96 (3-6)
ntt2         -                  -                  30.79±0.78         -                  -                  21.44±0.90
uedin-phi    -                  -                  33.66±0.81         -                  -                  25.26±0.91

English-Spanish

Submission   In-domain                                                Out-of-domain
             Adequacy (rank)    Fluency (rank)     BLEU (rank)        Adequacy (rank)    Fluency (rank)     BLEU (rank)
ms           +0.23±0.09 (1-5)   +0.13±0.09 (1-7)   29.76±0.82 (7-8)   +0.33±0.16 (1-7)   +0.15±0.13 (1-8)   26.15±0.88 (6-7)
upc-mr       +0.20±0.09 (1-4)   +0.17±0.09 (1-5)   31.06±0.86 (1-4)   +0.35±0.11 (1-3)   +0.19±0.10 (1-6)   26.62±0.92 (1-2)
utd          +0.18±0.08 (1-5)   +0.15±0.08 (1-6)   30.73±0.90 (1-4)   +0.21±0.13 (2-6)   +0.13±0.11 (1-7)   25.26±0.78 (3-5)
nrc          +0.12±0.09 (2-7)   +0.17±0.08 (1-6)   29.97±0.86 (5-6)   +0.18±0.12 (1-6)   +0.07±0.11 (2-7)   25.58±0.85 (3-5)
nrc3         -                  -                  30.13±0.87         -                  -                  26.00±0.83
ntt          +0.10±0.09 (3-7)   +0.14±0.08 (1-6)   30.93±0.85 (1-4)   +0.12±0.13 (2-7)   +0.12±0.13 (1-7)   26.52±0.90 (1-2)
ntt2         -                  -                  28.37±0.75         -                  -                  22.59±0.84
upc-jmc      +0.04±0.10 (2-7)   +0.01±0.08 (2-7)   30.44±0.86 (1-4)   +0.17±0.15 (2-7)   +0.24±0.12 (1-6)   25.59±0.95 (3-5)
rali         -0.05±0.08 (5-8)   -0.03±0.08 (6-8)   29.38±0.85 (5-6)   -0.17±0.16 (6-8)   -0.05±0.13 (4-8)   24.03±0.83 (6-8)
uedin-birch  -0.18±0.14 (6-9)   -0.17±0.13 (6-10)  28.49±0.87 (7-8)   -0.36±0.24 (6-10)  -0.16±0.16 (5-9)   23.18±0.88 (7-8)
upc-jg       -0.32±0.11 (9)     -0.37±0.09 (8-10)  27.46±0.78 (9)     -0.45±0.13 (8-9)   -0.42±0.10 (9-10)  22.04±0.84 (9)
upv          -0.83±0.15 (9-10)  -0.59±0.15 (8-10)  23.17±0.73 (10)    -1.09±0.21 (9)     -0.64±0.19 (8-9)   16.83±0.72 (10)
uedin-phi    -                  -                  31.85±0.85         -                  -                  27.76±0.88

English-German

Submission   In-domain                                                Out-of-domain
             Adequacy (rank)    Fluency (rank)     BLEU (rank)        Adequacy (rank)    Fluency (rank)     BLEU (rank)
upc-mr       +0.28±0.08 (1-3)   +0.14±0.08 (1-5)   17.24±0.81 (3-5)   +0.31±0.13 (2-3)   +0.21±0.11 (1-3)   10.96±0.70 (1-5)
ntt          +0.19±0.08 (1-5)   +0.09±0.06 (2-6)   18.15±0.89 (1-3)   -0.03±0.12 (4-6)   +0.08±0.11 (3-5)   10.51±0.64 (1-6)
ntt2         -                  -                  18.13±0.81         -                  -                  11.01±0.64
upc-jmc      +0.17±0.08 (1-5)   +0.13±0.08 (1-4)   17.73±0.81 (1-3)   +0.22±0.14 (2-3)   +0.01±0.10 (3-6)   10.64±0.66 (1-6)
nrc          +0.17±0.08 (2-4)   +0.11±0.08 (1-5)   17.52±0.78 (4-5)   +0.00±0.11 (4-6)   +0.05±0.09 (2-6)   10.64±0.65 (2-6)
nrc3         -                  -                  17.44±0.83         -                  -                  10.82±0.57
rali         +0.08±0.10 (3-6)   +0.03±0.09 (2-6)   17.93±0.85 (1-4)   +0.13±0.12 (4-6)   -0.06±0.10 (4-6)   10.57±0.65 (1-6)
systran      -0.08±0.11 (5-6)   +0.00±0.10 (3-6)   9.84±0.52 (7)      +0.47±0.15 (1)     +0.39±0.15 (1-2)   10.78±0.69 (1-6)
upv          -0.84±0.12 (7)     -0.51±0.10 (7)     13.37±0.78 (6)     -0.94±0.13 (7)     -0.57±0.10 (7)     6.55±0.53 (7)
uedin-phi    -                  -                  18.85±0.83         -                  -                  11.82±0.65

Judgment data

The judgments were solicited with an online tool that, for each randomly selected sentence, presented 5 translations from the systems in random order. You can download them.

The file format is triple bar-separated lines with the fields:

Here are the first 10 lines:
WMT06 German-English ||| 1 ||| 614 ||| utd ||| ADEQUACY ||| 1 ||| 62
WMT06 German-English ||| 1 ||| 614 ||| utd ||| FLUENCY ||| 1 ||| 62
WMT06 German-English ||| 1 ||| 614 ||| nrc ||| ADEQUACY ||| 2 ||| 62
WMT06 German-English ||| 1 ||| 614 ||| nrc ||| FLUENCY ||| 4 ||| 62
WMT06 German-English ||| 1 ||| 614 ||| rali ||| ADEQUACY ||| 2 ||| 62
WMT06 German-English ||| 1 ||| 614 ||| rali ||| FLUENCY ||| 4 ||| 62
WMT06 German-English ||| 1 ||| 614 ||| upv ||| ADEQUACY ||| 2 ||| 62
WMT06 German-English ||| 1 ||| 614 ||| upv ||| FLUENCY ||| 5 ||| 62
WMT06 German-English ||| 1 ||| 614 ||| lcc ||| ADEQUACY ||| 2 ||| 62
WMT06 German-English ||| 1 ||| 614 ||| lcc ||| FLUENCY ||| 5 ||| 62
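
Lines in this format can be read with a short Python script. The sketch below is only an illustration under stated assumptions, not an official tool: the names `Judgment`, `parse_judgment`, and `mean_raw_scores` are hypothetical, and only the fields whose role is evident from the sample lines (the language pair label, the system ID, the judgment type, and the integer score) are given names. The remaining fields are kept as raw strings without interpretation.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Judgment(NamedTuple):
    task: str               # e.g. "WMT06 German-English"
    system: str             # e.g. "utd"
    criterion: str          # "ADEQUACY" or "FLUENCY" in the sample
    score: int              # integer judgment
    raw: Tuple[str, ...]    # all fields as strings, uninterpreted


def parse_judgment(line: str) -> Judgment:
    """Split one triple-bar-separated line into its fields."""
    fields = tuple(part.strip() for part in line.split("|||"))
    if len(fields) != 7:
        # Assumption: every line has 7 fields, as in the sample above.
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    return Judgment(
        task=fields[0],
        system=fields[3],
        criterion=fields[4],
        score=int(fields[5]),
        raw=fields,
    )


def mean_raw_scores(lines: Iterable[str]) -> Dict[Tuple[str, str], float]:
    """Average the raw scores per (system, criterion) pair."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum, count]
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        judgment = parse_judgment(line)
        entry = totals[(judgment.system, judgment.criterion)]
        entry[0] += judgment.score
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


sample = [
    "WMT06 German-English ||| 1 ||| 614 ||| utd ||| ADEQUACY ||| 1 ||| 62",
    "WMT06 German-English ||| 1 ||| 614 ||| utd ||| FLUENCY ||| 1 ||| 62",
    "WMT06 German-English ||| 1 ||| 614 ||| nrc ||| ADEQUACY ||| 2 ||| 62",
    "WMT06 German-English ||| 1 ||| 614 ||| nrc ||| FLUENCY ||| 4 ||| 62",
]
print(mean_raw_scores(sample))
```

These are plain averages of the raw integer judgments; they are not on the same scale as the adequacy and fluency scores reported in the tables and should not be compared with them directly.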