ACL 2008 THIRD WORKSHOP
ON STATISTICAL MACHINE TRANSLATION
Shared Task: Machine Translation for European Languages
June 19, in conjunction with ACL 2008 in Columbus, Ohio
The results of the shared task are summarized in the paper:
Further Meta-Evaluation of Machine Translation
Chris Callison-Burch, Cameron Fordyce, Philipp Koehn, Christof Monz and Josh Schroeder
[pdf] [ps] [bib]
Available for download here:
- All system submissions and src-ref files for all tasks/systems, in both plain text and XML (48MB)
- System submissions and src-ref files for all human-evaluated tasks/systems, in plain text only (21MB)
- Human judgments, comma delimited (199KB)
- Constituent information and tokenized submissions and src-ref files (23MB)
- System-level human and automatic metric scores for many of the submitted systems (16KB)
Format of the CSV judgment data columns:
- Task, e.g. WMT08 Czech-English News Commentary
- Type (RANK, CONSTITUENT, or CONSTITUENT_ACCEPT)
- Item ID (sentence number or constituent number; see below for information on constituent numbers)
- Annotator ID (numerical ID, since annotators were anonymized)
- Time spent on annotation (in seconds)
- System judgments (up to 5), each consisting of:
  - System name, e.g. uedin
  - Score, e.g. 2 (see below for score definitions for each test type)
  - Additional score (blank this year; was fluency for the NIST test type in WMT 07)
Score column information:
- RANK: 1 to 5, where 1 is 'Best' and 5 is 'Worst'
- CONSTITUENT: 1 to 5, where 1 is 'Best' and 5 is 'Worst'
- CONSTITUENT_ACCEPT: 1 to 3, where 1 is 'Yes', 2 is 'No', and 3 is 'Not Sure'
Item ID column information:
- RANK: Item ID corresponds to the line in the test set, starting with 0.
- CONSTITUENT and CONSTITUENT_ACCEPT: Item ID corresponds to the line in the constituent file for that task.
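The CSV layout above can be read with a short parser. This is a minimal sketch, assuming each of the up-to-5 system judgments occupies three consecutive columns (name, score, additional score); the sample row and the system names in it are illustrative, not taken from the real data files.

```python
import csv
import io

# Hypothetical sample row following the column layout described above;
# the values and system names are made up for illustration.
sample = ('WMT08 Czech-English News Commentary,RANK,12,3,45,'
          'uedin,2,,systemB,4,')

def parse_judgment(line):
    """Split one CSV judgment row into its fixed fields plus the
    variable-length list of (system, score, additional-score) triples."""
    fields = next(csv.reader(io.StringIO(line)))
    task, jtype, item_id, annotator_id, seconds = fields[:5]
    systems = []
    rest = fields[5:]
    # Assumed packing: three columns per system judgment.
    for i in range(0, len(rest), 3):
        name, score, extra = rest[i:i + 3]
        if name:  # skip empty trailing slots
            systems.append((name, int(score), extra))
    return {
        'task': task,
        'type': jtype,
        'item_id': int(item_id),
        'annotator_id': int(annotator_id),
        'seconds': int(seconds),
        'systems': systems,
    }

row = parse_judgment(sample)
```

For a RANK row like this one, each system's score is its rank from 1 (best) to 5 (worst), per the score-column definitions above.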
Constituent files are delimited with ' ||| ' and contain the following fields:
- Sentence number (identical to RANK's Item ID)
- POS information
- src: X-Y (word positions in the src .tok file, starting with 0)
- ref: X-Y (word positions in the ref .tok file)
- system: X-Y (same as for src and ref above, repeated for each system)
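A ' ||| '-delimited constituent record can be split into these fields as follows. This is a sketch under assumptions: the sample line is invented, and whether the labels ("src:", "ref:", a system name) appear literally before each span is assumed from the description above.

```python
# Hypothetical constituent-file line in the ' ||| ' layout described
# above; all values are illustrative.
line = "7 ||| NP ||| src: 3-5 ||| ref: 2-4 ||| uedin: 3-6"

def parse_constituent(line):
    """Split a ' ||| '-delimited constituent record into sentence
    number, POS information, and labeled word-position spans."""
    fields = [f.strip() for f in line.split('|||')]
    sent_no, pos = int(fields[0]), fields[1]
    spans = {}
    for field in fields[2:]:
        name, positions = field.split(':')
        start, end = positions.strip().split('-')
        # Positions are 0-based word indices into the .tok files.
        spans[name.strip()] = (int(start), int(end))
    return sent_no, pos, spans

sent_no, pos, spans = parse_constituent(line)
```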
System-Level Human and Automatic Rankings are whitespace-delimited with the following columns:
- Metric: Rank, Const, or Yes/No for the human metrics; automatic metrics are named.
- Language pair, e.g. fr-en
- Test set: either test2008 or newstest2008
- System name
- Score: larger numbers are always better (MTER has been reversed), but may be larger than 1.0.
- System type: either smt, rbmt, or syscomb.
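The six whitespace-delimited columns above are straightforward to scan; for example, to find the top-scoring system for one metric, language pair, and test set. A minimal sketch — the metric names follow the description above, but the sample lines and scores are invented for illustration.

```python
# Hypothetical lines in the six-column layout described above
# (metric, language pair, test set, system, score, system type);
# the scores and systems are illustrative.
data = """\
BLEU fr-en test2008 uedin 0.32 smt
Rank fr-en test2008 uedin 0.58 smt
BLEU fr-en test2008 systran 0.29 rbmt
"""

def best_system(lines, metric, pair, test_set):
    """Return (system, score, type) with the highest score for the
    given metric/pair/test set; larger scores are always better."""
    best = None
    for line in lines.splitlines():
        m, lp, ts, system, score, stype = line.split()
        if (m, lp, ts) == (metric, pair, test_set):
            if best is None or float(score) > best[1]:
                best = (system, float(score), stype)
    return best

top = best_system(data, 'BLEU', 'fr-en', 'test2008')
```

Because all scores are oriented so that larger is better (including the reversed MTER), a single max comparison works uniformly across metrics.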
Supported by the EuroMatrix project (P6-IST-5-034291-STP), funded by the European Commission under Framework Programme 6.