Shared Task: Large-Scale Machine Translation Evaluation for African Languages


Machine translation research has traditionally placed an outsized focus on a small number of languages, mostly belonging to the Indo-European family. Progress for many other languages, some with millions of speakers, has been held back by data scarcity. An inspiring recent trend has been the increased attention paid to low-resource languages. However, these modelling efforts have been hindered by the lack of high-quality, standardised evaluation benchmarks.

For the second edition of the Large-Scale MT shared task, we aim to bring the community together on the topic of machine translation for a set of 24 African languages. We do so by introducing a high-quality benchmark, paired with a fair and rigorous evaluation procedure.

Task Description

The shared task will consist of three tracks.


Full list of languages

Focus languages:
Afrikaans - afr
Amharic - amh
Chichewa - nya
Nigerian Fulfulde - fuv
Hausa - hau
Igbo - ibo
Kamba - kam
Kinyarwanda - kin
Lingala - lin
Luganda - lug
Luo - luo
Northern Sotho - nso
Oromo - orm
Shona - sna
Somali - som
Swahili - swh
Swati - ssw
Tswana - tsn
Umbundu - umb
Wolof - wol
Xhosa - xho
Xitsonga - tso
Yoruba - yor
Zulu - zul

Colonial linguae francae: English - eng, French - fra
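For convenience, the language codes listed above can be collected into a simple lookup table. This is an illustrative sketch only; the variable names below are our own, not part of the task specification.

```python
# Focus languages for the shared task, keyed by the codes listed above.
FOCUS_LANGUAGES = {
    "afr": "Afrikaans", "amh": "Amharic", "nya": "Chichewa",
    "fuv": "Nigerian Fulfulde", "hau": "Hausa", "ibo": "Igbo",
    "kam": "Kamba", "kin": "Kinyarwanda", "lin": "Lingala",
    "lug": "Luganda", "luo": "Luo", "nso": "Northern Sotho",
    "orm": "Oromo", "sna": "Shona", "som": "Somali",
    "swh": "Swahili", "ssw": "Swati", "tsn": "Tswana",
    "umb": "Umbundu", "wol": "Wolof", "xho": "Xhosa",
    "tso": "Xitsonga", "yor": "Yoruba", "zul": "Zulu",
}

# Colonial linguae francae.
PIVOT_LANGUAGES = {"eng": "English", "fra": "French"}
```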


>>> Submission instructions and details HERE <<<

Due to computational and budgetary constraints, human evaluation will be conducted on a small set of language pairs from the FLORES-101 dataset. You can download it using this script. Specifically, we will evaluate on the following 100 language pairs:

Automatic Metrics: The systems will be evaluated on a suite of automatic metrics:

Participants are encouraged, but not required, to handle all language pairs; submissions covering only a subset of pairs will be admissible.
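The metric suite is not enumerated here. As a rough illustration of the character n-gram F-scores (such as chrF) commonly used in MT evaluation, the toy sketch below computes a simplified sentence-level variant. The function name, the beta=2 default, and the simplifications are our own assumptions; official scoring should use a standard implementation such as sacreBLEU.

```python
from collections import Counter

def char_ngrams(text, n):
    # Character n-grams with whitespace removed, as in chrF.
    s = text.replace(" ", "")
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Toy character n-gram F-score, loosely following the chrF idea.

    Illustrative only: not the official metric or its exact averaging.
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if sum(hyp.values()) == 0 or sum(ref.values()) == 0:
            continue  # no n-grams of this order in one of the strings
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

A perfect match scores 1.0 and a fully disjoint pair scores 0.0, with partial overlaps in between.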

Data track

The Data track focuses on the contribution of novel corpora. Participants may submit monolingual, bilingual or multilingual datasets relevant to the training of MT models for this year’s set of languages.

Data track: Submissions

Data track: Evaluation

Data track: Paper submission

The Data track will require the submission of either an extended abstract (2-4 pages) or a paragraph describing the dataset, together with a datasheet [example templates: [1] [2]]. Participants who submit datasets should ensure that the data is correctly credited, giving attribution not only to the data collectors but also to the people from whom the data was originally collected.

The deadline for this submission is the same as for system description papers.

Compute grants

To facilitate work on low-resource translation and mitigate the cost of training and/or fine-tuning large models, we will provide Microsoft Azure credits so that GPU compute is less of a barrier for translation research.

To apply for credits please fill in this brief form.

Important dates


Interested in the task? Please join the WMT Google group for any further questions or comments.


Antonios Anastasopoulos, George Mason University
Vukosi Marivate, University of Pretoria, Masakhane NLP, Deep Learning Indaba
David Adelani, Saarland University, Masakhane NLP
Marta R. Costa-jussà, Meta AI
Paco Guzmán, Meta AI
Jean Maillard, Meta AI
Safiyyah Saleem, Meta AI
Holger Schwenk, Meta AI
Natalia Fedorova, Toloka AI
Sergey Koshelev, Toloka AI
Akshita Bhagia, AI2
Jesse Dodge, AI2
Md Mahfuz ibn Alam, George Mason University
Jonathan Mbuya, George Mason University
Fahim Faisal, George Mason University