Shared Task: Triangular MT: Using English to improve Russian-to-Chinese machine translation

Task Description

Given a low-resource language pair (X/Y), the bulk of previous MT work has pursued one of two strategies.

However, there are many other possible strategies for combining such resources. These may involve, for example, ensemble methods, multi-source training methods, multi-target training methods, or novel data augmentation methods.

The goals of this shared task are to promote:

Task: Russian-to-Chinese machine translation

We provide three parallel corpora: 

We evaluate system translations on a (secret) mixed-genre test set, drawn from the web and curated for high-quality segment pairs. After receiving the test data, participants have one week to submit translations. Once all submissions are received, we will post a populated leaderboard that will continue to accept post-evaluation submissions.

The evaluation metric for the shared task is 4-gram character Bleu.

The script to be used for Bleu computation is here (almost identical to the one in Moses, with a few minor differences). Instructions for running the script are in the baseline code that we released for the shared task (link).
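To illustrate what 4-gram character Bleu measures, here is a minimal sketch in Python. It is not the official scoring script, and its scores may differ from the official ones. It assumes a single reference per segment, drops whitespace before counting characters, and applies no smoothing; the function names char_ngrams and char_bleu are hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams in text, ignoring whitespace (an assumption)."""
    chars = [c for c in text if not c.isspace()]
    return Counter(tuple(chars[i:i + n]) for i in range(len(chars) - n + 1))


def char_bleu(hypotheses, references, max_n=4):
    """Corpus-level character Bleu, one reference per segment, no smoothing."""
    matches = [0] * max_n   # clipped n-gram matches, per order
    totals = [0] * max_n    # hypothesis n-gram counts, per order
    hyp_len = 0
    ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += sum(1 for c in hyp if not c.isspace())
        ref_len += sum(1 for c in ref if not c.isspace())
        for n in range(1, max_n + 1):
            hyp_counts = char_ngrams(hyp, n)
            ref_counts = char_ngrams(ref, n)
            # Counter intersection keeps the minimum count of each n-gram,
            # which is exactly the "clipped" match count used by Bleu.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the 1- to max_n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the reference.
    if hyp_len > ref_len:
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    hyps = ["我喜欢猫和狗"]
    refs = ["我喜欢猫和鸟"]
    print(f"character Bleu: {char_bleu(hyps, refs):.4f}")
```

Scoring at the character level avoids depending on a Chinese word segmenter, which is one reason character-based metrics are a common choice for Chinese output. For official results, use the released script rather than this sketch.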

Participate

To participate, please register for the shared task on Codalab.

Link to Codalab website.

Important Dates

Contacts

Chair: Ajay Nagesh (DiDi Labs, USA)
Email: ajaynagesh@didiglobal.com   

Organizers

  • Arkady Arkhangorodsky (DiDi Labs, USA)
  • Ajay Nagesh, Chair (DiDi Labs, USA)
  • Kevin Knight (DiDi Labs, USA)

Acknowledgments: 

Thanks to Didi Chuxing for providing data and research time to support this shared task.