Spanish-English NMT systems

Teacher and student models for Spanish.

Teachers

newstest2013, cased BLEU

| system | en-es | es-en | comment |
|---|---|---|---|
| Best WMT13 UEdin | 30.4 | | Moses |
| Best WMT13 Kenneth | | 31.4 | Moses |
| transformer-base | 31.2 | 33.4 | |
| transformer-base, filtering | 34.5 | 35.4 | |
| transformer-big, sBT | 36.3 | 37.0 | |
| transformer-big, sBT+BT | 36.5 | - | |
| transformer-big, sBT+FT | - | 36.5 | |
| - ensemble x2 (teacher) | 36.5 | 37.0 | small improvements on WMT12, TED13 & UNv1 |

Notes:

Students

Spanish-English

| system | size | BLEU | CPU time | GPU time |
|---|---|---|---|---|
| teacher ensemble x2, beam 4 | 2x 798MB | 36.9 | -- | 123s |
| student tiny11, beam 1 | 65MB | 35.7 | 25s | 3.6s |
| student tiny11, beam 1, packed8avx512 | 46MB | 35.6 | 19s | -- |
| student tiny11, beam 1, intgemm8 | 17MB | 35.2 | 17s | -- |
| student tiny11, beam 1, intgemm8alphas | 17MB | 35.3 | 16s | -- |

English-Spanish

| system | size | BLEU | CPU time | GPU time |
|---|---|---|---|---|
| teacher ensemble x2, beam 4 | 2x 798MB | 36.5 | -- | 126s |
| student tiny11, beam 1 | 65MB | 35.1 | 24s | 3.7s |
| student tiny11, beam 1, packed8avx512 | 46MB | 34.9 | 18s | -- |
| student tiny11, beam 1, intgemm8 | 17MB | 34.8 | 16s | -- |
| student tiny11, beam 1, intgemm8alphas | 17MB | 35.0 | 15s | -- |
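
The trade-off in the student tables can be summarised as a BLEU gap and a size ratio relative to the teacher. The following is a minimal sketch that only redoes the arithmetic on the figures listed above. The names `RESULTS` and `compare_to_teacher` are hypothetical and not part of this repository, and counting the teacher as 2 x 798 MB = 1596 MB in total is an assumption about how "2x 798MB" should be read.

```python
# Figures copied from the student tables: (size in MB, BLEU).
# Assumption: the teacher ensemble is counted as 2 x 798 MB in total.
RESULTS = {
    "es-en": {
        "teacher ensemble x2": (2 * 798, 36.9),
        "tiny11": (65, 35.7),
        "tiny11, packed8avx512": (46, 35.6),
        "tiny11, intgemm8": (17, 35.2),
        "tiny11, intgemm8alphas": (17, 35.3),
    },
    "en-es": {
        "teacher ensemble x2": (2 * 798, 36.5),
        "tiny11": (65, 35.1),
        "tiny11, packed8avx512": (46, 34.9),
        "tiny11, intgemm8": (17, 34.8),
        "tiny11, intgemm8alphas": (17, 35.0),
    },
}


def compare_to_teacher(direction):
    """Return {student: (BLEU gap to teacher, teacher size / student size)}."""
    rows = RESULTS[direction]
    teacher_size, teacher_bleu = rows["teacher ensemble x2"]
    summary = {}
    for name, (size, bleu) in rows.items():
        if name == "teacher ensemble x2":
            continue
        bleu_gap = round(teacher_bleu - bleu, 1)      # BLEU points lost
        size_ratio = round(teacher_size / size, 1)    # how many times smaller
        summary[name] = (bleu_gap, size_ratio)
    return summary


if __name__ == "__main__":
    for direction in RESULTS:
        print(direction)
        for name, (gap, ratio) in compare_to_teacher(direction).items():
            print(f"  {name}: -{gap} BLEU, {ratio}x smaller")
```

Under these assumptions, the unquantised tiny11 student is about 25 times smaller than the teacher ensemble at a cost of 1.2 (es-en) and 1.4 (en-es) BLEU.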

Notes: