Forwarded from Tensorflow(@CVision) (Alireza Akhavan)
#paper
The Evolved Transformer
The authors run an architecture search over the Transformer's stackable cells for seq2seq tasks. “A much smaller, mobile-friendly, Evolved Transformer with only ~7M parameters outperforms the original Transformer by 0.7 BLEU on WMT14 EN-DE.”
https://arxiv.org/abs/1901.11117
The Evolved Transformer is twice as efficient as the Transformer in FLOPS without loss in quality.
#seq2seq