Do Better ImageNet Models Transfer Better?
Finding: better ImageNet architectures tend to work better on other datasets too. Surprise: pretraining on the ImageNet dataset sometimes doesn't help very much.
ArXiv: https://arxiv.org/abs/1805.08974
#ImageNet #finetuning #transferlearning
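For context, here is a minimal sketch (not from the paper) of the transfer setting it studies: take a torchvision ResNet-50 with ImageNet weights, swap the classifier head, and fine-tune on a downstream dataset. The `num_classes` value and the dummy batch are placeholders for a real target dataset.
```python
import torch
import torch.nn as nn
import torchvision.models as models

num_classes = 10  # size of the hypothetical downstream label set

# ImageNet-pretrained backbone with a fresh classifier head for the new task
model = models.resnet50(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, num_classes)

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# Dummy batch standing in for the target dataset
target_loader = [(torch.randn(8, 3, 224, 224), torch.randint(0, num_classes, (8,)))]

# Fine-tune all weights (the paper also evaluates a "fixed features" setting,
# where the backbone is frozen and only the new head is trained)
for images, labels in target_loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```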
⭐️Fine-Tuning GPT-2 from Human Preferences
The #OpenAI team fine-tuned the 774M-parameter GPT-2 model to achieve better human-judged results on #summarization and stylistic text continuation.
The article is definitely worth reading (approx. 15 min), with a "Challenges and lessons learned" section and examples.
Link: https://openai.com/blog/fine-tuning-gpt-2/
Paper: https://arxiv.org/abs/1909.08593
Code: https://github.com/openai/lm-human-preferences
#NLP #NLU #finetuning
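For intuition, below is a minimal sketch of the reward-model step behind this kind of fine-tuning: a scalar reward model is trained on human comparisons, and the language model is then fine-tuned against it with PPO plus a KL penalty toward the original model (that RL step is omitted here). The toy bag-of-words encoder and the pairwise loss are simplifications, not OpenAI's actual code; the paper uses a GPT-2 backbone and 4-way comparisons.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Toy stand-in for a GPT-2-based reward model: text -> scalar reward."""
    def __init__(self, vocab_size=50257, dim=128):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, dim)  # toy encoder
        self.head = nn.Linear(dim, 1)

    def forward(self, token_ids):
        return self.head(self.embed(token_ids)).squeeze(-1)

reward_model = RewardModel()
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-4)

# Dummy human-preference pair: labelers preferred `chosen` over `rejected`
chosen = torch.randint(0, 50257, (4, 32))
rejected = torch.randint(0, 50257, (4, 32))

# Pairwise preference loss: push r(chosen) above r(rejected)
loss = -F.logsigmoid(reward_model(chosen) - reward_model(rejected)).mean()
loss.backward()
optimizer.step()
```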
Simple, Scalable Adaptation for Neural Machine Translation
Fine-tuning pre-trained Neural Machine Translation (NMT) models is the dominant approach for adapting to new languages and domains. However, fine-tuning requires adapting and maintaining a separate model for each target task. Researchers from Google propose a simple yet efficient approach for adaptation in #NMT: injecting tiny task-specific adapter layers into a pre-trained model. These lightweight adapters, just a small fraction of the original model size, adapt the model to multiple individual tasks simultaneously. A minimal sketch of such an adapter block is shown below.
Presumably this can be applied not only to #NMT but also to many other #NLP, #NLU and #NLG tasks.
Paper: https://arxiv.org/pdf/1909.08478.pdf
#BERT #NMT #FineTuning
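A minimal sketch of what such a bottleneck adapter can look like, assuming a PyTorch Transformer with hidden size `d_model`; the exact placement and normalization follow the paper only loosely (the paper inserts these blocks after sublayers of a frozen pre-trained NMT model and trains only the adapters per task).
```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, d_model=512, bottleneck=64):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.down = nn.Linear(d_model, bottleneck)  # project down
        self.up = nn.Linear(bottleneck, d_model)    # project back up

    def forward(self, x):
        # residual connection keeps the frozen model's representation intact
        return x + self.up(torch.relu(self.down(self.norm(x))))

# Usage: insert after each frozen Transformer sublayer, one adapter set per task
adapter = Adapter()
hidden = torch.randn(2, 10, 512)   # (batch, seq_len, d_model)
print(adapter(hidden).shape)       # torch.Size([2, 10, 512])
```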
Efficient multi-lingual language model fine-tuning
Most of the world’s text is not in English. To enable researchers and practitioners to build impactful solutions in their domains, understanding how our NLP architectures fare in many languages needs to be more than an afterthought.
The post presents the authors' latest paper, which studies multilingual text classification and introduces #MultiFiT, a novel method based on #ULMFiT.
MultiFiT, trained on 100 labeled documents in the target language, outperforms multilingual BERT. It also outperforms the cutting-edge LASER algorithm, even though LASER requires a corpus of parallel texts and MultiFiT does not.
Post: http://nlp.fast.ai/classification/2019/09/10/multifit.html
Paper: https://arxiv.org/abs/1909.04761
Tweet: https://twitter.com/seb_ruder/status/1186744388908654597?s=20
#NLP #DL #FineTuning
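For reference, a minimal sketch of the ULMFiT-style fine-tuning recipe that MultiFiT builds on: discriminative learning rates (earlier layers get smaller steps) and gradual unfreezing from the top down. The toy GRU classifier is a stand-in of my own; the real MultiFiT uses a subword QRNN language model (see the paper).
```python
import torch
import torch.nn as nn

class Classifier(nn.Module):
    def __init__(self, vocab=1000, dim=64, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x):
        h, _ = self.encoder(self.embed(x))
        return self.head(h[:, -1])

model = Classifier()
groups = [model.embed, model.encoder, model.head]  # earlier -> later layers

# Discriminative learning rates: later groups get larger steps
base_lr = 1e-3
optimizer = torch.optim.Adam(
    [{"params": g.parameters(), "lr": base_lr / (2.6 ** (len(groups) - 1 - i))}
     for i, g in enumerate(groups)]
)

# Gradual unfreezing: start with only the head trainable,
# then unfreeze one more group per fine-tuning stage, top down
for p in model.parameters():
    p.requires_grad = False
for stage in range(len(groups)):
    for p in groups[len(groups) - 1 - stage].parameters():
        p.requires_grad = True
    # ... run one fine-tuning stage here on the ~100 labeled target documents
```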