Transformers from scratch
Modern transformers are conceptually simple and can be explained in a fairly straightforward way.
Blog by Peter Bloem, with PyTorch code: http://peterbloem.nl/blog/transformers
#MachineLearning #PyTorch #Transformers
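A minimal, single-head self-attention sketch in PyTorch, in the spirit of the blog's from-scratch walkthrough (no learned projections; this is illustrative, not the author's exact code):

import torch
import torch.nn.functional as F

def basic_self_attention(x):
    # x: (batch, seq_len, dim); the sequence attends to itself,
    # using the raw input as queries, keys and values.
    raw = torch.bmm(x, x.transpose(1, 2))                  # (batch, seq, seq) dot products
    weights = F.softmax(raw / x.size(-1) ** 0.5, dim=-1)   # scaled and row-normalized
    return torch.bmm(weights, x)                           # weighted sum of the values

out = basic_self_attention(torch.randn(2, 5, 16))
print(out.shape)  # torch.Size([2, 5, 16])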
Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT
Blog by Victor Sanh: https://medium.com/huggingface/distilbert-8cf3380435b5
#MachineLearning #NLP #Bert #Distillation #Transformers
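A rough sketch of the soft-target distillation term (temperature-scaled KL between teacher and student), assuming a classification setup; DistilBERT's actual objective combines distillation with masked-language-modeling and cosine-embedding losses:

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-softened distributions,
    # rescaled by T^2 to keep gradient magnitudes comparable.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard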
The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives
Elena Voita, Rico Sennrich, Ivan Titov
Blog: https://lena-voita.github.io/posts/emnlp19_evolution.html
Paper: https://arxiv.org/abs/1909.01380
#ArtificialIntelligence #MachineLearning #Transformers
"Hierarchical Reinforcement Learning for Open-Domain Dialog"
Abdelrhman Saleh, Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Rosalind Picard : https://arxiv.org/abs/1909.07547
Code: https://github.com/natashamjaques/neural_chat
Bots! https://neural.chat/vhrl_techniques/
#MachineLearning #ReinforcementLearning #Transformers
Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch
By 🤗 Hugging Face: https://huggingface.co/transformers
#Transformers #MachineLearning #NLP
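A minimal usage sketch of the library's high-level pipeline API (the default sentiment model is chosen by the library and downloaded on first use):

from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("Transformers make state-of-the-art NLP easy to use."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]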
Attention? Attention!
Blog by Lilian Weng: https://lilianweng.github.io/lil-log/2018/06/24/attention-attention.html
#MachineLearning #NeuralNetwork #Transformers
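As one concrete instance of the score functions the post catalogs, a sketch of Bahdanau-style additive attention (class, layer and dimension names are assumptions, not from the post):

import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    def __init__(self, query_dim, key_dim, hidden_dim):
        super().__init__()
        self.Wq = nn.Linear(query_dim, hidden_dim, bias=False)
        self.Wk = nn.Linear(key_dim, hidden_dim, bias=False)
        self.v = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, query, keys, values):
        # query: (batch, query_dim); keys, values: (batch, seq_len, key_dim)
        scores = self.v(torch.tanh(self.Wq(query).unsqueeze(1) + self.Wk(keys)))  # (batch, seq_len, 1)
        weights = torch.softmax(scores, dim=1)    # normalize over the sequence
        context = (weights * values).sum(dim=1)   # weighted sum of the values
        return context, weights.squeeze(-1)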
Transformers: State-of-the-art Natural Language Processing
Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, Jamie Brew: https://arxiv.org/abs/1910.03771
#Transformers #NaturalLanguageProcessing #PyTorch #TensorFlow
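Below the pipeline level, the library exposes Auto* classes for tokenizers and models; a short sketch (the checkpoint name is one from the model hub, and the index is used because older library versions return tuples rather than output objects):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

inputs = tokenizer("Transformers make NLP pipelines easy.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs)[0]   # indexing works for both tuple and ModelOutput returns
print(logits.softmax(dim=-1))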
Stabilizing Transformers for Reinforcement Learning
Parisotto et al.: https://arxiv.org/abs/1910.06764
#DeepLearning #Transformers #ReinforcementLearning
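The paper's key change is replacing residual connections with identity-map-initialized, GRU-style gates; a sketch of such a gating layer (parameter names and the bias init value are assumptions):

import torch
import torch.nn as nn

class GRUGate(nn.Module):
    def __init__(self, dim, bias_init=2.0):
        super().__init__()
        self.Wr, self.Ur = nn.Linear(dim, dim, bias=False), nn.Linear(dim, dim, bias=False)
        self.Wz, self.Uz = nn.Linear(dim, dim, bias=False), nn.Linear(dim, dim, bias=False)
        self.Wg, self.Ug = nn.Linear(dim, dim, bias=False), nn.Linear(dim, dim, bias=False)
        # A positive bias keeps the gate close to the identity map at initialization,
        # which is what lets the transformer train stably in the RL setting.
        self.bg = nn.Parameter(torch.full((dim,), bias_init))

    def forward(self, x, y):
        # x: skip/residual input, y: output of the attention or feed-forward sub-layer
        r = torch.sigmoid(self.Wr(y) + self.Ur(x))
        z = torch.sigmoid(self.Wz(y) + self.Uz(x) - self.bg)
        h = torch.tanh(self.Wg(y) + self.Ug(r * x))
        return (1 - z) * x + z * h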
Language Models as Knowledge Bases?
Petroni et al.: https://arxiv.org/abs/1909.01066
#Transformers #NaturalLanguageProcessing #MachineLearning
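The paper probes pretrained language models with cloze-style fill-in-the-blank queries; the same kind of probe can be sketched with the fill-mask pipeline (assuming a recent transformers version; the checkpoint name is illustrative):

from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")
for prediction in unmasker("The capital of France is [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))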
A tutorial to implement state-of-the-art NLP models with Fastai for Sentiment Analysis
Maximilien Roberti: https://towardsdatascience.com/fastai-with-transformers-bert-roberta-xlnet-xlm-distilbert-4f41ee18ecb2
#FastAI #NLP #Transformers
On the Relationship between Self-Attention and Convolutional Layers
Jean-Baptiste Cordonnier, Andreas Loukas, Martin Jaggi: https://openreview.net/forum?id=HJlnC1rKPB
#ArtificialIntelligence #DeepLearning #Transformers
A self-attention layer can perform convolution and often learns to do so in practice.
End-to-End Object Detection with Transformers
Carion et al.: https://arxiv.org/abs/2005.12872
Colab: https://colab.research.google.com/drive/1rPm0-UrWHpJJRX9PsNb5SpzZiUlMh7wm
#ArtificialIntelligence #DeepLearning #Transformers
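A minimal inference sketch along the lines of the Colab demo, loading the pretrained model through torch.hub (the image path is hypothetical and the 0.9 confidence threshold is an assumption):

import torch
import torchvision.transforms as T
from PIL import Image

model = torch.hub.load("facebookresearch/detr", "detr_resnet50", pretrained=True)
model.eval()

transform = T.Compose([
    T.Resize(800),
    T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
img = transform(Image.open("street.jpg")).unsqueeze(0)  # hypothetical local image

with torch.no_grad():
    outputs = model(img)

# pred_logits: per-query class scores (the last class is "no object");
# pred_boxes: normalized (cx, cy, w, h) boxes for each object query.
probs = outputs["pred_logits"].softmax(-1)[0, :, :-1]
keep = probs.max(-1).values > 0.9
print(outputs["pred_boxes"][0, keep])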