BERT Rediscovers the Classical NLP Pipeline
Tenney et al.: https://arxiv.org/abs/1905.05950
#artificialintelligence #bert #machinelearning #nlp
arXiv.org
BERT Rediscovers the Classical NLP Pipeline
Pre-trained text encoders have rapidly advanced the state of the art on many NLP tasks. We focus on one such model, BERT, and aim to quantify where linguistic information is captured within the...
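A rough sketch of the probing setup this kind of analysis builds on: read token representations from every BERT layer and learn a scalar mix over them, then train a small probe on the mixed vectors. This uses the Hugging Face transformers package rather than the authors' edge-probing code, and the probing classifier itself is omitted:

```python
# Minimal sketch: scalar mix over all BERT layers (probe classifier not shown).
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

class ScalarMix(nn.Module):
    """Learned softmax-weighted combination of all encoder layers."""
    def __init__(self, num_layers):
        super().__init__()
        self.weights = nn.Parameter(torch.zeros(num_layers))
        self.gamma = nn.Parameter(torch.ones(1))

    def forward(self, layers):  # layers: tuple of (batch, seq_len, hidden) tensors
        w = torch.softmax(self.weights, dim=0)
        return self.gamma * sum(wi * layer for wi, layer in zip(w, layers))

inputs = tokenizer("The keys to the cabinet are on the table", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# out.hidden_states: embedding layer plus one tensor per encoder layer (13 for bert-base)
mix = ScalarMix(num_layers=len(out.hidden_states))
token_reprs = mix(out.hidden_states)   # these would feed a small probing classifier
print(token_reprs.shape)               # (1, seq_len, 768)
```

The learned mixing weights are the quantity of interest: they indicate which layers a given task draws on most.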
Introducing FastBert — A simple Deep Learning library for BERT Models
Blog by Kaushal Trivedi: https://medium.com/huggingface/introducing-fastbert-a-simple-deep-learning-library-for-bert-models-89ff763ad384
#MachineLearning #ArtificialIntelligence #NLP #Bert #NaturalLanguageProcessing
Medium
Introducing FastBert — A simple Deep Learning library for BERT Models
A simple to use Deep Learning library to build and deploy BERT models
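FastBert wraps the standard BERT fine-tuning loop behind a few calls. As a point of reference, a minimal version of that loop written directly against the Hugging Face transformers API (not FastBert's own interface; the batch below is a toy placeholder) looks roughly like this:

```python
# Sketch of the fine-tuning step that libraries like FastBert wrap.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Toy batch; in practice this comes from a DataLoader over the training set.
texts = ["a delightful, sharply written film", "two hours I will never get back"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

model.train()
outputs = model(**batch, labels=labels)   # cross-entropy loss computed internally
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
print(float(outputs.loss))
```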
Visualizing and Measuring the Geometry of BERT
Coenen, Reif, Yuan et al.: https://arxiv.org/pdf/1906.02715.pdf
#ArtificialIntelligence #DeepLearning #BERT #NLP
arXiv.org
Visualizing and Measuring the Geometry of BERT
Transformer architectures show significant promise for natural language processing. Given that a single pretrained model can be fine-tuned to perform well on many different tasks, these networks...
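A quick way to poke at the geometry the paper studies is to compare contextual vectors of one word used in different senses. A small sketch (illustrative sentences, last layer only, Hugging Face transformers assumed):

```python
# Compare BERT's contextual vectors for "bank" in river vs. finance contexts.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def word_vector(sentence, word):
    """Last-layer vector of the first occurrence of `word` in `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    idx = tokens.index(word)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]
    return hidden[idx]

v_river = word_vector("he sat on the bank of the river .", "bank")
v_money = word_vector("she deposited cash at the bank .", "bank")
v_money2 = word_vector("the bank approved the loan .", "bank")

cos = torch.nn.functional.cosine_similarity
print("river vs money :", float(cos(v_river, v_money, dim=0)))
print("money vs money :", float(cos(v_money, v_money2, dim=0)))
```

Clustering or projecting such vectors (e.g. with PCA/UMAP) reproduces the kind of sense separation the paper visualizes.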
What Does BERT Look At? An Analysis of BERT's Attention
Clark et al.: https://arxiv.org/abs/1906.04341
Code: https://github.com/clarkkev/attention-analysis
#bert #naturallanguage #unsupervisedlearning
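The released code linked above does the full analysis; as a minimal sketch, the attention maps it studies can be pulled out of any BERT checkpoint like this (the layer/head indices below are arbitrary):

```python
# Extract per-head attention maps from BERT for inspection.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

inputs = tokenizer("The cat sat on the mat because it was warm.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_attentions=True)

# outputs.attentions: one tensor per layer, shape (batch, heads, seq_len, seq_len)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
layer, head = 7, 10                      # arbitrary head to inspect
attn = outputs.attentions[layer][0, head]
for i, tok in enumerate(tokens):
    top = attn[i].argmax().item()        # which token this position attends to most
    print(f"{tok:>10} -> {tokens[top]}")
```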
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Liu et al.: https://arxiv.org/abs/1907.11692
#bert #naturallanguageprocessing #unsupervisedlearning
arXiv.org
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging. Training is computationally expensive, often done on private...
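One of RoBERTa's changes over BERT is dynamic masking: tokens are re-masked every time a sequence is served rather than once during preprocessing. A simplified sketch of that step (standard 15% / 80-10-10 masking recipe; Hugging Face tokenizer assumed, padding and batching omitted):

```python
# Dynamic masking sketch: fresh mask positions on every call.
import torch
from transformers import RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

def dynamic_mask(input_ids, mask_prob=0.15):
    """Freshly mask a 1-D tensor of token ids with the 80/10/10 replacement scheme."""
    labels = input_ids.clone()
    special = torch.tensor(
        tokenizer.get_special_tokens_mask(input_ids.tolist(), already_has_special_tokens=True),
        dtype=torch.bool)
    probs = torch.full(input_ids.shape, mask_prob)
    probs.masked_fill_(special, 0.0)          # never mask <s>, </s>
    masked = torch.bernoulli(probs).bool()
    labels[~masked] = -100                    # ignored by the MLM loss
    # of the selected positions: 80% -> <mask>, 10% -> random token, 10% -> unchanged
    to_mask = torch.bernoulli(torch.full(input_ids.shape, 0.8)).bool() & masked
    input_ids[to_mask] = tokenizer.mask_token_id
    to_random = torch.bernoulli(torch.full(input_ids.shape, 0.5)).bool() & masked & ~to_mask
    input_ids[to_random] = torch.randint(len(tokenizer), input_ids.shape)[to_random]
    return input_ids, labels

ids = tokenizer("RoBERTa re-masks every sequence on the fly.", return_tensors="pt")["input_ids"][0]
print(dynamic_mask(ids.clone()))              # different positions masked on every call
```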
Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT
Blog by Victor Sanh: https://medium.com/huggingface/distilbert-8cf3380435b5
#MachineLearning #NLP #Bert #Distillation #Transformers
Medium
🏎 Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT
You can find the code to reproduce the training of DistilBERT along with pre-trained weights for DistilBERT here.
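The core of the recipe is knowledge distillation: the student is trained to match the teacher's temperature-softened output distribution alongside the regular supervised loss. A minimal, generic sketch of that objective (toy classification logits; hyperparameters illustrative, not DistilBERT's exact recipe, which also adds a cosine embedding loss and distills on masked language modelling):

```python
# Generic distillation loss: soft targets from the teacher plus hard-label loss.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # KL divergence between temperature-softened distributions, scaled by T^2
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# toy example with random logits for a 3-class problem
student = torch.randn(4, 3, requires_grad=True)
teacher = torch.randn(4, 3)
labels = torch.tensor([0, 2, 1, 0])
loss = distillation_loss(student, teacher, labels)
loss.backward()
print(float(loss))
```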
Extreme Language Model Compression with Optimal Subwords and Shared Projections
Zhao et al.: https://arxiv.org/abs/1909.11687
#neuralnetwork #bert #nlp
arXiv.org
Extremely Small BERT Models from Mixed-Vocabulary Training
Pretrained language models like BERT have achieved good results on NLP tasks, but are impractical on resource-limited devices due to memory footprint. A large fraction of this footprint comes from...
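The abstract's point about the embedding table is easy to check with back-of-the-envelope numbers; a tiny sketch (standard BERT-base shapes, fp32 parameters assumed, the reduced vocabulary size is a hypothetical example, not the paper's setting):

```python
# Rough footprint arithmetic: how much of BERT-base is the subword embedding table.
hidden = 768
vocab_bert = 30522          # BERT-base WordPiece vocabulary
total_params = 110e6        # approximate BERT-base parameter count

emb_params = vocab_bert * hidden
print(f"embedding table: {emb_params/1e6:.1f}M params "
      f"({100*emb_params/total_params:.0f}% of ~110M)")

vocab_small = 5000          # hypothetical reduced vocabulary
saved = (vocab_bert - vocab_small) * hidden
print(f"shrinking vocab to {vocab_small}: saves about {saved*4/1e6:.0f} MB at fp32")
```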
The Illustrated GPT-2 (Visualizing Transformer Language Models)
Blog by Jay Alammar: https://jalammar.github.io/illustrated-gpt2/
#BERT #Transformer #ArtificialIntelligence
jalammar.github.io
The Illustrated GPT-2 (Visualizing Transformer Language Models)
Discussions:
Hacker News (64 points, 3 comments), Reddit r/MachineLearning (219 points, 18 comments)
Translations: Simplified Chinese, French, Korean, Russian, Turkish
This year, we saw a dazzling application of machine learning. The OpenAI GPT…
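To go with the visual walkthrough, here is the model in actual use: sampling a continuation from the small GPT-2 checkpoint via Hugging Face transformers (prompt and sampling settings are arbitrary):

```python
# Sample a continuation from GPT-2 (small checkpoint).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "The transformer architecture"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
with torch.no_grad():
    output = model.generate(input_ids, max_length=40, do_sample=True,
                            top_k=40, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```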
exBERT: A Visual Analysis Tool to Explore Learned Representations in Transformer Models
Benjamin Hoover, Hendrik Strobelt, Sebastian Gehrmann: http://exbert.net
#NLP #BERT #LanguageModel