Data Science by ODS.ai 🦜
Should we create an official chat for the channel to discuss links, answer common questions, and flood (at night)?
We count every opinion and listen to your feedback, so please vote.
We are also preparing a special event for the chat's creation, so stay tuned for the announcement.
Microsoft open-sourced scripts and notebooks to pre-train and fine-tune the BERT natural language model on domain-specific texts.
Github: https://github.com/microsoft/AzureML-BERT
Earlier: https://t.me/opendatascience/837
#Bert #Microsoft #NLP #dl
GitHub
GitHub - microsoft/AzureML-BERT: End-to-End recipes for pre-training and fine-tuning BERT using Azure Machine Learning Service
Forwarded from Karim Iskakov - канал (Vladimir Ivashkin)
I'd like to present our new paper with Yandex.Weather! We are pioneers in using a combination of satellite images, radar shots and neural networks for real-time rain forecast. Check out our video for more details!
Video: youtu.be/9zd3VR-prYU
Nowcast: yandex.com/weather/nowcast
ArXiV: arxiv.org/abs/1905.09932
Source: @loss_function_porn
ODS breakfast in Paris! See you this Saturday at 10:30 at Malongo Café, 50 Rue Saint-André des Arts.
TabNine showed a deep learning code autocomplete tool based on the GPT-2 architecture.
The video demonstrates the concept. Hopefully, it will allow us to write code with fewer bugs, not more.
Link: https://tabnine.com/blog/deep
Something relatively similar by Microsoft: https://visualstudio.microsoft.com/ru/services/intellicode
#GPT2 #TabNine #autocomplete #product #NLP #NLU #codegeneration
A great collection of practical rules for routine DS engineering and research work.
Machine Learning in a company is 10% Data Science & 90% other challenges; this PDF provides a great many principles and solutions for dealing with them.
We can only recommend saving this post to your Saved Messages by forwarding it to yourself.
Link: http://martin.zinkevich.org/rules_of_ml/rules_of_ml.pdf
#cheatsheet #advice #practical #common #shouldbesaved
YouTokenToMe, a new tool for text tokenisation from the VK team
Meet a new, enhanced tokenisation tool on steroids. It works 7-10 times faster than alternatives on alphabetic languages and 40-50 times faster on logographic languages.
Under the hood (see the source) there is a C++ implementation with Python bindings, using the Byte Pair Encoding (BPE) algorithm. In terms of speed, YouTokenToMe beats #SentencePiece by Google and #fastBPE, created by a researcher from Facebook AI Research.
Github: https://github.com/vkcom/YouTokenToMe
Medium: https://medium.com/@vktech/youtokentome-a-tool-for-quick-text-tokenization-from-the-vk-team-aa6341215c5a
Byte Pair Encoding: https://arxiv.org/abs/1508.07909
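To make the BPE idea concrete, here is a minimal pure-Python sketch of learning merge rules. It is not YouTokenToMe's code or API: the function name `learn_bpe` and the `</w>` end-of-word marker are assumptions for illustration, and it has none of the speed optimisations the post is about.

```python
from collections import Counter


def learn_bpe(words, num_merges):
    """Toy BPE trainer: returns the list of learned merges (hypothetical helper)."""
    # Each word becomes a tuple of symbols plus an assumed end-of-word marker.
    vocab = Counter(tuple(word) + ("</w>",) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often the word occurs.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite the vocabulary, fusing each occurrence of the best pair.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges


if __name__ == "__main__":
    print(learn_bpe(["abab", "abab", "ab"], 10))
```

Each iteration fuses the most frequent adjacent pair into one symbol, so frequent character sequences gradually become single tokens. Real implementations avoid recounting all pairs on every step, which is where most of the speed-up comes from.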
This improvement to an everyday toolset deserves at least 50 claps on Medium and a star on GitHub!
Let's give the VK research team the appreciation from the community that they deserve!!
What's wrong with the transformer architecture: an overview
How the Transformers broke NLP leaderboards and why that can be bad for industry.
Link: https://hackingsemantics.xyz/2019/leaderboards/
#NLP #overview #transformer #BERT #XLNet
Hacking semantics
How the Transformers broke NLP leaderboards
With the huge Transformer-based models such as BERT, GPT-2, and XLNet, are we losing track of how the state-of-the-art performance is achieved?
Simultaneous food and facial recognition at a Foxconn factory canteen, Shenzhen, China
#video #foodlearning #facerecogniction #dl #cv #foxconn
Deep Learning Image Segmentation for Ecommerce Catalogue Visual Search
Microsoft's article on image segmentation
Link: https://www.microsoft.com/developerblog/2018/04/18/deep-learning-image-segmentation-for-ecommerce-catalogue-visual-search/
#CV #DL #Segmentation #Microsoft
Google AI research on learning better simulation methods for partial differential equations
New research shows how machine learning can improve high-performance computing for solving partial differential equations, with potential applications that range from modeling #climatechange to simulating fusion reactions. Learn all about it here
Link: https://ai.googleblog.com/2019/07/learning-better-simulation-methods-for.html
#PDE #DE #GoogleAI
On the concept of 'intellectual debt'
There is technical debt: you know you should rewrite some code or implement some features, but they don't seem critical at the moment. This article introduces the concept of 'intellectual debt', which arises with the broader and more common use of #MachineLearning and #DeepLearning (especially the latter). What happens when AI gives us seemingly correct answers that we wouldn't have thought of ourselves, without any theory to explain them?
Link: https://www.newyorker.com/tech/annals-of-technology/the-hidden-costs-of-automated-thinking
#Meta #common #lyrics
The New Yorker
The Hidden Costs of Automated Thinking
Overreliance on artificial intelligence may put us in intellectual debt.
New dataset with adversarial examples
Natural Adversarial Examples are real-world and unmodified examples which cause classifiers to be consistently confused. The new dataset has 7,500 images, which we personally labeled over several months.
ArXiV: https://arxiv.org/abs/1907.07174
Dataset and code: https://github.com/hendrycks/natural-adv-examples
#Dataset #Adversarial
Release of 27 pretrained models for NLP / NLU for PyTorch
Hugging Face has open-sourced a new library that contains up to 27 pretrained models for state-of-the-art NLP/NLU tasks.
Link: https://medium.com/dair-ai/pytorch-transformers-for-state-of-the-art-nlp-3348911ffa5b
#SOTA #NLP #NLU #PyTorch #opensource
ODS breakfast in Paris! See you this Saturday at 10:30 at Malongo Café, 50 Rue Saint-André des Arts.
Filter autoselect in VSCO, built with Google's TensorFlow Lite
#VSCO used #TensorFlow Lite to develop the 'For This Photo' feature, which uses on-device ML to suggest photo filter presets from a curated list.
YouTube: https://www.youtube.com/watch?v=fHbjfeitIvE
Link: https://medium.com/tensorflow/suggesting-presets-for-images-building-for-this-photo-at-vsco-9b94041c4ba4
#mobile #device #cv #dl
YouTube
VSCO - For This Photo
Baidu's recent paper: Hubless Nearest Neighbor Search
Hubless Nearest Neighbor Search (HNN), a new method for Bilingual Lexicon Induction, improves retrieval accuracy significantly. Empirical results show that HNN outperforms NN, ISF, and other state-of-the-art methods.
Github: https://github.com/baidu-research/HNN
Paper: https://github.com/baidu-research/HNN/blob/master/doc/HNN.pdf
#ACL2019 #NLP #NLU
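For context, here is a small pure-Python sketch of the plain nearest-neighbor (NN) baseline and of the "hub" effect the method's name refers to: a few target vectors end up as the nearest neighbor of many queries. This is not the HNN algorithm from the paper. The helper names (`cosine`, `nearest_neighbors`, `hub_counts`) and the toy vectors are assumptions for illustration only.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbors(queries, targets):
    """Plain NN retrieval: index of the most similar target for each query."""
    return [
        max(range(len(targets)), key=lambda j: cosine(q, targets[j]))
        for q in queries
    ]


def hub_counts(assignments):
    """How many queries picked each target; large counts indicate hubs."""
    return Counter(assignments)


if __name__ == "__main__":
    # Toy "source word" and "target word" embeddings (made-up numbers).
    queries = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.3], [0.0, 1.0]]
    targets = [[1.0, 0.2], [0.0, 1.0], [-1.0, 0.0]]
    picks = nearest_neighbors(queries, targets)
    print(picks, hub_counts(picks))
```

In this toy setup one target absorbs most of the queries while another is never retrieved, which is the kind of imbalance that hurts retrieval accuracy in lexicon induction.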
Plato Research Dialogue System: A Flexible Conversational AI Platform
The Plato Research Dialogue System is a platform #Uber developed to enable experts and non-experts alike to quickly build, train, and deploy conversational AI agents.
Link: https://eng.uber.com/plato-research-dialogue-system/
#ConversationalAI #converstaion #NLP #NLU