Data Science by ODS.ai 🦜
51.1K subscribers
359 photos
32 videos
7 files
1.51K links
First Telegram Data Science channel. Covering all technical and popular staff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of former. To reach editors contact: @haarrp
Download Telegram
​​TabNine showed deep learning code autocomplete tool based on GPT-2 architecture.

Video demonstrates the concept. Hopefully, it will allow us to write code with less bugs, not more.

Link: https://tabnine.com/blog/deep
Something relatively similar by Microsoft: https://visualstudio.microsoft.com/ru/services/intellicode

#GPT2 #TabNine #autocomplete #product #NLP #NLU #codegeneration
​​Program Synthesis with Large Language Models

Paper compares models used for program synthesis in general purpose programming languages against two new benchmarks, MBPP (The Mostly Basic Programming Problems) and MathQA-Python, in both the few-shot and fine-tuning regimes.

MBPP contains 974 programming tasks, designed to be solvable by entry-level programmers. MathQA benchmark, contains 23914 problems that evaluate the ability of the models to synthesize code from more complex text.

Largest fine-tuned model achieves 83.8 percent accuracy on the latter benchmark.

Why this is interesting: better models for code / problem understanding means improved search for the coding tasks and the improvement of the coding-assistant projects like #TabNine or #Copilot

ArXiV: https://arxiv.org/abs/2108.07732

#DL #NLU #codewritingcode #benchmark