Forwarded from Dev events
🔴 Can VS Code teach you TypeScript?
Thursday, January 26, 2023
8:00 PM to 9:30 PM GET
Hosted By Visual Studio Code T.
Details
TypeScript Wizard Matt Pocock breaks down his new Total TypeScript VS Code extension and puts GitHub #Copilot through its paces, generating and explaining TypeScript code.
Community Links:
https://totaltypescript.com/
https://aka.ms/TotalTypeScript
Forwarded from Data engineering events (Николай Крупий)
Telegram
эйай ньюз
By the way, there is now also more training data for all these Copilot-like tools.
Forwarded from Neural Shit
A noticeable drop in StackOverflow traffic since the launch of ChatGPT 🌚.
Indeed, why ask the meatbags for advice when you can ask the metal ones. All that's left is to teach ChatGPT to reply in the style of "your hands are crooked and you're an idiot, that's why your code doesn't work", and then StackOverflow and LOR can be shut down entirely as no longer needed.
Forwarded from Neural Shit
By the way, Google has opened pre-registration for testing LaMDA (a ChatGPT analogue of sorts). You can try your luck here (who knows, maybe someone will get lucky and actually get to poke at what they have there).
The #Copilot AI programming assistant does not steal other people's code, say Microsoft, GitHub and OpenAI
https://3dnews.ru/1081113/microsoft-github-i-openai-poprosili-otklonit-isk-o-narushenii-avtorskih-prav-iimodelyu-copilot
3DNews - Daily Digital Digest
The Copilot AI programming assistant does not steal other people's code, say Microsoft, GitHub and OpenAI
Microsoft, GitHub and OpenAI have filed a motion to dismiss the class-action lawsuit which claims that copyright in program code was infringed when the GitHub Copilot artificial intelligence (AI) algorithm was trained.
The ChatGPT neural network has been wired up to the React framework, which lets you assemble an application "on the fly" simply by typing out, in plain words, what you need "coded".
https://t.me/russianchatbi/21914
Forwarded from Бэкдор
Neural networks for programming. These services will make coding more comfortable, and in places will even fix your mistakes and teach you a thing or two. No need to crawl over to StackOverflow every time you get stuck.
Adrenaline — a much-hyped tool that will fix your code and explain the errors in detail. Link here.
Tabnine — predicts the next lines of code and finishes them for you. Supports all the popular languages. Link here.
CodePal — an assistant that helps you write code from a text prompt, optimize it, find bugs and review code. Link here.
Code GPT — a solution built on OpenAI's neural network. Plugs straight into VS Code and writes code for you from a text prompt. Link here.
Autobackend — helps with the backend. One or two sentences in English are enough for the service. Link here.
Codesnippets — generates code from text prompts. Has debugging, refactoring and code sharing for the rest of the team. The service is paid, but there is a free tier. Link here.
Buildt AI — a search engine for VS Code that looks up ready-made code in publicly available databases. Link here.
@whackdoor
Forwarded from Николай Крупий
📽 If you are just getting started with computer vision in Python, this video is for you. It explains and demonstrates how to get up and running with OpenCV; a minimal example is sketched just below.
#info
Data Secrets
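A minimal warm-up along those lines (not from the video; the file name and thresholds are placeholders):
```python
# Load an image, convert it to grayscale, and run Canny edge detection.
import cv2

img = cv2.imread("input.jpg")                   # returns None if the file is missing
if img is None:
    raise FileNotFoundError("input.jpg not found")

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)    # OpenCV loads images as BGR
edges = cv2.Canny(gray, threshold1=100, threshold2=200)

cv2.imwrite("edges.jpg", edges)                 # write the edge map next to the input
print(img.shape, gray.shape, edges.shape)
```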
YouTube
OpenAI Codex Api - Generate Python Code with AI
#python #openai #codex
OpenAI Codex Api - Generate Python Code with AI
😀 Video stamps 👇🏾👇🏾 Use them to jump ahead in the video 😀
With the release of OpenAI Codex, OpenAI has given anyone the ability to write and train their own algorithms. In this video, we take…
Forwarded from Николай Крупий
🖼 💥 Not just Copilot: Codex from OpenAI
OpenAI has released a beta of Codex, a family of models based on GPT-3 that can understand and generate code. Their training data contains both natural language and billions of lines of public code from GitHub. These models are strongest in Python and handle more than a dozen programming languages, including JavaScript, Go, Perl, PHP, Ruby, Swift, TypeScript, SQL and even Shell. Codex comes in handy for tasks such as the following (a minimal API sketch follows the list):
• Generating code from comments
• Completing the next line or function in context
• Finding a useful library or API call
• Adding comments
• Refactoring code to make it more efficient
https://beta.openai.com/docs/guides/code
#Copilot #Codex #OpenAI
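A minimal sketch of calling a Codex-family model through the openai Python package as it existed at the time of the beta; the API key, prompt and parameters are placeholders, not from the original post.
```python
# Hypothetical example: ask a Codex model to complete a function from a comment.
# Uses the pre-1.0 openai package; model availability has since changed.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

response = openai.Completion.create(
    model="code-davinci-002",      # Codex code-completion model (beta at the time)
    prompt="# Python 3\n# Return the n-th Fibonacci number\ndef fib(n):",
    max_tokens=64,
    temperature=0,
)

print(response["choices"][0]["text"])  # the generated completion
```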
Forwarded from Николай Крупий
Build a replacement for yourself, or ChatGPT from scratch
A tutorial on how to write ChatGPT, from Andrej Karpathy. From scratch. Does this video even need advertising? Run and watch it, hit like and all the rest of it.
P.S. In places he uses GitHub Copilot to write it. A GPT writing a GPT with a human's help; it feels like we are walking on very thin ice.
Watch it here:
https://www.youtube.com/watch?v=kCc8FmEb1nY
#programming
YouTube
Let's build GPT: from scratch, in code, spelled out.
We build a Generatively Pretrained Transformer (GPT), following the paper "Attention is All You Need" and OpenAI's GPT-2 / GPT-3. We talk about connections to ChatGPT, which has taken the world by storm. We watch GitHub Copilot, itself a GPT, help us write…
Forwarded from Николай Крупий
https://www.youtube.com/watch?v=MUOVnIbTZeA
AI is rapidly changing the job market, so it is important to stay informed and adapt to the changing landscape.
00:00 🤑 OpenAI has gone from 0 to a $29B company in 6 weeks, showing the potential of AI.
02:10 🤖 This technology can write code, rap songs, and papers, eliminating the need for copywriters.
03:44 🤖 AI chatbots could replace search engines, while cheating on exams is a result of the school system valuing grades over learning.
05:49 🤔 Form opinions based on facts, not guesses.
07:23 🤖 AI can help, but it can't replace the creativity of humans.
09:43 🤖 Machines are taking over tasks that used to be done by humans, but AI still cannot create something distinct that has never been seen before.
12:55 🤔 The job market has changed drastically in the past 40 years, with new jobs and increased competition, so it's important to stay informed and adapt.
15:35 🤔 Adapt to the changing job market and focus on creating human jobs, not UBI, to combat AI taking over roles.
YouTube
“Silicone Valley People Will Lose Their Jobs!” - Reaction To OpenAI Being A $29 Billion Company
FaceTime or Ask Patrick any questions on https://minnect.com/.
Want to get clear on your next 5 business moves? https://valuetainment.com/academy/
In this short clip, Patrick Bet-David, Neil Tyson and Adam Sosnick talk about the consequences of open AI.…
Forwarded from Николай Крупий
Let's build GPT: from scratch, in code, spelled out.
Andrej Karpathy
97.7K subscribers · 445,049 views · Jan 17, 2023
We build a Generatively Pretrained Transformer (GPT), following the paper "Attention is All You Need" and OpenAI's GPT-2 / GPT-3. We talk about connections to ChatGPT, which has taken the world by storm. We watch GitHub Copilot, itself a GPT, help us write a GPT (meta :D!) . I recommend people watch the earlier makemore videos to get comfortable with the autoregressive language modeling framework and basics of tensors and PyTorch nn, which we take for granted in this video.
Links:
- Google colab for the video: https://colab.research.google.com/dri...
- GitHub repo for the video: https://github.com/karpathy/ng-video-...
- Playlist of the whole Zero to Hero series so far: https://www.youtube.com/watch?v=VMj-3...
- nanoGPT repo: https://github.com/karpathy/nanoGPT
- my website: https://karpathy.ai
- my twitter: https://twitter.com/karpathy
- our Discord channel: https://discord.gg/3zy8kqD9Cp
Supplementary links:
- Attention is All You Need paper: https://arxiv.org/abs/1706.03762
- OpenAI GPT-3 paper: https://arxiv.org/abs/2005.14165
- OpenAI ChatGPT blog post: https://openai.com/blog/chatgpt/
- The GPU I'm training the model on is from Lambda GPU Cloud, I think the best and easiest way to spin up an on-demand GPU instance in the cloud that you can ssh to: https://lambdalabs.com . If you prefer to work in notebooks, I think the easiest path today is Google Colab.
Suggested exercises:
- EX1: The n-dimensional tensor mastery challenge: Combine the `Head` and `MultiHeadAttention` into one class that processes all the heads in parallel, treating the heads as another batch dimension (answer is in nanoGPT; a rough sketch follows after this list).
- EX2: Train the GPT on your own dataset of choice! What other data could be fun to blabber on about? (A fun advanced suggestion if you like: train a GPT to do addition of two numbers, i.e. a+b=c. You may find it helpful to predict the digits of c in reverse order, as the typical addition algorithm (that you're hoping it learns) would proceed right to left as it adds the numbers, keeping track of a carry along the way. You may want to modify the data loader to simply serve random problems and skip the generation of train.bin, val.bin. You may want to mask out the loss at the input positions of a+b that just specify the problem using y=-1 in the targets (see CrossEntropyLoss ignore_index). Does your Transformer learn to add? Especially on a validation set of addition problems it hasn't seen during training? Once you have this, swole doge project: build a calculator clone in GPT, for all of +-*/. Not an easy problem. You may need Chain of Thought traces.)
- EX3: Find a dataset that is very large, so large that you can't see a gap between train and val loss. Pretrain the transformer on this data, then initialize with that model and finetune it on tiny shakespeare with a smaller number of steps and lower learning rate. Can you obtain a lower validation loss by the use of pretraining?
- EX4: Read some transformer papers and implement one additional feature or change that people seem to use. Does it improve the performance of your GPT?
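A rough sketch of what EX1 is pointing at, in the spirit of nanoGPT's batched causal self-attention. This is not Karpathy's code; the class name, argument names and defaults are illustrative.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadSelfAttention(nn.Module):
    """All heads computed in parallel by folding them into an extra batch dimension."""
    def __init__(self, n_embd, n_head, block_size, dropout=0.1):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        self.qkv = nn.Linear(n_embd, 3 * n_embd)   # joint q, k, v projection
        self.proj = nn.Linear(n_embd, n_embd)
        self.drop = nn.Dropout(dropout)
        self.register_buffer("mask", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)                 # each (B, T, C)
        hs = C // self.n_head
        q = q.view(B, T, self.n_head, hs).transpose(1, 2)     # (B, nh, T, hs)
        k = k.view(B, T, self.n_head, hs).transpose(1, 2)
        v = v.view(B, T, self.n_head, hs).transpose(1, 2)
        att = (q @ k.transpose(-2, -1)) / hs**0.5             # (B, nh, T, T)
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        att = self.drop(F.softmax(att, dim=-1))
        out = (att @ v).transpose(1, 2).contiguous().view(B, T, C)
        return self.proj(out)

# e.g. MultiHeadSelfAttention(64, 4, block_size=32)(torch.randn(2, 32, 64)).shape -> (2, 32, 64)
```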
YouTube
Let's build GPT: from scratch, in code, spelled out.
We build a Generatively Pretrained Transformer (GPT), following the paper "Attention is All You Need" and OpenAI's GPT-2 / GPT-3. We talk about connections to ChatGPT, which has taken the world by storm. We watch GitHub Copilot, itself a GPT, help us write…
Forwarded from Николай Крупий
Chapters:
00:00:00 intro: ChatGPT, Transformers, nanoGPT, Shakespeare
baseline language modeling, code setup
00:07:52 reading and exploring the data
00:09:28 tokenization, train/val split
00:14:27 data loader: batches of chunks of data
00:22:11 simplest baseline: bigram language model, loss, generation
00:34:53 training the bigram model
00:38:00 port our code to a script
Building the "self-attention"
00:42:13 version 1: averaging past context with for loops, the weakest form of aggregation
00:47:11 the trick in self-attention: matrix multiply as weighted aggregation (sketched after this list)
00:51:54 version 2: using matrix multiply
00:54:42 version 3: adding softmax
00:58:26 minor code cleanup
01:00:18 positional encoding
01:02:00 THE CRUX OF THE VIDEO: version 4: self-attention
01:11:38 note 1: attention as communication
01:12:46 note 2: attention has no notion of space, operates over sets
01:13:40 note 3: there is no communication across batch dimension
01:14:14 note 4: encoder blocks vs. decoder blocks
01:15:39 note 5: attention vs. self-attention vs. cross-attention
01:16:56 note 6: "scaled" self-attention. why divide by sqrt(head_size)
Building the Transformer
01:19:11 inserting a single self-attention block to our network
01:21:59 multi-headed self-attention
01:24:25 feedforward layers of transformer block
01:26:48 residual connections
01:32:51 layernorm (and its relationship to our previous batchnorm)
01:37:49 scaling up the model! creating a few variables. adding dropout
Notes on Transformer
01:42:39 encoder vs. decoder vs. both (?) Transformers
01:46:22 super quick walkthrough of nanoGPT, batched multi-headed self-attention
01:48:53 back to ChatGPT, GPT-3, pretraining vs. finetuning, RLHF
01:54:32 conclusions
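For reference, the matrix-multiply aggregation trick from the 00:47:11 chapter above, as a tiny standalone sketch (sizes are arbitrary, not from the video):
```python
import torch

T, C = 4, 2
x = torch.randn(T, C)                       # T tokens with C features each
wei = torch.tril(torch.ones(T, T))          # causal mask: token t sees tokens 0..t
wei = wei / wei.sum(dim=1, keepdim=True)    # normalize rows so each row is an average
out = wei @ x                               # (T, T) @ (T, C) -> (T, C)
# out[t] equals x[:t+1].mean(dim=0): causal averaging done with one matmul
```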
Forwarded from Инжиниринг Данных (Dmitry)
OpenAI has signed an exclusive partnership with Microsoft.
OpenAI will use Microsoft Azure for its infrastructure. Its main products right now are GitHub Copilot, ChatGPT and DALL-E 2.
OpenAI is also pushing into supercomputers: see "Microsoft announces new supercomputer, lays out vision for future AI work".
And of course Responsible AI: chapter 13 of Designing Data-Intensive Applications explains very well why we need to build responsible software.
PS Yesterday ChatGPT helped me out again. I have a CI/CD pipeline for Amazon Glue (Spark) that runs the Glue image in Docker and executes PyTest for each unit test. After new logic was added to the code, pytest started failing in the CI/CD pipeline. The whole team spent a full day trying to work out how to increase the container memory inside the CI/CD GitLab runner; we even swapped the instance from 8 GB of RAM to 32 GB, but it still kept failing. Towards the evening, out of desperation, I pasted the out-of-memory error into ChatGPT, and our friend told me that for Spark I needed to add "--conf" with the "executor" and "driver" memory parameters. Ah, of course: we had been poking at Docker and GitLab all day, and it turned out we just had to add a couple of lines to spark-submit (5 engineers and a DevOps couldn't figure it out, and the AI immediately told us what to do). A minimal sketch of that kind of fix is below.
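Sketch (values and file names are illustrative, not the author's actual pipeline): the error was a Spark memory limit, not a container limit, so it has to be raised through Spark configuration.
```python
# Equivalent idea on the command line (the kind of flags ChatGPT suggested adding):
#   spark-submit \
#     --conf spark.driver.memory=6g \
#     --conf spark.executor.memory=6g \
#     glue_job.py
#
# Or, for PyTest running against a local SparkSession inside the Glue image:
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .master("local[*]")
    .appName("glue-unit-tests")
    .config("spark.driver.memory", "6g")     # must be set before getOrCreate() launches the JVM
    .config("spark.executor.memory", "6g")
    .getOrCreate()
)
```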
Related links:
1. Develop and test AWS Glue version 3.0 and 4.0 jobs locally using a Docker container
2. GitLab Runner
In general, I really like my Glue setup, where I use Git tags and Terraform. Each environment gets its own Glue job, created in Terraform and pointing at the right Python file with the right tag. For example,
glue_v1.5.0.py - production
glue_v1.5.1_3894hg.py - dev/stage
where v1.5.0 is the git release tag after the merge, and v1.5.1_3894hg is a tag that has not been merged yet and lives in my branch, but carries the commit ID "3894hg". This way I can test every change separately, and I will release the final version through Terraform.
It would be great to turn this into a small project for module 7.
More links:
Git Basics - Tagging
Terraform Resource: aws_glue_job
PS By the way, congratulations to one of our readers who was just hired by a company where exactly this Glue + AWS + Terraform setup is in place; you can start your onboarding already ;)
So the community is working well and actually helps.
PPS I used ChatGPT in Seattle to contest a parking ticket: I wrote a letter and mailed it to them; we'll see how it goes :)
The Official Microsoft Blog
Microsoft and OpenAI extend partnership
Today, we are announcing the third phase of our long-term partnership with OpenAI through a multiyear, multibillion dollar investment to accelerate AI breakthroughs to ensure these benefits are broadly shared with the world. This agreement follows our previous…
Forwarded from Николай Крупий
Николай Крупий
Let's build GPT: from scratch, in code, spelled out. Andrej Karpathy 97,7 тыс. подписчиков 445 049 просмотров 17 янв. 2023 г. We build a Generatively Pretrained Transformer (GPT), following the paper "Attention is All You Need" and OpenAI's GPT-2 / GPT-3.…
https://www.youtube.com/watch?v=kCc8FmEb1nY
YouTube
Let's build GPT: from scratch, in code, spelled out.
We build a Generatively Pretrained Transformer (GPT), following the paper "Attention is All You Need" and OpenAI's GPT-2 / GPT-3. We talk about connections to ChatGPT, which has taken the world by storm. We watch GitHub Copilot, itself a GPT, help us write…
Forwarded from Николай Крупий
Николай Крупий
Let's build GPT: from scratch, in code, spelled out. Andrej Karpathy 97,7 тыс. подписчиков 445 049 просмотров 17 янв. 2023 г. We build a Generatively Pretrained Transformer (GPT), following the paper "Attention is All You Need" and OpenAI's GPT-2 / GPT-3.…
Corrections:
00:57:00 Oops "tokens from the future cannot communicate", not "past". Sorry! 🙂