Dagmawi Babi
Wake up babe, Andrej Karpathy just dropped a new video about LLMs • https://youtu.be/zjkBMFhNj_g?si=Ep2yJ81MenrKG0RQ #AndrejKarpathy #AI #Tutorials #YouTube @Dagmawi_Babi
Intro to LLM by Andrej Karpathy.pdf
41.5 MB
Just watched his lecture and damn it's so good. So much to think about. Here's his powerpoint.
#ML #AI #AndrejKarpathy #LLM
@Dagmawi_Babi
Devin is marketed as the first AI software engineer, but don't let that throw you off. Though its coding and debugging capabilities are insane, it still has a long way to go and will eventually hit a ceiling.
It's great, but I don't think it'll replace devs. What most people don't understand is that these models are LLMs. They're statistical predictive models that don't understand what they're doing. So just like ChatGPT hasn't stopped all the authors of the world, this won't stop the coders either.
If you don't fully understand what I'm talking about, please watch Lex's podcast with Yann LeCun.
But not to throw shade, this is still epic fine-tuning and an epic demo of an LLM.
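To make "statistical predictive model" concrete, here's a toy sketch (not Devin's or any real model's code, and the logits are made up) of the core loop an LLM runs: score candidate next tokens, turn the scores into probabilities, and sample one.

```python
# Toy next-token sampler: illustrates "predicting the next token", nothing more.
import math
import random

def softmax(logits):
    # Turn raw scores into a probability distribution.
    m = max(logits.values())
    exps = {tok: math.exp(score - m) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def sample_next_token(logits, temperature=0.8):
    # Lower temperature -> more deterministic, higher -> more varied.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    probs = softmax(scaled)
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

# Made-up scores a model might emit after the prompt "def add(a, b): return"
fake_logits = {" a": 2.1, " a + b": 4.7, " b": 1.3, " None": 0.2}
print(sample_next_token(fake_logits))  # usually " a + b", but only statistically
```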
#Devin #AI #ML #LLM
@Dagmawi_Babi
xAI's Grok LLM has been open sourced
They released the base model weights and the network architecture. The release shows that the model has 314 billion parameters and is a mixture-of-experts model (a minimal sketch of what that means is below). It is not fine-tuned much, so chatting with it takes some configuration and fine-tuning.
For a 314B-parameter model it is very undertrained. Benchmarks put it at roughly GPT-3.5 level, and GPT-3.5 is estimated to be a 20B-parameter model.
Repo
• https://github.com/xai-org/grok-1
Weights
• https://huggingface.co/xai-org/grok-1/tree/main/ckpt
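For anyone unfamiliar with mixture of experts, here's a minimal PyTorch sketch of the idea: a router picks a few "expert" feed-forward blocks per token, so only a fraction of the total parameters runs on any one token. The sizes and top-2 routing here are assumptions for illustration, not Grok-1's actual architecture.

```python
# Minimal mixture-of-experts layer: route each token to its top-k experts.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        # Each "expert" is an ordinary feed-forward block.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])
        # The router scores experts per token; only the top_k experts run.
        self.router = nn.Linear(d_model, n_experts)
        self.top_k = top_k

    def forward(self, x):                        # x: (tokens, d_model)
        scores = self.router(x)                  # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e            # tokens routed to expert e at slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

layer = MoELayer()
print(layer(torch.randn(4, 512)).shape)  # torch.Size([4, 512])
```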
#Grok #LLM #AI #ML
@Dagmawi_Babi
@Naklecha released a repo that implements LLAMA3 from scratch: every matrix multiplication, from multi-head attention to positional encoding and every other layer in between, has been carefully unwrapped & explained.
• https://github.com/naklecha/llama3-from-scratch
🔥 🔥 🔥
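As a taste of the kind of matrix multiplication the repo walks through, here's a single causal self-attention head in plain NumPy. The shapes are illustrative, not LLaMA 3's actual dimensions, and real implementations add RoPE, multiple heads, and KV caching on top.

```python
# One causal self-attention head: softmax(QK^T / sqrt(d)) applied to V.
import numpy as np

def causal_attention_head(x, Wq, Wk, Wv):
    # x: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_head)
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])            # (seq_len, seq_len)
    # Causal mask: each position attends only to itself and earlier tokens.
    mask = np.triu(np.ones_like(scores, dtype=bool), 1)
    scores = np.where(mask, -np.inf, scores)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ v                                 # (seq_len, d_head)

rng = np.random.default_rng(0)
seq_len, d_model, d_head = 6, 32, 8
x = rng.standard_normal((seq_len, d_model))
Wq, Wk, Wv = (rng.standard_normal((d_model, d_head)) for _ in range(3))
print(causal_attention_head(x, Wq, Wk, Wv).shape)  # (6, 8)
```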
#LLAMA3 #AI #ML #LLM
@Dagmawi_Babi
Found this LLM visualization tool on the web 🤯
• bbycroft.net/llm
This is a huge resource for visually learning about GPT models and how they work.
This took so much effort from a solo dev and I can't believe it's open source.
• github.com/bbycroft/llm-viz
#Visualization #Resources #GPT #LLM #OSS
@Dagmawi_Babi
Well this is very impressive!
Cerebras Systems Inference is capable of serving LLAMA 3.1 70B at 450 tokens/sec and LLAMA 3.1 8B at 1,850 tokens/sec. I don't even know how this is possible tbh.
Try it and see how fast it is
• inference.cerebras.ai
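For a rough sense of what those rates mean in practice, here's a quick back-of-the-envelope (the 500-token response length is just an assumed example):

```python
# Back-of-the-envelope: wall-clock time to stream a reply at the quoted rates.
rates = {"LLAMA 3.1 70B": 450, "LLAMA 3.1 8B": 1850}  # tokens per second
response_tokens = 500  # assumed typical response length

for model, tps in rates.items():
    print(f"{model}: ~{response_tokens / tps:.2f} s for a {response_tokens}-token reply")
# LLAMA 3.1 70B: ~1.11 s for a 500-token reply
# LLAMA 3.1 8B: ~0.27 s for a 500-token reply
```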
#CerebrasSystems #LLAMA #LLM #AIML
@Dagmawi_Babi
Also, I am in love with Gemini Flash. It's soooo fast and compact yet so powerful. So impressive.
Not to mention it's so cheap I could afford it.
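If you want to try it yourself, here's a minimal sketch using the google-generativeai Python SDK; the model name "gemini-1.5-flash" and the exact SDK surface are assumptions on my part, so check Google's current docs.

```python
# Minimal Gemini Flash call via the google-generativeai SDK (model name assumed).
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])  # set your own key
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content("Explain mixture of experts in two sentences.")
print(response.text)
```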
#Google #Gemini #LLM #AIML
@Dagmawi_Babi