Dagmawi Babi – Telegram

Dagmawi Babi

4.96K subscribers

13K photos

1.66K videos

239 files

1.74K links

Believer of Christ | Creative Developer.

Files Channel: https://t.me/+OZ9Ul_rSBAQ0MjNk

Community: @DagmawiBabiChat

Wake up babe, Andrej Karpathy just dropped a new video about LLMs
• https://youtu.be/zjkBMFhNj_g?si=Ep2yJ81MenrKG0RQ

#AndrejKarpathy #AI #Tutorials #YouTube
@Dagmawi_Babi

Intro to LLM by Andrej Karpathy.pdf

Just watched his lecture and damn it's so good. So much to think about. Here's his powerpoint.

#ML #AI #AndrejKarpathy #LLM
@Dagmawi_Babi

606 views · 12:04

Devin is marketed as the first AI software engineer, but don't let that throw you off. Though its coding and debugging capabilities are insane, it still has a long way to go AND will eventually hit a ceiling.

It's great, but I know it won't replace devs. What most people don't understand is that these models are LLMs: statistical predictive models that don't understand what they're doing. So just like ChatGPT hasn't stopped all the authors of the world, this won't stop coders either.

If you don't fully understand what I am talking about, please watch Lex's podcast with Yann LeCun.

But not to throw shade at the party, this is still an epic fine-tune and an epic demo of an LLM.
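
To make the "statistical predictive model" point concrete, here's a toy sketch. It is not how Devin or any real LLM is built (those are large neural networks over tokens); it's just a hypothetical bigram counter with made-up names (`train_bigram`, `predict_next`) that predicts the next word purely from what it has seen before.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Toy corpus (an assumption for illustration only).
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))
```

The model picks whichever word followed "the" most often in its training text. It has no idea what a cat is; it only has counts.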

#Devin #AI #ML #LLM
@Dagmawi_Babi

630 views · 20:10

xAI's Grok LLM has been open sourced

They released the base model weights and network architecture. The release shows that the model has 314 billion parameters and is a mixture-of-experts model. It has not been fine-tuned much, so chatting with it needs a bit of config and fine-tuning.

For a 314B parameter model, it is very undertrained. The benchmarks show that the model is at roughly GPT-3.5 level, and GPT-3.5 is estimated to be a 20B parameter model.

Repo
• https://github.com/xai-org/grok-1

Weights
• https://huggingface.co/xai-org/grok-1/tree/main/ckpt
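
For a rough sense of what 314B parameters means in practice, here's a back-of-the-envelope sketch of the memory needed just to hold the weights at a few common precisions. The helper name `weight_memory_gb` is made up for this example, and the numbers ignore activations, the KV cache, and any framework overhead, so treat them as a lower bound rather than a hardware requirement.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Approximate memory to store the weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

GROK1_PARAMS = 314e9  # parameter count quoted in the post

# Bytes per parameter for a few common numeric formats.
for name, nbytes in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    print(f"{name}: {weight_memory_gb(GROK1_PARAMS, nbytes):.0f} GB")
```

Even though a mixture-of-experts model only activates part of the network per token, all of the weights still have to be loaded somewhere, which is why the download is so large.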

#Grok #LLM #AI #ML
@Dagmawi_Babi

887 views · 07:14

@Naklecha released a repo that implements Llama 3 from scratch: every matrix multiplication, from attention across multiple heads to positional encoding and every other layer in between, has been carefully unwrapped and explained.
• https://github.com/naklecha/llama3-from-scratch
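
If you want a taste of what "unwrapping attention" looks like before opening the repo, here's a minimal sketch of single-head scaled dot-product attention in plain Python. This is not code from the repo: it leaves out the causal mask, rotary positional encoding, multiple heads, and the learned projection matrices, and the function names are my own.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention for one head, using lists of vectors."""
    d = len(keys[0])  # key dimension, used to scale the dot products
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# Two identical keys get equal weight, so the output is the mean of the values.
result = attention([[1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]], [[0.0, 2.0], [4.0, 6.0]])
print(result)
```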

🔥🔥🔥

#LLAMA3 #AI #ML #LLM
@Dagmawi_Babi

860 views · edited 20:55

Found this LLM visualization tool on the web 🤯
• bbycroft.net/llm

This is a huge resource for learning visually about GPT models and how they work.

This took so much effort from a solo dev and I can't believe it's open source.
• github.com/bbycroft/llm-viz

#Visualization #Resources #GPT #LLM #OSS
@Dagmawi_Babi

798 views · 14:54

Well, this is very impressive!

Cerebras Systems Inference is capable of serving Llama 3.1 70B at 450 tokens/sec and Llama 3.1 8B at 1,850 tokens/sec. I don't even know how this is possible tbh.

Try it and see how fast it is
• inference.cerebras.ai
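
To put those rates in perspective, here's a quick sketch that turns tokens/sec into wall-clock time for a response. It only uses the two throughput figures quoted above; the 1,000-token response length and the helper name `seconds_to_generate` are assumptions, and it ignores network latency and time to first token.

```python
def seconds_to_generate(num_tokens, tokens_per_sec):
    """Time to stream `num_tokens` at a steady decode rate."""
    return num_tokens / tokens_per_sec

# Throughput figures quoted in the post (tokens per second).
RATES = {"Llama 3.1 70B": 450, "Llama 3.1 8B": 1850}

for model_name, rate in RATES.items():
    print(f"{model_name}: {seconds_to_generate(1000, rate):.2f} s per 1,000 tokens")
```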

#CerebrasSystems #LLAMA #LLM #AIML
@Dagmawi_Babi

971 views · edited 16:50

Also, I am in love with Gemini Flash. It's so fast and compact, yet so powerful. It's really impressive.

Not to mention it's so cheap that I can afford it.

#Google #Gemini #LLM #AIML
@Dagmawi_Babi

1.0K views · 09:20