If you're interested in pretraining any kind of text-based model, you might have found that there isn't much you can do other than feed it the text itself in bulk. But there's more to it.
Here are some slides on Gemini's pretraining and on pretraining in general. It's super compute-intensive, even for older decoder-based models.
https://vladfeinberg.com/assets/2025-04-24-princeton-talk.pdf
Gemini Pretraining
On April 29 I'll be talking about running many kinds of open-source models, from small ~100M models to 7B LLMs, on the web, Android, and iOS. I'll also cover optimization and quantization, efficient fine-tuning techniques, and local-first AI development in general.
For anyone interested link
So PewDiePie is now miles ahead of me on Linux. The only time I tried installing Linux was in high school; it was Kali, and I hated the experience so much.
So what's next for him, a collab with Primeagen and bashing JavaScript? lol
PhD at Inria Paris
If anyone has an MS, is interested in Physics-Grounded Vision Foundation Models, and has some experience in computer vision, DM me your CV and I'll make a referral for your application.
Raoul is a great researcher and an amazing guy to work with, and it would be super cool to see an Ethiopian in his lab.
https://astra-vision.github.io/jobs/
astra-vision.github.io
Jobs | Astra-vision - Computer vision group, Astra, Inria
Computer vision group of the Astra research team, Inria, Paris.
Why do I feel burned out on days when I do nothing but sleep?
In case anyone is interested in submitting a paper to an Ethiopian NLP workshop: it's a good way to get some writing experience and to get into research.
https://ethionlp.github.io/index.html#call-for-papers
Job Post
If anyone is interested in a project that involves a sync engine, an AI agent, and vertical SaaS, DM me here @feeling_stoic.
It's a US startup, and the guy running it is super cool; you will learn a lot from him and enjoy working with him.
💰 The offer is $800-$1000 + some equity.
Share it with others if you know anyone.
Henok | Neural Nets
Job Closed
I've shared your CVs with the founder and he will take care of it moving forward.
Good luck to you all.
Oh no... students, please make sure you don't do such crap.
https://futurism.com/college-students-ai-typos
Forwarded from Beka (Beka)
We just did our YC launch. I need you guys to do me a favor. Can you go ahead and upvote, please?
https://www.ycombinator.com/launches/NUm-better-auth-the-authentication-framework-for-typescript
Y Combinator
Launch YC: Better Auth - The Authentication Framework for TypeScript | Y Combinator
The fastest growing Auth framework for TypeScript: 13K stars + 100K weekly downloads!
If AI can, why shouldn't it take over all software engineering jobs, or the job of a research scientist?
I hate lo-fi and I don't really get it. But if you need something to listen to while working, just check out Mulatu's jazz. He's simply the best.
Here are my favorites
Yekermo Sew
https://youtu.be/jwdBRqIsVUY?si=X-7T9QIUiMsO4a5a
Tizita
https://youtu.be/sXLfV2kegUI?si=cjZdWw_FKUXhmXLi
YouTube
Yèkèrmo Sèw
Provided to YouTube by K7 Records GmbH
Yèkèrmo Sèw · Mulatu Astatke
New York - Addis - London: The Story of Ethio Jazz 1965-1975
℗ 1969 Amha Records
Released on: 2009-10-19
Music Publisher: Copyright Control
Composer: Mulatu Astatke
Lyricist: Mulatu…
Religious benchmarks for LLM evaluation seem cool; I haven't seen much work on this. Are today's best models biased against a particular religion or teaching? How would they interpret things?
Recommend me a paper if you've seen one in this area; I'll be happy to read it.
Building LLMs from scratch has to be one of the most challenging things I've been involved in, and it's very underrated. Pretraining data, how many parameters are enough, instruction fine-tuning, making them generalize, and alignment, all under resource constraints (even in big tech companies a compute budget exists): all of this is really hard.
So whenever a new model comes out and beats others in some areas, it's a huge W.
So one suggestion: if you don't have to, don't start from scratch. Also, things like expanding the tokenizer and updating model weights per user should be well studied as ways to adapt models to new languages and tasks.
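To make the tokenizer-expansion idea a bit more concrete, here's a minimal sketch in plain Python. Everything in it is an assumption for illustration: `expand_embeddings` is a hypothetical helper, the toy vocab and vectors are made up, and initializing new rows to the mean of the existing ones is just one common heuristic, not the only way to do it. In practice the new rows (and usually the rest of the model) still need further training on data from the new language.

```python
from statistics import fmean


def expand_embeddings(vocab, embeddings, new_tokens):
    """Return copies of (vocab, embeddings) extended with new_tokens.

    Hypothetical helper: each new token gets a row initialized to the
    mean of the ORIGINAL embedding rows (a heuristic, assumed here).
    """
    vocab = dict(vocab)                          # token -> row index
    embeddings = [row[:] for row in embeddings]  # copy, keep the input intact
    dim = len(embeddings[0])
    # Mean is computed once, before any new rows are appended.
    mean_row = [fmean(row[d] for row in embeddings) for d in range(dim)]
    for token in new_tokens:
        if token in vocab:                       # skip tokens we already have
            continue
        vocab[token] = len(embeddings)           # next free row index
        embeddings.append(mean_row[:])
    return vocab, embeddings


# Toy example: a 2-token vocab with 2-dimensional embeddings.
vocab = {"hello": 0, "world": 1}
embeddings = [[1.0, 2.0], [3.0, 4.0]]
new_vocab, new_embeddings = expand_embeddings(vocab, embeddings, ["selam", "world"])
```

Here "selam" gets a new row and "world" is skipped because it's already in the vocab. The original tables stay untouched, so you can compare before and after.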