Henok | Neural Nets
If you're interested in pretraining any kind of text-based model, you might have found that there isn't much you can do other than feed in the text itself in bulk. But there's more to it.

Here are some slides on Gemini's pretraining and on pretraining in general. It's super compute-intensive, even for older decoder-based models.


https://vladfeinberg.com/assets/2025-04-24-princeton-talk.pdf

Gemini Pretraining
❤6👍2
On April 29 I'll be talking about running many kinds of open-source models, from small ~100M models to 7B LLMs, on the web, Android, and iOS. I'll also cover optimization and quantization, efficient fine-tuning techniques, and everything about local-first AI development.

For anyone interested link
🔥12👍3
So PewDiePie is now miles ahead of me on Linux 😂. The only time I tried installing Linux was in high school. It was Kali, and I hated the experience so much.

So what's next for him, a collab with Primeagen and bash JavaScript lol
😁13
PhD at Inria Paris

If anyone has an MS, is interested in Physics-Grounded Vision Foundation Models, and has some experience in computer vision, DM me your CV and I'll make a referral for your application.

Raoul is a great researcher and an amazing guy to work with, and it would be super cool to see an Ethiopian in his lab.

https://astra-vision.github.io/jobs/
👍6
Why do I feel burnt out on days when I'm doing nothing but sleeping?
😁17🤝6😭1
In case anyone is interested in submitting a paper to an Ethiopian NLP workshop: it's a good way to get some writing experience and to get into research.


https://ethionlp.github.io/index.html#call-for-papers
❤4🔥1
Job Post

If anyone is interested in a project involving a sync engine, AI agents, and vertical SaaS, DM me here @feeling_stoic.

It's a US startup, and the guy running it is super cool. You will learn a lot from him and enjoy working with him.

💰 The offer is $800-$1000 + some equity.

Share it with others if you know anyone.
👍9❤4⚡3
Oh no... students, please make sure you don't do such crap

https://futurism.com/college-students-ai-typos
😁9
Hakuna Matata !!!
🔥16❤3
Last statement is too bold
👍9🤨1
Gemini Diffusion is super fast: 564 tokens/s. That means it can write a book of almost 200 pages in ~3 minutes.
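
A quick back-of-the-envelope check of that estimate. Only the 564 tokens/s and the ~3 minutes come from the post. The conversion factors (about 0.75 English words per token and about 400 words per dense page) are my own rough assumptions, and the function name `pages_written` is just made up for this sketch.

```python
def pages_written(tokens_per_second, seconds,
                  words_per_token=0.75,   # assumption: rough rule of thumb for English
                  words_per_page=400):    # assumption: a fairly dense page
    """Estimate total tokens and book pages produced at a given decoding speed."""
    tokens = tokens_per_second * seconds
    words = tokens * words_per_token
    return tokens, words / words_per_page


# 564 tokens/s for 3 minutes, as in the post
tokens, pages = pages_written(564, 3 * 60)
print(f"{tokens} tokens, about {pages:.0f} pages")
```

With these assumptions it comes out to roughly 100K tokens and about 190 pages. A lighter page (say 300 words) would push the count above 250, so the result depends a lot on what you call a page.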
⚡5🔥4❤1
If AI can do it, why shouldn't it take over software engineering jobs entirely, or the job of a research scientist?
We don't care about AI, AGI, or whatever for tonight. Let's cook this chicken, let's go United 🔥
🔥12🤣10😭3
We are back to AI
🤣18😁7💔2
Religious benchmarks for LLM evaluation seem cool, and I've not seen much work on this. Are today's best models biased against a particular religion or teaching? How would they interpret things?

Recommend me a paper if you've seen one in this area, I'll be happy to read it.
❤7
Building LLMs from scratch has to be one of the most challenging things I've been involved in, and it's very underrated. Pretraining data, how many parameters are enough, instruction fine-tuning, making the models generalize, and alignment, all under resource constraints (even big tech companies have compute budgets): all of this is really hard.

So whenever a new model comes out and beats the others in some areas, it's a huge W.


So one suggestion: if you don't have to, don't start from scratch. Also, approaches like expanding the tokenizer and updating model weights (per user or something) should be studied more thoroughly as ways to adapt models to new languages and tasks.
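
To make "expanding the tokenizer" a bit more concrete, here is a toy sketch in plain Python. It is not how any particular library does it. The function `expand_vocab`, the tiny vocab, and the mean-initialization of new rows are all my own assumptions for illustration (mean init is just one common heuristic). The idea: new tokens get new ids, and the embedding table grows by one row per new token before you continue training.

```python
from statistics import fmean


def expand_vocab(vocab, embeddings, new_tokens):
    """Return copies of (vocab, embeddings) extended with new_tokens.

    vocab: dict mapping token -> row index.
    embeddings: list of equal-length lists of floats, one row per token.
    Each new row starts at the mean of the original rows (an assumed heuristic).
    """
    vocab = dict(vocab)
    embeddings = [row[:] for row in embeddings]
    dim = len(embeddings[0])
    # Mean over the original rows, computed before anything is appended
    mean_row = [fmean(row[d] for row in embeddings) for d in range(dim)]
    for tok in new_tokens:
        if tok in vocab:          # skip tokens the vocab already has
            continue
        vocab[tok] = len(embeddings)
        embeddings.append(mean_row[:])
    return vocab, embeddings


# Hypothetical toy example: add one Amharic token to a two-token vocab
vocab, emb = expand_vocab({"hello": 0, "world": 1},
                          [[1.0, 3.0], [3.0, 5.0]],
                          ["ሰላም", "hello"])
print(vocab["ሰላม".replace("ม", "ም")], emb[-1])
```

In a real setup the new rows (and usually the rest of the model) still need further training on data in the new language. The initialization only gives them a reasonable starting point.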
πŸ‘7❀3