Henok | Neural Nets
Forwarded from Debugging Epohul (epohul)
I'm craving a PhD
❤‍🔥3
I'm not signing a document to access a so-called "open source dataset". It's also very clear that I won't be able to develop a model with so little data, let alone use it for commercial purposes.
😁9😭2🥱1
Where is EthioNLP? It's probably one of the leading initiatives in Ethiopian NLP, with open source datasets and models, and the best NLP community in Ethiopia. They listed Ghana NLP but not EthioNLP 😂 This is hilarious.

https://www.gsma.com/solutions-and-impact/connectivity-for-good/mobile-for-development/wp-content/uploads/2025/04/AI-in-Ethiopia.pdf
🤣5😢2
Oh wow, DeepSeek is getting some pushback. It's probably political.
🤔4
Gemma-3-27b-it Parameter Breakdown

| Component | Parameters | Percent
|:---------------|------------------|----------
| Feed-Forward | 21,770,514,288 | 79.36%
| Attention | 4,239,205,376 | 15.45%
| Embedding | 1,415,027,328 | 5.16%
| Other | 6,324,096 | 0.02%
| LayerNorm | 1,335,552 | 0.00%

Total Trainable Parameters: 27,432,406,640 (27.4B🤯)
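
A quick sanity check of the table, as a minimal sketch: it only re-adds the five component counts listed above and recomputes each share of the total. The grouping into components is taken as given; how each layer was assigned to a bucket is not shown in the post.

```python
# Component counts copied from the breakdown table above.
counts = {
    "Feed-Forward": 21_770_514_288,
    "Attention": 4_239_205_376,
    "Embedding": 1_415_027_328,
    "Other": 6_324_096,
    "LayerNorm": 1_335_552,
}

# Sum the components and compute each one's share of the total.
total = sum(counts.values())
shares = {name: 100 * n / total for name, n in counts.items()}

print(f"Total: {total:,}")
for name, pct in shares.items():
    print(f"{name:<13}{counts[name]:>16,}  {pct:6.2f}%")
```

LayerNorm shows as 0.00% only because of rounding to two decimals; its share is small but not zero.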

So, the model architecture:

- It has 27 SigLIP vision transformer layers with self-attention and MLP blocks. The vision part drives the multi-modal capabilities, combining visual context with linguistic understanding.


- The language model has 62 Gemma3DecoderLayers, each with self-attention using rotary embeddings, RMS normalization, and large MLP layers for text modeling.

I'll write about each of those in depth, compare it with other models, and explain why it was able to run on a single GPU.
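
As a rough back-of-the-envelope for the single-GPU question, here is a sketch that estimates weight memory only, using the parameter total from the breakdown above. The list of storage formats is an assumption for illustration; it ignores the KV cache, activations, and any quantization overhead, so real usage will be higher.

```python
TOTAL_PARAMS = 27_432_406_640  # total from the parameter breakdown above

# Bytes per parameter for a few common storage formats (assumed, weights only).
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_gib(num_params: int, bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


for dtype, nbytes in BYTES_PER_PARAM.items():
    print(f"{dtype:>8}: {weight_gib(TOTAL_PARAMS, nbytes):6.1f} GiB")
```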
🔥4👍1🤯1
[Media]
Every time my model training is almost done
😁21
💔8🤣5👍4😢2👌2
Btw, can someone enlighten me about the A2SV issue?

From what I saw online, I think some people aren't big fans of A2SV because it gets people into big tech companies rather than actually building big tech/Silicon Valley in Ethiopia 😳?

Also, on a different note, Addis Ababa is where the African Union headquarters and so many international offices are located, yet not a single FAANG or similar-level company has dared to open an office here. Just an unrelated story.
👌5
I usually don't like these types of studies because they tend to over-generalize, but can someone confirm 😂


https://www.forbes.com/sites/traversmark/2025/04/07/new-research-reveals-how-long-it-actually-takes-to-get-over-an-ex/
🤣8😁4
icog is probably one of the biggest game changers in Ethiopia's tech space. It's good to hear their history and struggles. I hope they make a movie out of it. I have so much respect for the people I met from icog, and I hope to see them rise again.

Btw, I used to hate Hruy. He made fun of us back in the day when we competed in Robo Soccer, but nothing personal 😂 @bereket_ademe, I hope you remember it.


https://youtu.be/NPkmaj6tOGo?si=qWgcHkz1qbWAr_d9
🔥5😁2
Happy Easter to you all! (እንኳን ለብርሃነ ትንሣኤው በሰላም አደረሳችሁ)
21🤗1
Can someone help me find an app/website to play Gebeta/ገበጣ ?
Computational Complexity of Air Travel Planning

I never thought of it as this complex until I came across a 2003 slide deck from ITA Software that basically says so.

So the problem is:
- With fixed fares and routes but variable flights, the problem is NP-hard.
- Fixing flights and fares while varying priceable units also leads to NP-hardness.
- In some formulations, the problem becomes undecidable, meaning no algorithm can solve all instances.

Because of that, they tackled the problem with dynamic programming, graph-based search, fare combinatorics modeling, memoization, pruning heuristics, and recursive decomposition, with Common Lisp for efficient symbolic computation.
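
To give a feel for the memoization and recursive decomposition part, here is a toy sketch. It is not ITA's system: the schedule, prices, and the one-hour connection rule are all made up, and it prices each flight independently, so it skips the fare rules and priceable units that make the real problem hard.

```python
from functools import lru_cache

# Hypothetical toy schedule: (origin, destination, depart_hour, arrive_hour, price)
FLIGHTS = (
    ("ADD", "NBO", 8, 10, 150),
    ("ADD", "DXB", 9, 13, 300),
    ("NBO", "JNB", 11, 15, 200),
    ("NBO", "JNB", 9, 13, 120),   # cheaper, but leaves before ADD->NBO lands
    ("DXB", "JNB", 15, 23, 250),
    ("ADD", "JNB", 7, 13, 420),
)
MIN_CONNECTION_HOURS = 1  # assumed minimum layover


def cheapest(origin, destination):
    """Return (total_price, legs) for the cheapest valid itinerary."""

    @lru_cache(maxsize=None)
    def best(airport, ready_hour):
        # Subproblem: cheapest way onward from `airport`, leaving at or
        # after `ready_hour`. Memoized, so each state is solved once.
        if airport == destination:
            return 0, ()
        result = (float("inf"), ())
        for leg in FLIGHTS:
            src, dst, dep, arr, price = leg
            if src != airport or dep < ready_hour:
                continue  # wrong airport, or departs too early to catch
            rest_price, rest_legs = best(dst, arr + MIN_CONNECTION_HOURS)
            total = price + rest_price
            if total < result[0]:
                result = (total, (leg,) + rest_legs)
        return result

    return best(origin, 0)


price, legs = cheapest("ADD", "JNB")
print(price, [f"{src}->{dst}" for src, dst, *_ in legs])
```

Every flight arrives after it departs, so `ready_hour` only grows and the recursion terminates. Once fares depend on combinations of flights rather than single legs, this simple state (airport, time) is no longer enough, which is where the combinatorial blow-up in the slides comes from.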

For anyone interested link
🤯3👍21
Anthropic should just be an interpretability company!
👌8