Henok | Neural Nets
100k is a lotttt.
😁8🤔1
Hasab AI 🔥

Here is a great practical use case of ML in Ethiopia. The inference is really optimized 🔥.

Take a look at the demo.

https://www.hasab.ai/
🔥28
Why has the field of robotics moved so slowly over the past 15 years?

Boston Dynamics is the only one I can think of.
💯6
Let's have fun😁
🤣27😁7
take it easy bro, where is the fun :)
😁3
My favorite open source model series. The Llama 4 herd is out

https://ai.meta.com/blog/llama-4-multimodal-intelligence/
🔥11
What are some *practical* ways to reduce input tokens (e.g. chunking, summarizing, selective filtering)? Just looking for something that has worked well.
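For instance, here is a minimal sketch of the selective-filtering idea: chunk the document, score each chunk against the query, and only put the top few chunks into the prompt. Everything here is illustrative; the word-overlap scoring is just a stand-in for whatever embedding model or retriever you actually use.

```python
# Selective filtering sketch: keep only the chunks most relevant to the query
# so the prompt contains a fraction of the original document's tokens.
def top_chunks(document: str, query: str, chunk_size: int = 400, keep: int = 3) -> list[str]:
    # Naive fixed-size chunking by characters (a real pipeline would split on
    # sentences or paragraphs instead).
    chunks = [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]
    query_words = set(query.lower().split())
    # Score each chunk by how many query words it contains (stand-in for
    # embedding similarity), then keep the highest-scoring ones.
    scored = [(len(query_words & set(chunk.lower().split())), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:keep]]


# Toy usage: only the filtered chunks go into the prompt.
document = "..." * 1000  # imagine a long report here
context = "\n---\n".join(top_chunks(document, "quarterly revenue growth"))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: how did revenue grow?"
```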
Forwarded from Debugging Epohul (epohul)
I'm craving a PhD
❤‍🔥3
I'm not signing a doc to access a so-called "open source dataset". It's also very clear that I won't be able to develop a model with such a small dataset, let alone use it for commercial purposes.
😁9😭2🥱1
Where is EthioNLP? It's probably one of the leading initiatives in Ethiopian NLP, with open-source datasets and models, and the best NLP community in Ethiopia. They have listed Ghana NLP but not EthioNLP 😂 this is hilarious.

https://www.gsma.com/solutions-and-impact/connectivity-for-good/mobile-for-development/wp-content/uploads/2025/04/AI-in-Ethiopia.pdf
🤣5😢2
Oh wow, DeepSeek is getting some pushback. It's probably political.
🤔4
Gemma-3-27b-it Parameter Breakdown

| Component    | Parameters     | Percent |
|:-------------|---------------:|--------:|
| Feed-Forward | 21,770,514,288 | 79.36%  |
| Attention    | 4,239,205,376  | 15.45%  |
| Embedding    | 1,415,027,328  | 5.16%   |
| Other        | 6,324,096      | 0.02%   |
| LayerNorm    | 1,335,552      | 0.00%   |

Total Trainable Parameters: 27,432,406,640 (27.4B🤯)

So, the model architecture:

- The vision tower has 27 SigLIP vision transformer layers, each with self-attention and MLP blocks. The vision part heavily influences the multimodal capabilities, combining visual context with linguistic understanding.

- The language model has 62 Gemma3DecoderLayers, each featuring self-attention with rotary embeddings, RMS normalizations, and large MLP blocks for textual modeling.

I'll write about each of these in depth, compare the model with others, and explain why it can run on a single GPU.
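In the meantime, here is a rough sketch of how a breakdown like the table above could be reproduced. It assumes the Hugging Face transformers (a version with Gemma 3 support) and accelerate libraries, plus access to the gated google/gemma-3-27b-it repo for the config; the keyword bucketing is approximate and only for illustration.

```python
from collections import defaultdict

from accelerate import init_empty_weights
from transformers import AutoConfig, Gemma3ForConditionalGeneration

# Instantiate the model on the "meta" device so no weights are downloaded;
# parameter shapes (and hence counts) are still available.
config = AutoConfig.from_pretrained("google/gemma-3-27b-it")
with init_empty_weights():
    model = Gemma3ForConditionalGeneration(config)

# Bucket parameters by component using crude name matching (approximate:
# e.g. the tiny q/k norms inside attention land in the LayerNorm bucket here).
buckets = defaultdict(int)
for name, param in model.named_parameters():
    if "norm" in name.lower():
        buckets["LayerNorm"] += param.numel()
    elif "mlp" in name:
        buckets["Feed-Forward"] += param.numel()
    elif "attn" in name or "attention" in name:
        buckets["Attention"] += param.numel()
    elif "embed" in name:
        buckets["Embedding"] += param.numel()
    else:
        buckets["Other"] += param.numel()

total = sum(buckets.values())
print(f"Total trainable parameters: {total:,}")
for component, count in sorted(buckets.items(), key=lambda kv: -kv[1]):
    print(f"{component:<13} {count:>17,} {count / total:7.2%}")
```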
🔥4👍1🤯1
Every time my model training is almost done
😁21