Microsoft just open-sourced a blazing-fast inference framework for 1-bit LLMs. Now you can run LLMs of up to 100B parameters locally on your CPU (GPU and NPU support is coming) and get 5-7 tokens/second.
https://github.com/microsoft/BitNet
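The trick behind the speed is BitNet's 1.58-bit weights: every weight is -1, 0, or +1, so matrix multiplication collapses into additions and subtractions. A toy sketch of the arithmetic idea in Python (the real bitnet.cpp kernels use packed weights and SIMD lookup tables, so treat this as illustration only):

```python
import numpy as np

def ternary_matvec(W_ternary, x):
    """Matrix-vector product with ternary weights in {-1, 0, +1}.

    Each output element is just a sum/difference of inputs -- no
    floating-point multiplies, which is why CPU inference gets cheap.
    """
    pos = W_ternary == 1    # positions contributing +x[j]
    neg = W_ternary == -1   # positions contributing -x[j]
    return np.where(pos, x, 0.0).sum(axis=1) - np.where(neg, x, 0.0).sum(axis=1)

# Quantize a float matrix to {-1, 0, +1} with absmean rounding
# (as described in the BitNet b1.58 paper) and compare results.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
scale = np.abs(W).mean()
Wq = np.clip(np.round(W / scale), -1, 1)
x = rng.normal(size=8)
print(ternary_matvec(Wq, x) * scale)  # approximates W @ x
print(W @ x)
```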
Brilliant article on Arm's approach for the future, its challenges (AI; Qualcomm and Apple turning into competitors; AMD and Intel joining forces to improve x86), and a mini-interview with Chris Bergey, senior VP and GM of the Client Line of Business at Arm. A very good read.
https://www.theregister.com/AMP/2024/10/22/arm_custom_silicon_interview/
The Register
As Arm rivals cook up custom silicon, Mediatek sticks to tried-and-true Cortex recipe
Exec Chris Bergey tells us what the chip designer is doing to stay competitive
Thomas Eugene Kurtz, co-creator of BASIC, died peacefully today in his retirement home. He was 96. Thank you, Thomas, for enabling at least two generations of human beings to learn and use a personal computer.
This is truly the best time for retro gaming and emulation. Ten years ago I would have sworn we'd never see a working emulator for extremely complex systems such as the PS3 and its IBM Cell, a PowerPC-based CPU. Now RPCS3 runs even on ARM64 CPUs, including Apple Silicon, opening up PS3 emulation to extremely portable devices.
For more information on the specific RPCS3 case, read this blog post on their website.
Interesting article on AMD's quite incredible feat of running plain, standard C++ code on their GPUs for AI workloads, threatening the de-facto monopoly Nvidia holds with CUDA and its GPUs (and the biggest market cap in the world).
https://www.phoronix.com/news/AMD-Standard-C-Code-GPUs
Phoronix
How AMD Is Taking Standard C/C++ Code To Run Directly On GPUs
Back at the 2024 LLVM Developers' Meeting was an interesting presentation by AMD engineer Joseph Huber for how they have been exploring running common, standard C/C++ code directly on GPUs without having to be adapted for any GPU language / programming dialects…
The research paper "Attention Is All You Need" from the Google Research team that came up with transformers (which radically changed and improved artificial intelligence and gave rise to genAI as we know it) might have found its successor.
https://arxiv.org/abs/2501.00663
Some good discussion and interpretation of results can be found in the comments section here.
arXiv.org
Titans: Learning to Memorize at Test Time
Over more than a decade there has been an extensive research effort on how to effectively utilize recurrent models and attention. While recurrent models aim to compress the data into a fixed-size...
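In a nutshell, and per my reading of the abstract: Titans add a neural long-term memory that keeps learning at inference time. Each new input updates the memory by gradient descent on an associative loss, so "surprising" inputs get memorized harder. A heavily simplified sketch of that idea, with a linear memory standing in for the paper's neural module:

```python
import numpy as np

class TestTimeMemory:
    """Minimal sketch of Titans-style test-time memorization (my reading
    of the paper, heavily simplified: a linear memory instead of an MLP).

    Each (key, value) pair updates the memory by gradient descent on an
    associative loss ||M k - v||^2. The gradient acts as a "surprise"
    signal: surprising inputs change the memory more. Momentum and a
    forget (decay) term follow the paper's update rule in spirit.
    """
    def __init__(self, dim, lr=0.1, momentum=0.9, forget=0.01):
        self.M = np.zeros((dim, dim))
        self.S = np.zeros((dim, dim))  # momentum term ("past surprise")
        self.lr, self.momentum, self.forget = lr, momentum, forget

    def write(self, k, v):
        err = self.M @ k - v           # prediction error
        grad = np.outer(err, k)        # d(loss)/dM, the "surprise"
        self.S = self.momentum * self.S - self.lr * grad
        self.M = (1 - self.forget) * self.M + self.S

    def read(self, k):
        return self.M @ k

mem = TestTimeMemory(dim=4)
k, v = np.array([1., 0, 0, 0]), np.array([0., 1, 0, 0])
for _ in range(50):
    mem.write(k, v)
print(mem.read(k))  # converges toward v
```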
Imagine playing a simple game of Hangman with an AI.
You'd think that explaining the game mechanics step-by-step would be enough, but my favorite benchmark test for Large Language Models (LLMs) has consistently highlighted their limitations.
Most LLMs can't successfully play Hangman - even after a detailed explanation. They end up accepting all guesses as correct, fail to understand the turn-based nature of the game, and often mix up who should come up with the blank word or provide guesses. In my recent tests, among the commercially available LLMs, only ChatGPT 4o has been able to play Hangman correctly.
This quirky but telling failure shows that while these models can seem incredibly smart, there's still a big gap between surface-level understanding and actually grasping logical, step-by-step processes. Read more about it in my latest post on Substack.
Substack
Hangman and Circles
How LLMs struggle with Simple Reasoning
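If you want to reproduce the benchmark yourself, here's a minimal harness in the spirit of my test. The ask_llm(prompt) -> str function is a placeholder for whatever chat API you use (hypothetical, wire it up yourself); the harness holds the secret word, so the model only has to guess:

```python
import re

def play_hangman(secret, ask_llm, max_wrong=6):
    """Toy Hangman harness. `secret` is a lowercase word; `ask_llm` is a
    placeholder for your model's chat API. Returns True if the model wins."""
    guessed, wrong = set(), 0
    while wrong < max_wrong:
        masked = " ".join(c if c in guessed else "_" for c in secret)
        if "_" not in masked:
            return True  # word fully revealed
        prompt = (f"We are playing Hangman. The word so far: {masked}. "
                  f"Wrong guesses left: {max_wrong - wrong}. "
                  f"Already guessed: {sorted(guessed)}. "
                  "Reply with exactly one new letter.")
        reply = ask_llm(prompt)
        m = re.search(r"[a-z]", reply.lower())
        letter = m.group(0) if m else "?"
        if letter in guessed:      # repeated guess: a common failure mode
            wrong += 1
        elif letter in secret:
            guessed.add(letter)
        else:
            guessed.add(letter)
            wrong += 1
    return False
```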
New post on Substack. Just in time for a good lunch read, I suppose. Don't miss it!
https://atsetilam.substack.com/p/the-fabled-ai-on-the-edge
Asahi Lina left the Asahi Linux project for “personal reasons”. We lost a fantastic developer who was focused on incredibly difficult and extremely important work - bringing Linux to the Apple Silicon platform. Such a pity. Luckily, she says she is “safe”.
"Model Context Protocol (MCP) is an AI tool calling standard that has been rapidly gaining adoption over the past few months. MCP tools give LLMs a standardized way to call functions, look up data, and interact with the world. Anthropic created the protocol and built the first GitHub MCP server, which grew to be one of the most popular MCP servers in the expanding ecosystem. We are excited to take ownership of the server and continue its development."
https://github.blog/changelog/2025-04-04-github-mcp-server-public-preview/
The GitHub Blog
github-mcp-server is now available in public preview - GitHub Changelog
Today we’re releasing a new open source, official, local GitHub MCP Server. We’ve worked with Anthropic to rewrite their reference server in Go and improve its usability. The new server…
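For context on what that "standardized way to call functions" means on the wire: MCP is JSON-RPC 2.0, with tools discovered via the tools/list method and invoked via tools/call. A sketch of a request (the tool name and arguments are made up for illustration - check the github-mcp-server docs for its actual tool list):

```python
import json

# An MCP tool invocation is a JSON-RPC 2.0 message with method "tools/call".
# "search_issues" and its arguments are hypothetical here.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_issues",                      # hypothetical tool name
        "arguments": {"query": "is:open label:bug"},  # tool-specific args
    },
}
print(json.dumps(request, indent=2))
```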
Microsoft has released a tool under the MIT license to convert your .docx files (and other office documents) into Markdown.
https://github.com/microsoft/markitdown
GitHub
GitHub - microsoft/markitdown: Python tool for converting files and office documents to Markdown.
Python tool for converting files and office documents to Markdown. - microsoft/markitdown
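Usage is pleasantly minimal - roughly this, going by the README (API details may have changed since):

```python
from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("report.docx")  # also handles pptx, xlsx, pdf, html, ...
print(result.text_content)          # the Markdown output
```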
OpenAI has just released a FOSS CLI tool for “developers who already live in the terminal and want ChatGPT‑level reasoning plus the power to actually run code, manipulate files, and iterate – all under version control”.
https://github.com/openai/codex
GitHub
GitHub - openai/codex: Lightweight coding agent that runs in your terminal
Lightweight coding agent that runs in your terminal - openai/codex
Babe wake up, Microsoft released a 1-bit LLM under MIT that is optimized for running on CPUs: microsoft/bitnet-b1.58-2B-4T
Salvatore Sanfilippo (aka antirez) is back!
After stepping away from Redis for some time, he's returned with a major contribution: a brand-new data type called vector sets. This addition brings semantic similarity search to Redis, making it possible to query based on meaning rather than exact matches.
Check it out: https://redis.io/blog/announcing-vector-sets-a-new-redis-data-type-for-vector-similarity
Redis
Announcing vector sets, a new Redis data type for vector similarity | Redis
Developers love Redis. Unlock the full potential of the Redis database with Redis Enterprise and start building blazing fast apps.
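From the announcement, the new type comes with its own command family: VADD stores an element together with its vector, and VSIM queries by similarity. A quick sketch using redis-py's generic command interface (syntax paraphrased from the blog post - check the Redis docs for the authoritative options):

```python
import redis  # pip install redis

r = redis.Redis()

# Store a few elements with 3-dimensional vectors in a vector set.
r.execute_command("VADD", "food", "VALUES", "3", "0.1", "0.8", "0.3", "apple")
r.execute_command("VADD", "food", "VALUES", "3", "0.1", "0.7", "0.4", "pear")
r.execute_command("VADD", "food", "VALUES", "3", "0.9", "0.1", "0.1", "steak")

# Query by similarity: nearest neighbours of a query vector.
print(r.execute_command("VSIM", "food", "VALUES", "3", "0.1", "0.75", "0.35"))
```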
A must-watch for anyone interested in the future of #AI. In this interview for NVIDIA Developer, Yann LeCun - Turing Award winner and Chief AI Scientist at Meta - shares his take (and a very robust contrarian opinion, IMHO) on the limits of today’s language models.
In the interview, he argues that LLMs (like OpenAI's GPT or Meta's LLaMA) are not the path to true artificial general intelligence. They're impressive, yes, but fundamentally constrained by the Transformer architecture. Scaling up won’t solve this. Why? Because human intelligence isn’t just about language or token prediction - it’s about understanding, reasoning, and interacting with the physical world through systems more akin to what psychologists call System 1 and System 2.
Think about a cat: it can leap with precision without any concept of physics, nor of the languages (mathematical and linguistic) which can explicitly explain physics. Ask your local orange alley cat about it, if you don't believe me.
LeCun also shares some striking numbers to highlight just how limited language-based learning really is:
• Human language processing has a very low data rate - roughly 12 bytes per second. That's about 4.5 words per second, with each word taking roughly 2.7 bytes (about 1.3 tokens at 2 bytes per token).
• Vision operates on an entirely different scale. Our two optical nerves transmit a combined stream of roughly 20 megabytes per second, based on the million fibers in each nerve sending about 10 bytes per second.
• Over just four years of being awake, a child accumulates around a petabyte of visual experience - far more than the total training data of even the largest language models.
To put it plainly:
Visual perception carries roughly 1.6 million times more data per second than reading or listening to language, and a preschooler has already taken in 50 times more data than what goes into the largest text-trained LLMs.
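These figures check out against each other - a quick back-of-the-envelope computation using the rates quoted above:

```python
# Sanity-checking LeCun's numbers, as quoted above.
language_bps = 4.5 * 2.7             # ~4.5 words/s at ~2.7 bytes/word ≈ 12 B/s
vision_bps = 2 * 1_000_000 * 10      # two optic nerves, 1M fibers × 10 B/s each

seconds_awake = 4 * 365 * 16 * 3600  # four years at ~16 waking hours/day
visual_data = vision_bps * seconds_awake

print(f"vision/language ratio: {vision_bps / language_bps:,.0f}x")  # ~1.6 million
print(f"visual data over 4 years: {visual_data / 1e15:.1f} PB")     # ~1.7 PB
```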
LeCun’s takeaway is that the architecture modeling the data, and the structure of the data itself, matter just as much as the quantity of data. Self-Supervised Learning thrives on redundancy, and sensory inputs (especially vision) are packed with the kind of statistical richness that language alone can’t provide.
Most of what we know - and certainly what animals know - comes from experience, not explanation. Language is a brilliant tool, but it's the final layer, not the foundation.
I explored this in my own way in Hangman and Circles, where I reported some results from research by the Apple AI team, reflected on the limits of linguistic abstraction, and discussed why language may be misleading us in the pursuit of AGI:
https://t.me/bytebaibyte/19
This interview is amazing, and super-fun too - definitely worth your time:
https://youtu.be/eyrDM3A_YFc?si=oMiDKJAXUYIjfjIu
PS. I also found watching the ~2h interview on the Lex Fridman Podcast extremely interesting - it dives deeper into the topics mentioned during the Nvidia Developer interview.
YouTube
Frontiers of AI and Computing: A Conversation With Yann LeCun and Bill Dally | NVIDIA GTC 2025
As artificial intelligence continues to reshape the world, the intersection of deep learning and high performance computing becomes increasingly crucial. This talk brings together Yann LeCun, a pioneer in deep learning and the chief AI scientist at Meta,…
Also relevant: https://arxiv.org/abs/2503.21934
arXiv.org
Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad
Recent math benchmarks for large language models (LLMs) such as MathArena indicate that state-of-the-art reasoning models achieve impressive performance on mathematical competitions like AIME,...
Who would have thought we'd see PewDiePie advocating for Linux on the desktop before GTA 6?
YouTube
I installed Linux (so should you)
In the upcoming Ubuntu 25.10, Canonical plans to use an alternative to sudo that's being developed by the sudo-rs project and written in Rust. In March, a similar decision was made to replace GNU Coreutils with uutils, which is also written in Rust. There are currently initiatives under consideration to replace zlib and ntpd with zlib-rs and ntpd-rs.
https://www.phoronix.com/news/Ubuntu-25.10-sudo-rs-Default
Phoronix
Ubuntu 25.10 Plans To Use sudo-rs By Default For Memory-Safe, Rust-Based sudo
With Ubuntu 25.10 Canonical is planning to make use of more Rust-written system components and so far most of that talk has been about transitioning to Rust Coreutils 'uutils' in place of GNU Coreutils
via Valentina Lenarduzzi on LinkedIn:
"Our paper "Does #Microservices Adoption Impact the Velocity? A #Cohort Study" has been accepted at Empirical Software Engineering Journal
Microservices are often praised for improving development speed thanks to their modular and independent nature. But do they actually lead to faster feature delivery and bug fixing? In our latest study, we explored this question using a retrospective #Cohort design - a methodology widely used in medical research but still rare in software engineering.
What we did: We conducted the first large-scale empirical study comparing GitHub projects built with #Microservices from the start against similar monolithic projects, using a #Cohort study to assess causality, not just correlation.
What we found: Surprisingly, no statistically significant difference in development velocity was observed. Even after controlling for confounding variables, #Microservices adoption didn't show a measurable impact on how quickly projects deliver features or fix bugs.
Why it matters: This study not only challenges assumptions about #Microservices and velocity, but also introduces a powerful empirical methodology to our field. We're excited to contribute one of the first works applying cohort studies in software engineering research."
https://www.researchgate.net/publication/391482952_Does_Microservice_Adoption_Impact_the_Velocity_A_Cohort_Study
"Our paper "Does #Microservices Adoption Impact the Velocity? A #Cohort Study" has been accepted at Empirical Software Engineering Journal
Microservices are often praised for improving development speed thanks to their modular and independent nature. But do they actually lead to faster feature delivery and bug fixing? In our latest study, we explored this question using a retrospective #Cohort design - a methodology widely used in medical research but still rare in software engineering.
What we did: We conducted the first large-scale empirical study comparing GitHub projects built with #Microservices from the start against similar monolithic projects, using a #Cohort study to assess causality-not just correlation.
What we found: Surprisingly, no statistically significant difference in development velocity was observed. Even after controlling for confounding variables, #Microservices adoption didn't show a measurable impact on how quickly projects deliver features or fix bugs.
Why it matters: This study not only challenges assumptions about #Microservices and velocity, but also introduces a powerful empirical methodology to our field. We're excited to contribute one of the first works applying cohort studies in software engineering research.
https://www.researchgate.net/publication/391482952_Does_Microservice_Adoption_Impact_the_Velocity_A_Cohort_Study
ResearchGate
(PDF) Does microservice adoption impact the velocity? A cohort study
PDF | [Context] Microservices enable the decomposition of applications into small, independent, and connected services. The independence between services... | Find, read and cite all the research you need on ResearchGate
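For the curious: a cohort comparison of this kind boils down to contrasting a velocity metric across two matched groups of projects. A toy illustration of the statistics involved (entirely fake data, not the paper's actual analysis or dataset):

```python
import numpy as np
from scipy import stats

# Velocity metric per project -- say, median days to close an issue --
# for a microservices cohort and a matched monolith cohort.
rng = np.random.default_rng(42)
micro = rng.lognormal(mean=2.0, sigma=0.5, size=120)  # fake data
mono = rng.lognormal(mean=2.0, sigma=0.5, size=120)   # fake data

# Non-parametric test: do the two velocity distributions differ?
u, p = stats.mannwhitneyu(micro, mono, alternative="two-sided")
print(f"U={u:.0f}, p={p:.3f}")  # p > 0.05 -> no significant difference
```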