Round Numbers
I really don’t want to push my luck, but it seems we’ve reached a perfectly round number of subscribers. So let me tell you a story.
Many, many years ago, an Indian shah was bored. Then a wise man came and presented him with the game of chess. The shah was thrilled and offered the man anything he wanted. The wise man asked for as much rice as the shah could place on a chessboard using the following rule: on the first square, put one grain of rice; on the second, two grains; and so on. Each next square should contain twice as many grains as the previous one.
I don’t actually know how this story ends because, obviously, 2**64 is quite a big number, and the shah could not possibly give the wise man everything he had asked for.
But this story gives us exactly the picture I started this post with.
Yesterday our community covered one row of this proverbial chessboard. Today — one cell of the next row. That’s the strange property of fast-growing populations.
As far as I remember, about 30% of all people who have ever lived are alive right now. So if you hear the joke “100% of people who eat cucumbers died,” don’t trust it. No more than 70%.
I really don’t want to push my luck, but it seems we’ve reached a perfectly round number of subscribers. So let me tell you a story.
Many, many years ago, an Indian shah was bored. Then a wise man came and presented him with the game of chess. The shah was thrilled and offered the man anything he wanted. The wise man asked for as much rice as the shah could place on a chessboard using the following rule: on the first square, put one grain of rice; on the second, two grains; and so on. Each next square should contain twice as many grains as the previous one.
I don’t actually know how this story ends because, obviously, 2**64 is quite a big number, and the shah could not possibly give the wise man everything he had asked for.
But this story gives us exactly the picture I started this post with.
Yesterday our community covered one row of this proverbial chessboard. Today — one cell of the next row. That’s the strange property of fast-growing populations.
As far as I remember, about 30% of all people who have ever lived are alive right now. So if you hear the joke “100% of people who eat cucumbers died,” don’t trust it. No more than 70%.
👍1🤔1
School of Data Analysis. Agent intensive
There is an informational storm in the channels I read about agentic programming. The School of Data Analysis released a free course on agentic programming, and a lot of people are discussing it. Mostly, they are trying to find gaps in it.
For example:
🐞 Polyakov, tools and MCP or
🐞 Kovalsky, on Polyakov
The idea of this channel is to share things from my perspective, so I want to share my conspect of the course. These are the things that made a difference for me personally, given my current background in this area.
Let's start.
Agent
It's a totally unstable concept. Of course, you need an LLM (large language model), some prompts to guide it, tools to call, memory, guardrails (for safety), and planning skills (to fill the gap in the LLM's ability to make plans).
For me, guardrails and planning skills are the most interesting things to hear about.
LLM
The lecture said that an LLM is basically two files. One huge file with parameters, and one small program that runs it. For me, this statement is important because it demystifies the technology. Just two files. That's it.
Karpaty's LLM OS
I had read about this idea several times before. But now it really clicked.
In an agent, the LLM works like a CPU on a motherboard. It processes data in different modalities, acts through tools, and performs Input/Output operations. This "OS" thing sounds wrong to me. An OS is the first program that starts on your computer when you turn it on. A CPU on a motherboard suits this analogy much better.
Special tokens
You know, an LLM can't see letters. It sees tokens. Each token is a group of letters. This helps optimize both training and inference.
I already knew that. But for me there was still a gap between the JSON I send to the LLM API and the array of tokens actually fed into the model. There is one element that makes this gap narrower: special tokens.
I knew there were special tokens to start and stop generation. But it turned out that there are also tokens for roles in conversations and for actions, like text translation.
There is an informational storm in the channels I read about agentic programming. The School of Data Analysis released a free course on agentic programming, and a lot of people are discussing it. Mostly, they are trying to find gaps in it.
For example:
🐞 Polyakov, tools and MCP or
🐞 Kovalsky, on Polyakov
The idea of this channel is to share things from my perspective, so I want to share my conspect of the course. These are the things that made a difference for me personally, given my current background in this area.
Let's start.
Agent
It's a totally unstable concept. Of course, you need an LLM (large language model), some prompts to guide it, tools to call, memory, guardrails (for safety), and planning skills (to fill the gap in the LLM's ability to make plans).
For me, guardrails and planning skills are the most interesting things to hear about.
LLM
The lecture said that an LLM is basically two files. One huge file with parameters, and one small program that runs it. For me, this statement is important because it demystifies the technology. Just two files. That's it.
Karpaty's LLM OS
I had read about this idea several times before. But now it really clicked.
In an agent, the LLM works like a CPU on a motherboard. It processes data in different modalities, acts through tools, and performs Input/Output operations. This "OS" thing sounds wrong to me. An OS is the first program that starts on your computer when you turn it on. A CPU on a motherboard suits this analogy much better.
Special tokens
You know, an LLM can't see letters. It sees tokens. Each token is a group of letters. This helps optimize both training and inference.
I already knew that. But for me there was still a gap between the JSON I send to the LLM API and the array of tokens actually fed into the model. There is one element that makes this gap narrower: special tokens.
I knew there were special tokens to start and stop generation. But it turned out that there are also tokens for roles in conversations and for actions, like text translation.
👍1
273
My head is about to break because of the School of Data Analysis course. Diagrams of interactions between MCP components give me nightmares. Let's talk about constants in physics.
Everyone knows that absolute zero is approximately -273 degrees Celsius. But what is the source of this constant? Is it experimental or theoretical?
A piece of totally impractical, but dear to me, knowledge is the following. If we take the melting point of water and its boiling point as reference points, divide the whole range into 100 equal parts using an expanding liquid like mercury as a measure, then a decrease in temperature by 1 degree Celsius leads to the gas shrinking by approximately 1/273 of its volume at 0 degrees Celsius. So absolute zero is the point at which the gas shrinks to nothing.
Phew. So much easier than two stage-embedding retrieval.
My head is about to break because of the School of Data Analysis course. Diagrams of interactions between MCP components give me nightmares. Let's talk about constants in physics.
Everyone knows that absolute zero is approximately -273 degrees Celsius. But what is the source of this constant? Is it experimental or theoretical?
A piece of totally impractical, but dear to me, knowledge is the following. If we take the melting point of water and its boiling point as reference points, divide the whole range into 100 equal parts using an expanding liquid like mercury as a measure, then a decrease in temperature by 1 degree Celsius leads to the gas shrinking by approximately 1/273 of its volume at 0 degrees Celsius. So absolute zero is the point at which the gas shrinks to nothing.
Phew. So much easier than two stage-embedding retrieval.
👍2
Shad Intensive. Memory. Guardrails.
Just want to share that I'm listening to this course. With a delay, but I'm trying to eat this mammoth piece by piece.
Today, in the "Memory and Guardrails" lecture, I didn't hear anything that gave me an insight I would like to share. Just common words about context length and context compaction in the memory part. In the guardrails part, they mentioned sources of danger like every user input, RAG, and API. I believe this is the usual computer security paranoia: you can't trust anyone. And it's much better to turn your computer off, drop it in liquid cement, and let it set.
The only thing that really interests me is not what I understood, but what I didn't. Surprisingly, I don't quite get this "context window" concept. Probably it's just my hallucination. If not, and if there is something interesting here, I'll share it.
Just want to share that I'm listening to this course. With a delay, but I'm trying to eat this mammoth piece by piece.
Today, in the "Memory and Guardrails" lecture, I didn't hear anything that gave me an insight I would like to share. Just common words about context length and context compaction in the memory part. In the guardrails part, they mentioned sources of danger like every user input, RAG, and API. I believe this is the usual computer security paranoia: you can't trust anyone. And it's much better to turn your computer off, drop it in liquid cement, and let it set.
The only thing that really interests me is not what I understood, but what I didn't. Surprisingly, I don't quite get this "context window" concept. Probably it's just my hallucination. If not, and if there is something interesting here, I'll share it.
Shad Intensive. Evaluation.
It seems I’ll be eating this mammoth in small pieces for ages. So for now, here’s a link to a nice post on agent evaluation
It seems I’ll be eating this mammoth in small pieces for ages. So for now, here’s a link to a nice post on agent evaluation
Telegram
Поляков считает: AI, код и кейсы
Как тестировать AI-агентов: на полях лекций в ШАД
Продолжаю Agents Week от ШАД. Четвёртая лекция — как проверять качество агентов. Тема, которую все откладывают и которая больше всего бьёт по репутации ИИ в проде или интегратора.
📋 Что советует лекция
…
Продолжаю Agents Week от ШАД. Четвёртая лекция — как проверять качество агентов. Тема, которую все откладывают и которая больше всего бьёт по репутации ИИ в проде или интегратора.
📋 Что советует лекция
…
Are we doing meme shitposting today?
Anonymous Poll
81%
Memes! Yeah!
6%
I'll unsubscribe immediately
0%
I've already unsubscribed
25%
TGIF!!!
TGIF. Meme
First of all, let's check the results of the poll. One subscriber promised to unsubscribe, and the other 9 voted for memes and shitposting. A naive approach would be to think that if I publish the meme, I'll lose 1 subscriber. But you can solve the proportion, and it gives -27.6 subscribers. The result is stunning, so, let's see. I expect to drop to 248.4 subscribers.
The topic of today's Friday meme is The Soup.
First of all. I stole these memes from The Wizard . I think that it is a magnificent channel and I ask you to promote it as widely as you can.
And now to the chase.
Dad's soup
My dad cooks absolutely hellish food.
It’s a sort of averaged recipe, because there are lots of variations.
He takes soup — but reheating it is not my dad’s style.
He pours the soup into a frying pan and starts frying it.
He adds a huge amount of onion, garlic, tomato paste, flour for thickness, and mayonnaise on top.
The whole thing fries until smoke starts coming out.
Then he takes it off the heat, lets it cool on the balcony, brings it back, pours on even more mayonnaise, and starts eating.
He eats straight from the pan, scraping it with a spoon, muttering under his breath, “oh, damn.”
Sweat is standing on his forehead.
Sometimes he politely offers me some, but I refuse.
Needless to say, the aftermath is monstrous.
The stench is so intense that the wallpaper peels off the walls.
P.S. To be honest, all this subscribe/unsubscribe stuff is starting to get to me. I’d really appreciate some support — even a couple of emojis wouldn’t hurt.
P.P.S. Tomorrow I’ll try to pull myself together and write something clever. Probably continue the “Titanic” line.
First of all, let's check the results of the poll. One subscriber promised to unsubscribe, and the other 9 voted for memes and shitposting. A naive approach would be to think that if I publish the meme, I'll lose 1 subscriber. But you can solve the proportion, and it gives -27.6 subscribers. The result is stunning, so, let's see. I expect to drop to 248.4 subscribers.
The topic of today's Friday meme is The Soup.
First of all. I stole these memes from The Wizard . I think that it is a magnificent channel and I ask you to promote it as widely as you can.
And now to the chase.
Dad's soup
My dad cooks absolutely hellish food.
It’s a sort of averaged recipe, because there are lots of variations.
He takes soup — but reheating it is not my dad’s style.
He pours the soup into a frying pan and starts frying it.
He adds a huge amount of onion, garlic, tomato paste, flour for thickness, and mayonnaise on top.
The whole thing fries until smoke starts coming out.
Then he takes it off the heat, lets it cool on the balcony, brings it back, pours on even more mayonnaise, and starts eating.
He eats straight from the pan, scraping it with a spoon, muttering under his breath, “oh, damn.”
Sweat is standing on his forehead.
Sometimes he politely offers me some, but I refuse.
Needless to say, the aftermath is monstrous.
The stench is so intense that the wallpaper peels off the walls.
P.S. To be honest, all this subscribe/unsubscribe stuff is starting to get to me. I’d really appreciate some support — even a couple of emojis wouldn’t hurt.
P.P.S. Tomorrow I’ll try to pull myself together and write something clever. Probably continue the “Titanic” line.
❤4😁3🔥1
Titanic: family bonds
Nowadays, it is hard to imagine an ML person without a Jupyter notebook. So let’s think a little outside the box and consider other options.
Quite recently, I discovered an interesting option for CSV table analysis. It consists of two instruments: DBeaver and DuckDB. The former is a Swiss army knife for database access, and the latter is a plugin that is quite powerful, or so I was told, but can also be used to open CSV files.
So, you can create a new DuckDB connection in DBeaver (ducks and beavers, yeah...), select the titanic.csv file you got from Kaggle, and start running queries.
SibSp is a factor, so to speak, horizontal in the ancestry tree. It is the sum of siblings and spouses. Parch is vertical in these coordinates: parents and children.
The first two queries show us the strength of these factors separately. The third uses the nested query technique to produce a derived factor, family size. One can see that these factors are useful, but the derived factor gives a stronger depletion/enrichment effect. This time, we were lucky enough to engineer a new feature.
I would say that family sizes 2, 3, and 4 vote for survival; the others vote against it.
Nowadays, it is hard to imagine an ML person without a Jupyter notebook. So let’s think a little outside the box and consider other options.
Quite recently, I discovered an interesting option for CSV table analysis. It consists of two instruments: DBeaver and DuckDB. The former is a Swiss army knife for database access, and the latter is a plugin that is quite powerful, or so I was told, but can also be used to open CSV files.
So, you can create a new DuckDB connection in DBeaver (ducks and beavers, yeah...), select the titanic.csv file you got from Kaggle, and start running queries.
SibSp is a factor, so to speak, horizontal in the ancestry tree. It is the sum of siblings and spouses. Parch is vertical in these coordinates: parents and children.
The first two queries show us the strength of these factors separately. The third uses the nested query technique to produce a derived factor, family size. One can see that these factors are useful, but the derived factor gives a stronger depletion/enrichment effect. This time, we were lucky enough to engineer a new feature.
I would say that family sizes 2, 3, and 4 vote for survival; the others vote against it.
👍1
Let’s rock ROC
All this time, it bothered me that the ROC in the post contains only three points. I’ve been thinking about this situation for a while, and I think I’ve found a small but genuinely new idea. I’m going to explain it slowly in a series of upcoming posts. But first, I want to check whether this thought is actually trivial. Please vote in the poll below, and feel free to leave comments.
Quick recap:
We have a model that gives scores of -1, 0, and 1. When we plot the ROC curve, we get a graph with 4 points and an AUROC of 56%.
The question is: can we calculate the dispersion of AUROC in this case?
All this time, it bothered me that the ROC in the post contains only three points. I’ve been thinking about this situation for a while, and I think I’ve found a small but genuinely new idea. I’m going to explain it slowly in a series of upcoming posts. But first, I want to check whether this thought is actually trivial. Please vote in the poll below, and feel free to leave comments.
Quick recap:
We have a model that gives scores of -1, 0, and 1. When we plot the ROC curve, we get a graph with 4 points and an AUROC of 56%.
The question is: can we calculate the dispersion of AUROC in this case?
Telegram
Algorithms. Physics. Mathematics. Machine Learning.
Titanic. Age.
We started talking about the Titanic dataset. Let's discuss the Age factor. We saw that Sex, Pclass and Embarked are strong features and to study them we used that these features are categorical with low cardinality. Age is different. It's…
We started talking about the Titanic dataset. Let's discuss the Age factor. We saw that Sex, Pclass and Embarked are strong features and to study them we used that these features are categorical with low cardinality. Age is different. It's…
Can we calculate the dispersion of AUROC in this case?
Anonymous Poll
18%
Of course, it's 0. Obviously.
55%
AUROC is a metrics, it has no dispersion
36%
Your question is both stupid and offending
27%
Ouch. I think I have an idea. (And will share in comments)
27%
Shut up and post more memes.