2025 in a nutshell. Kinda...
As the owner of this channel and its promoter, I hang out in a lot of Telegram channels. Right now they are full of posts with promises for the coming year, in tones ranging from very proud (old school) to overly humble (an attempt to follow trends and be original). I definitely don’t want to make the gods laugh, so no promises from me. Unfortunately, all this started a process in my brain, and here we are.
Picture. Activity per week on vocabulary.com in terms of “mastered words”. It doesn’t mean I know all these words already, but at least my brain was exposed to them. It’s my longest streak on the platform, and I hope to finish the first epoch by summer 2026.
I taught a two-semester ML course at Russian-Armenian University (RAU) and expelled one student from it. Two students from my group asked me to be their thesis supervisor → +2 defended graduates under my supervision. A similar story with the School of Data Analysis: two students from MIPT → +2 graduated students with “excellent” marks. Quite rare for MIPT.
Overall: 5 graduated students → 9. +80%.
Probably because my focus drifted from my main work toward educational activities, I was thrown out of the Yandex Education department. Nevertheless, the project on metrics for educational projects was completed. In three months, I came up with quality metrics and orchestrated the work of six people.
I landed on my feet in another department after four months of an undefined state. Now I can proudly say: “I work at Yandex Delivery.”
That’s it for my solo part. My wife and I successfully found a position for her as a prompt engineer. A pinch of luck and a year of hard work.
And this channel. I started it as an exercise in English. But the more I think about it, the more it looks like an attempt to build a community. There are no random people here. Now we have 81 souls who like computers, programming, math, and education. Thank you very much for being here. Feel free to communicate — I’ve opened a chat connected to the channel.
I don’t think the coming year will be easy, but I hope it will be interesting. And I hope we can all support each other with our knowledge and total awesomeness.
Water jets
In my childhood library there were many DIY books. One of them, written for young pioneers, described a small setup that demonstrates how jet velocity depends on pressure.
At some point I managed to find three tin cans, cut off their lids, and weld them into a single tube. But for some reason the project never went any further. I can’t recall the exact cause. Most likely I failed to make the seams perfectly watertight, and the perfectionist inside me didn’t allow the experiment to be finished. Which is a pity — now I understand that the setup was literally one step away from completion, and the seams didn’t need to be completely watertight in the first place.
The column of cans survived for decades, but after many moves — including our recent relocation to Yerevan — it was lost for good. Now, with a 3D printer at hand, I want to recreate this toy. For now, however, there are no real photos of the device.
The picture at the beginning of this article is pure neuroslop. What exactly is wrong with it? Share your thoughts in the comments.
Pouring-out velocity
When solving physics problems at school, I used a standard formula for the velocity of a jet pouring out of a hole, given the pressure inside the liquid and its density. But for a long time I couldn’t really derive this expression myself. And now I understand why.
The picture was inconsistent.
The equations themselves are trivial. On one hand, one can write the energy as pV. On the other hand, kinetic energy is ½mv². So one writes pV = ½mv². Using ρ = m/V, we immediately get a nice expression for the jet velocity (the whole thing is in the picture).
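For reference, here is the same chain of formulas written out; with hydrostatic pressure p = ρgh it reduces to Torricelli’s classic result:

\[
pV = \tfrac{1}{2}mv^2,\qquad \rho = \frac{m}{V}
\quad\Longrightarrow\quad
v = \sqrt{\frac{2p}{\rho}} = \sqrt{2gh}.
\]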
The problem is conceptual. There is no such volume whose one side is under pressure p, whose other side is under pressure 0, and which simultaneously accelerates from v = 0 to v = v0. In school and even at university I felt this discrepancy intuitively, and because of that the derivation always felt like a trick rather than understanding.
Twenty-five years later, something suddenly clicked. It seems I got it. An illusion, of course. Let me share it.
First: the streamtube.
A streamtube is an imaginary tube with an inlet, an outlet, and walls. The walls are formed by streamlines — curves tangent to the velocity of the fluid at every point. This construction is powerful because it lets us write an energy balance between inlet and outlet.
Crucially, there is no work done on the walls: pressure forces are perpendicular to the walls, motion is tangent to them, and the dot product of perpendicular vectors is zero.
So what is really going on?
Deep inside the liquid, the fluid is almost at rest and the pressure is maximal: 𝑣=0, 𝑝=𝑝0. At the center of the hole, the pressure is zero and the fluid has some non-zero velocity 𝑣, which we want to find.
One quantity smoothly transforms into the other as we move from the bulk of the liquid toward the hole. Let’s consider a streamtube that goes from the center of the liquid volume to the hole. Now everything makes sense.
The inlet of the streamtube is much wider than the outlet. That automatically gives us (almost) zero velocity at the inlet and a finite velocity at the outlet. We are no longer talking about a tiny abstract volume — we are considering the entire volume of liquid inside the streamtube.
This volume accelerates under the action of pressure gradients. It plays two roles at once:
◦it is a body that accelerates under force, and
◦it is a lever that multiplies liquid velocity by the ratio of inlet to outlet cross-section.
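The “lever” role is just incompressibility (the standard continuity relation): the volume flux through every cross-section of the streamtube is the same, so

\[
A_{\mathrm{in}}\,v_{\mathrm{in}} = A_{\mathrm{out}}\,v_{\mathrm{out}}
\quad\Longrightarrow\quad
v_{\mathrm{in}} = v_{\mathrm{out}}\,\frac{A_{\mathrm{out}}}{A_{\mathrm{in}}} \approx 0
\quad\text{when}\quad A_{\mathrm{in}} \gg A_{\mathrm{out}}.
\]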
The resulting picture is a bit crazy: a mushroom-like shape, with many tubes narrowing from the “cap” to the “stem.” Liquid accelerates along these tubes; work done at the beginning is exactly balanced by work at the end of each tube. Complex — but internally consistent.
Basically, we have followed the university-level route that usually leads to the Bernoulli integral — but done it at a school level. And now we have a “magic-free” derivation of the pouring-out jet velocity.
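Spelled out as the energy balance along the streamtube (the Bernoulli relation), with p = p0 and v ≈ 0 at the wide inlet deep in the liquid, and pressure 0 at the outlet:

\[
p + \tfrac{1}{2}\rho v^2 = \mathrm{const}:
\qquad
p_0 + 0 = 0 + \tfrac{1}{2}\rho v^2
\quad\Longrightarrow\quad
v = \sqrt{\frac{2p_0}{\rho}}.
\]

The same formula as before, but now every term belongs to a consistent picture.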
And this is just moving liquid. In the simplest possible case: liquid pouring out of a hole.
But then come vortices. Plasma. Mass transfer coupled with electromagnetic fields…
Yummy craziness.
Not exponential. This time.
Guys, there’s always an exponential lurking somewhere. When the rate is proportional to the amount, the amount decays exponentially: a capacitor discharging through a resistor, detergent washing out of your sweater, a foul odor leaving the room when you open a window.
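In symbols, taking the capacitor as the standard example:

\[
\frac{dQ}{dt} = -\frac{Q}{RC}
\quad\Longrightarrow\quad
Q(t) = Q_0\,e^{-t/RC},
\]

which approaches zero but never reaches it.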
But not this time.
In the previous post we derived the velocity of a jet flowing out of a hole given the pressure and the liquid density. And instead of the usual “rate is proportional to the quantity” we get a small correction: the rate is proportional to the square root of the quantity. A tiny change — and the whole solution behaves differently. Now Achilles can finally reach the tortoise in a finite time.
It always fascinated me that exponential decay is kind of contradictory. It’s blazingly fast (geometric progression!), and at the same time infinitely slow — because it never reaches exactly zero.
Here, when the outflow isn’t limited by viscosity but by inertia, the exponential law turns into a parabola — and the liquid leaves the vessel in finite time.
In the picture you can see the full derivation.
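For readers without the picture, a compact version of the standard derivation (presumably what the picture shows): let h be the liquid level, A the vessel cross-section, a the hole cross-section. Then

\[
A\,\frac{dh}{dt} = -a\sqrt{2gh}
\quad\Longrightarrow\quad
\frac{d\sqrt{h}}{dt} = -\frac{a}{A}\sqrt{\frac{g}{2}},
\qquad
\sqrt{h(t)} = \sqrt{h_0} - \frac{a}{A}\sqrt{\frac{g}{2}}\,t,
\]

so h(t) is a parabola in t and hits zero at the finite time T = (A/a)·√(2h0/g).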
I totally forgot this funny fact from university physics. Thank you, Nikita, for bringing up this question in our conversation!
New school level physics problem
There is an old physics problem: two holes in the wall of a jar with liquid, the first at depth x, the second at depth y. You are to find the horizontal distance from the jar wall to the point where the two jets intersect, and the vertical distance from the liquid surface to that intersection point.
It’s an ancient problem — you can find versions of it already in Torricelli’s works, where he derived the expression for jet velocity.
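For reference, the classical answer (a standard result, obtained by combining Torricelli’s v = √(2g·depth) with projectile motion):

\[
X = 2\sqrt{xy}, \qquad Y = x + y,
\]

where X is the horizontal distance from the wall and Y the depth of the intersection point below the surface.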
But if you slightly extend it — add one more hole at depth z, assume all holes have the same cross-section, say that the two jets merge after they meet, and then ask for the intersection point of the merged jet with the third one — hurray! You get a new physics problem, one that (as far as I know) isn’t in schoolbooks yet.
The answer to this new problem is not as simple and elegant as in the original one, but it’s still totally doable. And it’s also a nice excuse to discuss which conservation laws you can apply to the “merging” process — and which ones you can’t, and why.
For TFWR solutions discussions
In the comments to this message we are conducting code reviews of The Farmer Was Replaced programs.
NotebookLM
I’m working on a presentation about gradient-boosted decision trees. It’s definitely not my first rodeo: I’ve made decks in LaTeX/Beamer, Word, PowerPoint… probably a few other things too.
These days it feels almost wrong not to use AI tools for a task like this. After a bit of “deep research” I ended up with a shortlist of things to try, and decided to focus on NotebookLM.
To be fair, NotebookLM wasn’t my starting point. I’d already spent some time in Visual Studio Code writing a gbdte.md file and collecting visuals from old presentations, Jupyter notebooks, and some fresh sketches.
The first attempt was surprisingly good. I uploaded my materials into the left panel, found the Slide deck button on the right, clicked it — and got an automatically generated presentation that looked nice, had a consistent style, and the English was pretty decent.
The downside: everything is basically baked into images. You can’t tweak a single formula, fix one label, or move one arrow. In NotebookLM the only real control knob is the prompt — so you rewrite the prompt and pull the one-armed bandit lever again. I tried that a few times and didn’t like where it went.
So my final workflow is… kind of dumb. I screenshot slides from the NotebookLM deck, edit them in GIMP if needed, and paste the results into Google Slides.
I honestly don’t know if this is faster than building the presentation carefully, piece by piece, the way I did before the AGI era. But it’s a bit more fun — and it’s something you can do when you’re slightly tired, when the “proper” workflow feels like too much.
GBDTE LogLoss - learning curves
There have already been two posts about the synthetic LogLoss dataset; here is the latest. Let’s discuss an experiment with this dataset.
The dataset has two groups of static features:
📈 f1…f8: features with increasing uplift
📉 f9…f16: features with decreasing uplift
And there are extra features [1, t] to capture bias and trend.
Now to the picture. This is a learning curve: the dependence of loss on the number of stages (i.e., how many trees are already in the model). I drew this plot mostly for debugging. I expected the loss to drop for the first 16 steps, and my initial results didn’t match because of a few bugs. Now the curves look OK at first glance — but there are a couple of interesting details worth staring at.
Loss drops on train for steps 1…16 — then stops
For points 1…16, the train loss steadily goes down. After that it mostly stops — which is exactly what I expected.
At each stage I’m using a decision stump (a tree of height 1). Such a tree effectively uses one feature per step. Each new feature can add new information and reduce the loss. Once the useful variables are exhausted, there’s nothing left to squeeze out.
The train–test gap is huge
What I don’t fully understand is the big gap between train and test. It looks like overfitting.
It might be interesting to run the same setup with different parameters and see whether the gap can be reduced (learning rate / regularization / subsampling / minimum leaf size — all the usual knobs).
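A minimal sketch of such an experiment with off-the-shelf stumps (scikit-learn here, not my GBDTE code; the generated dataset is a stand-in for the synthetic LogLoss one, and all parameter values are illustrative):

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

# Stand-in dataset: 16 informative features, binary target.
X, y = make_classification(n_samples=4000, n_features=16,
                           n_informative=16, n_redundant=0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# Stumps: max_depth=1, so each stage effectively uses one feature.
gb = GradientBoostingClassifier(n_estimators=64, max_depth=1,
                                learning_rate=0.5, random_state=0)
gb.fit(X_tr, y_tr)

# Learning curves: loss after each stage, on train and on test.
train_curve = [log_loss(y_tr, p) for p in gb.staged_predict_proba(X_tr)]
test_curve = [log_loss(y_te, p) for p in gb.staged_predict_proba(X_te)]

# The usual knobs for the train-test gap: learning_rate, subsample,
# min_samples_leaf, and early stopping on n_estimators.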
A weird flat segment on the test curve around steps 8→9
Another thing: on train, the loss decreases at each step. But on test, there’s an almost horizontal segment between the 8th and 9th points. Why?
My first guess: the first 8 trees mostly exploit one group of features, and around step 9 the model “switches” and starts using the other group for the first time. But then the question becomes: why do those features generalize worse? Are they weaker, noisier, more correlated, or do they interact with the train/test split in a strange way?
So many interesting questions.
Group emblem
Nobody asked me, but I’ll tell you how the channel emblem was born: one quick pen sketch + a single prompt in ChatGPT.
It's an approximate version of the sign for telegram channel. Channel name is phys_math_dev. Topics physics, mathematics, development. Phi stands for physics, sum for mathematics and showel for development. I want you to come up with breathtaking logo on this theme. Main colors are red and gold with glossy look
And… that’s the funny part: it worked on the first try.
2D sort
In The Farmer Was Replaced there is a subgame in which you are to sort a 2D field. I tried several options and liked the 2D insertion-sort algorithm the most. It’s exactly what I want to talk about today.
First of all, let’s recall what we can do in the game. There is:
🌵 move(East|North|West|South) for movement;
🌵 till() to switch soil type;
🌵 plant() to, sorry, plant different, sorry, plants;
🌵 swap(East|North|West|South) to swap the current cell with a neighbouring cell in a given direction;
🌵 harvest() to harvest ripe plants.
This story is about cacti. You have an enormous bonus if you harvest the whole field at once — and it happens when all cacti are sorted.
What does “sorted” mean here?
🌵 For each cell: measure() <= measure(East) and measure() <= measure(North) (when those neighbours exist).
🌵 In other words: each row goes in non-decreasing left-to-right order and each column goes in non-decreasing bottom-to-top order.
Now let’s check the picture. In our algorithm we traverse the field right-to-left, top-to-bottom, and apply one “insertion” iteration to each new cactus we meet. For a new cactus a[p][q] the invariant is:
the subfield to the right and above is already sorted.
In one sorting iteration our task is to move a[p][q] to its place inside that already-sorted piece and not to break the order we already have.
At first I tried to reason directly:
if our cactus is lower than both of its upper and right neighbours, it should already be in place.
If it is taller than the right one but lower than the upper one, we swap with the right.
If it is taller than both neighbours… wow, wow, wow… stop.
Too many ifs. Hard to think about, hard to write a program — and we haven’t even started to handle borders (top row, rightmost column, missing neighbours).
When I stumbled upon this, I realized it reminded me of something… exactly: sift_down in a heap. And there is a clever trick one can use.
Two stages.
Stage 1: take up to three cells — current, right, and up (skip the neighbours that don’t exist) — and find the minimum value among them.
This is a very common mini-routine, so it’s easy to implement even with strict movement/swap restrictions.
Stage 2: make a decision.
If the minimum is at our current position — do nothing, we’re done.
Otherwise swap the current cell with the cell that holds the minimum, move to that swapped position, and repeat.
That’s it. No giant decision tree.
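Here is a sketch of that routine in TFWR-style code (my reconstruction rather than a verbatim program; the boundary checks via get_pos_x/get_pos_y/get_world_size are one way to skip missing neighbours):

def sift_down():
    n = get_world_size()
    while True:
        # Stage 1: find the minimum among current, right, up.
        best = None  # None means the current cell holds the minimum
        best_value = measure()
        if get_pos_x() < n - 1 and measure(East) < best_value:
            best = East
            best_value = measure(East)
        if get_pos_y() < n - 1 and measure(North) < best_value:
            best = North
        # Stage 2: either we are done, or swap with the minimum and repeat there.
        if best == None:
            return
        swap(best)
        move(best)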
Let’s check the complexity of this approach.
Worst case: T = O(n³).
We have n² elements and for each of them insertion can travel O(n) steps through the already-sorted region.
The bad news is that it can be quite slow on a totally shuffled field.
The good news is that insertion sort takes advantage of partial sortedness and won’t do unnecessary work.
2D insertion sort. Implementation.
In the previous post I described a nice 2D array sorting approach based on insertion sort. Today you can see how it behaves in practice — watch the video. I personally love this kind of algorithm visualization.
Just watching the drone already gives a couple of insights. It doesn’t always travel far from the insertion point — and that’s the key property of insertion sort: the work depends not only on n, but also on how “sorted” the data already is. So an almost-sorted field gets fixed surprisingly fast.
Now compare it with two other classic quadratic algorithms.
Selection sort and bubble sort don’t really care what’s inside — they keep scanning the whole unsorted part anyway. That’s why their basic “effort budget” is always about n·(n−1)/2 comparisons, no matter how lucky the input is.
And here we have a nice bridge from “toy programming” in The Farmer Was Replaced to serious computer science.
A famous real-world trick: quicksort is great, but deep recursion and tiny partitions are expensive. So many implementations stop quicksort early, leaving the array only almost sorted — and then run insertion sort as a final polish pass.
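A minimal illustration of that trick in plain Python (a sketch, not any particular library’s implementation; the cutoff value is arbitrary):

def hybrid_sort(a, cutoff=16):
    # Quicksort that stops early: partitions shorter than cutoff are left as-is.
    def quicksort(lo, hi):
        while hi - lo > cutoff:
            pivot = a[(lo + hi) // 2]
            i, j = lo, hi
            while i <= j:
                while a[i] < pivot:
                    i += 1
                while a[j] > pivot:
                    j -= 1
                if i <= j:
                    a[i], a[j] = a[j], a[i]
                    i += 1
                    j -= 1
            if j - lo < hi - i:   # recurse into the smaller half,
                quicksort(lo, j)  # keep iterating on the bigger one
                lo = i
            else:
                quicksort(i, hi)
                hi = j
    quicksort(0, len(a) - 1)
    # Final polish: one insertion-sort pass over the almost-sorted array.
    for k in range(1, len(a)):
        v = a[k]
        j = k - 1
        while j >= 0 and a[j] > v:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = v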
Code
Left hand rule in The Farmer Was Replaced
When I first solved the TFWR maze, I reached for DFS without thinking. But when I tried to explain the game to a less seasoned programmer, I realized DFS quietly assumes you’re comfortable with recursion, sets, visited states… not exactly “fun-first”.
So I finally focused on the classic left-hand maze traversal that TFWR guides keep mentioning. And for the first time in my life, I actually coded it.
The four situations (picture 1)
The idea is simple: keep your left hand touching the wall.
➜ Wall on the left, open ahead → go forward.
☛ Left and forward blocked, right open → turn right.
➽ Everything except backward blocked → turn back.
➳ The weird one: no wall on the left → you just moved forward and discovered an opening on the left. To “restore contact” with the wall, turn left and step into that passage.
Now the nice part: all four cases collapse into one tiny routine:
ᐉ Turn left once, then, while forward is blocked, turn right.
That’s it.
Let's look at the tools for this task.
d = [East, North, West, South]
Directions listed counterclockwise, starting from East at index 0. Don't use one-letter names in production, please.
dc = 0
dc = (dc + 1) % 4
dc = (dc - 1) % 4
Our initial direction is East; +1 turns counterclockwise, -1 clockwise. God bless Guido van Rossum: in Python, % 4 always gives numbers from 0 to 3 inclusive, even for negative operands. In C++ it would be slightly less straightforward.
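A quick sanity check of the semantics this relies on (regular Python, outside the game):

print((0 - 1) % 4)  # 3: Python's % is non-negative for a positive modulus
print((3 + 1) % 4)  # 0: one more left turn from South (index 3) wraps back to East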
All together now
d = [East, North, West, South]

# black magic to conjure the maze
plant(Entities.Bush)
substance = get_world_size() * 2**(num_unlocked(Unlocks.Mazes) - 1)
use_item(Items.Weird_Substance, substance)

dc = 0
while get_entity_type() != Entities.Treasure:
    dc = (dc + 1) % 4
    while not can_move(d[dc]):
        dc = (dc - 1) % 4
    move(d[dc])
harvest()
TFWR. Left hand maze traversal.
Yesterday I published the code for left-hand maze traversal. Today you can hang out and watch a video of how it works.
Microsoft scientists declared that they will be replaced by AI
archive.ph
Your AI partner
I want to dilute the hard stuff a little bit with shitposting. Let's compare our assistants and how they see us.
Prompt:
Draw the most honest possible picture of how I have treated you all this time.
It's my AI assistant. Feel free to post yours in comments.
You know, it's surprisingly… I don't know… touching… I would like to be a less demanding partner, to be honest.
Dict vs %
In the previous version of left-hand maze traversal the heavy lifting was done by %. It guarantees that when we turn left or right, our direction (a number 0, 1, 2, 3 stored in dc) stays in the 0-3 range. One can instead use a dict to map the current direction straight to the next one after a CW or CCW turn.
Let's compare:
old
dc = 0
while get_entity_type() != Entities.Treasure:
    dc = (dc + 1) % 4
    while not can_move(d[dc]):
        dc = (dc - 1) % 4
    move(d[dc])
new
dc = East
while get_entity_type() != Entities.Treasure:
    dc = l[dc]
    while not move(dc):
        dc = r[dc]
One more trick: move() does nothing (and returns False) when a wall is in front of us, so we can combine can_move and move.
Of course, it works because we introduced dictionaries:
l = {East:North, North:West, West:South, South:East}
r = {East:South, South:West, West:North, North:East}
GBDTE
It's quite hard to navigate the channel, so I created this navigation/summary post. It's about a pet project I started about ten years ago. The main idea is that we can use slightly modified gradient-boosted decision trees to both group objects and find trends for these groups.
📈beginning - the very first picture, the whole idea
📈credit scoring - problem statement, temporal instability
📈dataset - dataset preparation, YTsaurus vs Oracle
📈Vanilla GBDTE - experiment with math in instant view
📈Small MSE Dataset - the first approach to synthetic dataset for MSE GBDTE
📉Extracting components - how to get perfect components from chaotic signal
📉Leaves and components - check tree leaves and plot components
📉Evil of defaults - a debugging session, culprit - default parameters
📉Big MSE dataset - scatterplot with more clear "Gradient Boosting" message
📉LogLoss dataset - non-stationary dataset for binary classification
🎲Experiment on LogLoss dataset - first approach for running the algorithm on the dataset
🎲bad results - a very important mistake! Why you shouldn't use interpolation factors as extrapolating ones
🎲illustration for unstable class - a picture for a presentation
🎲learning curves LogLoss - learning curves for LogLoss case (non-stationary binary classification)