Liquid Nitrogen Station
An attentive reader of one of my previous posts, about the diamond plate in an electron microscope, may have noticed a repetitive note: pour liquid nitrogen here, pour liquid nitrogen there. Sounds like a lot of liquid nitrogen. And it is. When I worked at the Physical Institute of the Russian Academy of Sciences, it was an everyday workout: walking with a special Dewar flask and bringing 17 liters of liquid nitrogen back to the laboratory. About 13 liters went into pumping the air out of the microscope and cooling the specimen during the measurement. The remaining 4 liters evaporated overnight.
It's quite an interesting liquid, and when you have it in abundance, you can have a lot of fun. I heard that our colleagues froze ice cream with it. I myself conducted all the well-known experiments: freeze and cleave a piece of rubber, dip your hand into it. The most amazing one was to dunk a piece of the porous plastic foam that was widespread in the Soviet Union for packing scientific equipment. Under normal conditions it's a springy material: you can compress it and it restores its original shape. But after a bath in liquid nitrogen it becomes extremely brittle, and when you squeeze it, it crumbles into a small heap of fine powder.
One more experiment I conducted inadvertently. While I was pouring liquid nitrogen into the photodetector's Dewar, a narrow stream escaped and soaked my jeans. When I straightened my leg, my jeans cracked. Jeans. Cracked. It was fun.
Big MSE dataset for GBDT
In the previous post I demonstrated a small dataset that shows how GBDT works. Now I want to present quite a big one: there are 10 000 points in it.
I wasn't happy with the quality of the text in the small dataset, so I decided to repeat the algorithm and take more points this time.
While it's quite hard to place many points with a pen, it's easy to ask an iron friend to sample them.
At this point I got stuck for a while with an unexpected problem: I didn't like the fonts. I work in Linux, and mostly fonts don't bother me. Until now. I wanted a nice and interesting font for this task. I have no clue what "interesting" means here; probably a fat, rounded font. I don't know. And then a strange thing happened. When I googled something like "try different fonts online tool", I got nothing. Maybe Google banned me that day, or I was extremely unlucky, but every page I landed on was either non-functional or solved some other problem. At last, somehow, I got to the... ta-da... fonts.google.com page. But the chain of thought wasn't straight.
Then progress started to go faster. My iron friend quickly rendered the title and scattered 10 000 points across it. Separation into components wasn't hard either. So I ended up with a dataset of 10 000 points separated into 128 groups.
Let's pause here. There's quite an interesting thing about tree height and the number of groups. I want to talk about it slowly and clearly, so let's do it in the next post.
Big MSE dataset. Depth of trees.
There are 128 groups in the Big dataset, and to distinguish them perfectly it's necessary to use exactly 7 binary features. Why? Because when we add one more level to a decision tree, we turn every leaf into two by replacing it with a decision rule. So, if we start with a single root and build 7 levels on top of it, we get exactly 2^7 = 128 leaves: one leaf for each group.
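The counting argument above fits in a couple of lines of Python (a trivial sketch, just to pin down the numbers):

```python
import math

# A full binary tree of depth d has 2**d leaves, so the minimal depth that
# gives one leaf per group is ceil(log2(number of groups)).
def leaves(depth):
    return 2 ** depth

def min_depth(groups):
    return math.ceil(math.log2(groups))

# 128 groups: depth 7 is both necessary and sufficient, since 2**7 == 128
```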
This illustrates a common phrase about GBDT: "the height of the trees is the number of factors we want to work together." I had always heard this phrase, but it remained pure theory for me, because in a real-world dataset you have no idea how many factors should work together.
In this dataset we know the exact number of factors that must work together, seven, so we can test that intuition. And that's exactly the story these three plots tell. At the top, the trees have only two levels. The model has a limited ability to learn, and we see classic learning curves: the model grasps some generic knowledge and both train and test curves go down; then it starts to learn noise and the test curve goes up. Because of the low depth, the model can't fit the train set with good quality. It's exactly what we saw in the previous post on the "predicted values on train data" plot.
If we check the lower graph, we can see that with depth 7 the MSE becomes much lower and there is a better correspondence between the dataset and the model's inference.
The third image is about the learning rate: lr is 0.3 for the upper plots and 0.1 for the lower ones. You can see that the learning curves are less steep, but this doesn't help the asymptotic value of the train curve. When points are wrongly attributed to groups in the initial steps, there is no way to re-attribute them later.
GBDTE log‑loss dataset
In this post I want to solemnly declare: I'm not a mathematician. My friends who are, I'm totally sure, would solve this problem effortlessly with the Bayes formalism. I can only wave my hands.
So, what's the fuss?
I want to generate a synthetic dataset with properties similar to those of the initial fraudulent-users dataset, and I want to control how much information the features carry about the target value. Moreover, I want to introduce a new variable, t (time), and make the dataset's statistical properties time-dependent.
Because of my mathematical ineptitude, I rely on my intuition from physics problem-solving. First, let's draw the figure you see in the picture. The horizontal axis stands for the binary factor f (feature); the vertical axis stands for the binary target l (label). The height of the horizontal line that splits l=0 and l=1 has a well-defined meaning: it's the average value of our target, α. Let α = 0.5. That's the second equation; the first is a + b + c + d = 1.
Then we can think about the average value of the factor. I want to have 16 factors in my dataset, and I want them, in total, to give slightly less information than would allow 100% recovery of the target. So the average factor value β should be 1/16 = 0.0625. β is the coverage—how often the factor equals 1. Third equation.
And finally, the lift. It's a ratio: in the numerator is the probability that the target equals 1 when the factor equals 1, and in the denominator is the average target value. In terms of our variables, d/(d+b) is the average target when f = 1, and α is the average target value.
When lift = 1, the factor gives no information about the target. When it's > 1, it shows how much stronger our position is when using this factor. For example, a lift of 1.3 shows that we would catch 30% more credit‑fraud users when using this factor. It's convenient to use log‑lift: it is 0 when there is no gain and has the same sign as the correlation between the target and the factor.
For my dataset I want time to run from 0 to 1, and I want two groups of factors: with the lift going up and with the lift going down. The expressions for these lifts are quite simple:
lift_up = 0.25 + 0.5*t
lift_down = 0.75 - 0.5*t
Now we have four variables and four equations to determine them. I solved the system using Gaussian elimination, and the result is in the lower picture.
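Once the equations are ordered, the Gaussian elimination collapses into a closed form. Here is a minimal sketch, under my assumed quadrant convention (a = P(f=0, l=0), b = P(f=1, l=0), c = P(f=0, l=1), d = P(f=1, l=1)):

```python
ALPHA = 0.5    # average target value
BETA = 1 / 16  # factor coverage

def quadrants(lift):
    # lift = (d / (b + d)) / ALPHA together with b + d = BETA gives d directly
    d = lift * ALPHA * BETA
    b = BETA - d          # coverage constraint: b + d = BETA
    c = ALPHA - d         # mean-target constraint: c + d = ALPHA
    a = 1.0 - b - c - d   # all four probabilities sum to 1
    return a, b, c, d

# e.g. lift_up at t = 0.5 equals 0.25 + 0.5 * 0.5 = 0.5
a, b, c, d = quadrants(0.5)
```

A quick sanity check: at lift = 1 this gives d = ALPHA * BETA, i.e. the factor and the target are independent, exactly as a lift of one should mean.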
I'm going to implement these expressions in the synthetic dataset generation script. It's already done, but I wanted to recap the logic behind it. Next time: the dataset.
In my other channel I published a post about a physics exhibition which my father and I found in 1988. Quite exclusive material. In Russian.
Riddle of the day
"What number, when you remove one letter from its spelling, transforms into an even number?"
UPD: There are slightly different versions of this riddle, like "I'm an odd number. If I lose one letter, I become even. What number am I?"
When is a door not a door?
I first heard this joke in the 1997 animated movie Anastasia, and it’s stuck with me ever since. In the film it’s treated like one of those classic jokes everyone’s supposed to know.
Do you know the answer? 😏
Share your guess in the comments!
TFWR. Labyrinth
Let’s think about the labyrinth problem in The Farmer Was Replaced game.
First of all, let’s state it. We have an n×n labyrinth, where n = get_world_size(). A treasure sits at the position returned by measure(). Our current position is (get_pos_x(), get_pos_y()). We move our drone by issuing move(East|North|West|South) commands. Our task is to find the treasure, which we detect by the condition get_entity_type() == Entities.Treasure. We obtain information about the maze map through the boolean can_move(East|North|West|South) function.
To be honest, I have no clue why the “right hand” or “left hand” approach is so popular. I started with DFS (Depth-First Search), and I want to explain this approach in this post.
It seems quite popular to start with these “hand” approaches. But they fail when you try to gain more money from your treasure hunt. According to the rules, the prize doubles when, instead of harvesting the treasure, you use weird substance on it. In that case you reuse the labyrinth, and the treasure jumps to some other place. You can do this 30 times (and should, if you want an efficient farm). Also, when the treasure jumps, some walls of the labyrinth disappear. This disappearance has two consequences. First, the “hand” approaches no longer work. On the other hand, the maze gets simpler, and with an appropriate algorithm your drone finds the treasure faster and faster.
In this topic I want to show the simplest possible DFS code which doesn’t rely on a “strict tree structure” of the maze.
Let’s do it. DIRECTIONS is an array with arguments for the move() function, visited is the set of positions we have visited so far, and OPPOSITE is a dictionary with opposite directions.
def go_best(d1, d2, l1, l2):
    # pick the direction with the shorter wrap-around distance and go
    d, l = d1, l1
    if l2 < l1:
        d, l = d2, l2
    for _ in range(l):
        move(d)

def nav(x2, y2):
    # move the drone to (x2, y2) on the toroidal map
    n = get_world_size()
    x1, y1 = get_pos_x(), get_pos_y()
    go_best(East, West, (x2 - x1) % n, (x1 - x2) % n)
    go_best(North, South, (y2 - y1) % n, (y1 - y2) % n)

def apply_proper_substance():
    # the substance amount scales with the maze unlock level
    substance = get_world_size() * 2**(num_unlocked(Unlocks.Mazes) - 1)
    use_item(Items.Weird_Substance, substance)

# grow a bush and turn it into a maze
set_world_size(8)
nav(3, 3)
plant(Entities.Bush)
apply_proper_substance()

DIRECTIONS = [East, North, South, West]
OPPOSITE = {East: West, West: East, North: South, South: North}

def dfs(visited):
    # depth-first search; backtracking is a move in the opposite direction
    if (get_pos_x(), get_pos_y()) in visited:
        return
    visited.add((get_pos_x(), get_pos_y()))
    if get_entity_type() == Entities.Treasure:
        harvest()
    for dir_to_move in DIRECTIONS:
        if can_move(dir_to_move):
            move(dir_to_move)
            dfs(visited)
            move(OPPOSITE[dir_to_move])

visited = set()
dfs(visited)

while True:
    pass
The code is as simple as possible, and because of this it contains a few obvious issues:
issue: the drone “jitters” forward and backward all the time
reason: the algorithm checks its current position against visited, so the drone actually has to move into a cell before it can check it
how to fix: write a “projection” function that calculates the drone’s position after a prospective step, and check that projected position against visited
issue: this code collects the treasure immediately
reason: I wanted this code to be as simple as possible
how to fix: add logic to break the recursion when the drone is over the treasure, and create an outer loop for the labyrinth upgrade
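For the first fix, the “projection” helper might look like this. It's a sketch: I pass the position and world size explicitly so it can run outside the game, whereas in the game you would read them from get_pos_x(), get_pos_y(), and get_world_size().

```python
# One-step position projection on a toroidal n x n map.
# Direction names are plain strings here; the game uses East, North, etc.
DELTA = {"East": (1, 0), "West": (-1, 0), "North": (0, 1), "South": (0, -1)}

def project(x, y, direction, n):
    dx, dy = DELTA[direction]
    # the map wraps around, so coordinates are taken modulo n
    return (x + dx) % n, (y + dy) % n
```

With such a helper, dfs can test project(...) in visited before moving, and the back-and-forth jitter disappears.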
GBDTE: LogLoss dataset
My approach with the synthetic MSE dataset was successful: I created it, and all the experiments gave me the results I expected. The next frontier is a synthetic dataset for testing the logloss function.
This is important for me because I started with this problem, and I want a clear demonstration, especially on synthetic data, that this approach works and that we can improve the stability of our model by incorporating time.
So I started with the same approach my friend and I used almost ten years ago when we published our article. We created a dataset where everything is binary: a binary target and binary features. That means all regular features used for splits, as well as the target, can be either 0 or 1. The secret ingredient is time: a value in the range from 0 to 1, and a time-dependent basis [1, t]. The goal of the model is to find weights w1 and w2 such that (w1*1 + w2*t) is the best possible score, the one that minimizes logloss on this dataset.
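As a toy illustration of that objective (my own sketch, not the actual GBDTE code): fit w1 and w2 by plain gradient descent on logloss, on labels whose rate drifts with time.

```python
import math
import random

random.seed(0)
# synthetic labels: the probability of 1 grows linearly with time t in [0, 1]
data = [(i / 999, 1 if random.random() < 0.3 + 0.4 * (i / 999) else 0)
        for i in range(1000)]

w1, w2 = 0.0, 0.0
lr = 0.1
for _ in range(500):
    g1 = g2 = 0.0
    for t, y in data:
        p = 1 / (1 + math.exp(-(w1 + w2 * t)))  # sigmoid of the score w1 + w2*t
        g1 += p - y        # d(logloss)/d(w1), summed over the dataset
        g2 += (p - y) * t  # d(logloss)/d(w2)
    w1 -= lr * g1 / len(data)
    w2 -= lr * g2 / len(data)
# since the label rate grows with time, the fitted slope w2 comes out positive
```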
To make things interesting, we introduce time dependence into the dataset’s statistical properties. I decided that it’s convenient to make the lift time‑dependent.
Let’s stop here for a moment and discuss this term. We can calculate the target average over the whole dataset, and then over a group selected using a factor. Lift is the ratio of the average target over the selected subset to the average target over the whole dataset. My idea is to make this lift change over time.
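In code, the definition above is a single ratio. A toy example on made-up (factor, target) pairs:

```python
# Toy dataset of (factor, target) pairs; the numbers are made up.
rows = [(1, 1), (1, 0), (0, 1), (0, 0), (1, 1), (0, 0), (0, 1), (0, 0)]

overall = sum(t for _, t in rows) / len(rows)     # average target over everything
selected = [t for f, t in rows if f == 1]         # rows where the factor fired
lift = (sum(selected) / len(selected)) / overall  # (2/3) / (1/2) = 4/3
```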
There are two groups of static factors: f1..f8 with increasing lift and f9..f16 with decreasing lift. In the original setup the picture was slightly asymmetrical, and the points where the lifts were equal to one for the two groups were separated, but this time I set the dependencies gamma_up = 0.5 + t and gamma_down = 1.5 - t.
I expected to get a picture similar to what we had in our article, but this time I wasn’t as lucky as we were nine years ago. The picture was different. I have no idea why, and now I’m digging deep into the theory of time‑dependent binary datasets.
Bubble sort of one cacti column in TFWR
This game provides quite a nice opportunity to see how different sorting algorithms work. In this post, let's discuss the famous bubble sort.
You can check the code on github.
Let's see how it's built.
best_move is a helper function for nav; nav allows us to move the drone to any given coordinates on the field.
Then, in the script, I set the farm size to 16, just to test moving the drone to the point (6, 6). Till the soil so we can plant cacti. Then plant the cacti. And finally, lines 32-36 are the bubble sort itself.
For me, the interesting part here is to define the variables and their physical sense clearly. I don't want to do unnecessary work, and I want to handle corner cases correctly. So:
* y_upper is the last cell we want to swap cacti with
Let's stop here and think about what we can derive from this statement. It means that y_upper moves from (n-1) down to 1, inclusive. So, a small subtask: set the range for y_upper correctly. It's range(n-1, 0, -1).
After that, everything is simple. The inner loop, which sets the drone's position, just runs from 0 to y_upper - 1, which gives a simple range(y_upper). Basically, that's it. The measure() function from the game is very handy for this task; the comparison with the upper cell is just
measure() > measure(North)
as well as swapping with this cell
swap(North)
One more interesting thing: the map is on a torus, so at the beginning of sorting the drone "rotates" in one direction. But when more than half of the column is sorted, it starts to move back and forth.
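The loop bounds described above can be modeled in plain Python, with a list standing in for the cactus column (the real game code, with nav, measure, and swap, is in the github link):

```python
def bubble_sort_column(column):
    # y_upper runs from n-1 down to 1: the last cell we may still swap with;
    # the inner index runs from 0 to y_upper - 1.
    n = len(column)
    for y_upper in range(n - 1, 0, -1):
        for y in range(y_upper):
            # measure() > measure(North): compare cell y with the cell above it
            if column[y] > column[y + 1]:
                # swap(North): exchange the two cacti
                column[y], column[y + 1] = column[y + 1], column[y]
    return column
```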
TFWR. Spawn drones.
In "The Farmer Was Replaced" there is quite an interesting option: one can spawn drones to operate the farm more efficiently. This option is advanced because of language tricks and game limitations. Nevertheless, the problems and opportunities are quite similar to those that "adult" programmers face with multithreading and multiprocessing. Therefore, it's quite an interesting topic to master.
In this post, I just want to show the very basic use of this technology. I drew a small smiley on a piece of paper and put the coordinates of the pixels of this simple picture into a list of tuples. Then I ask one drone to visit these places one by one and call the spawn_drone() function at each place. This spawn_drone function takes a function as a parameter (hello, metaprogramming) and executes it in the new drone. This time the function is very simple: wait_forever. It does nothing, but the infinite loop inside keeps the drone alive. When the function returns, the corresponding drone ceases to exist.
All in all, you can now see a smiling face made of drones. Happy hacking!
Code on github
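The pattern from the post, sketched in the game's dialect (the pixel coordinates below are made up, not the ones from my drawing; nav is the navigation helper from an earlier post):

```python
SMILEY = [(2, 5), (5, 5), (1, 3), (2, 2), (3, 2), (4, 2), (5, 3)]

def wait_forever():
    # the spawned drone lives exactly as long as its function keeps running
    while True:
        pass

for x, y in SMILEY:
    nav(x, y)                  # move the parent drone to the pixel
    spawn_drone(wait_forever)  # hand the function itself to the new drone
```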
GBDTE. Culprit is found.
I started to dig really deep and, with the help of the iron friend, I checked several hypotheses. I found the optimal expression for the ideal model on this dataset, and this result is worth a separate publication. It was interesting to find out that my intuition about the linear model and linear lift was totally correct: on the dataset I constructed, the score of the optimal model depends linearly on time.
Today I decided to compare all these theoretical results with one step of my model; you can see the picture at the beginning of the post. This time the problem is in the vibecoding: the model decided it was a good idea to put all the features into an extra part of the dataset. What to do next is obvious: separate them and test again. Stay tuned.
Ultraviolet lamp
In my other channel I published a post about a vanished place where I worked in 1991-1992.
Alina_Yerevan_frontend
Fizbar. GOT. Central Pavilion. 1990-1992
In previous posts I started telling the story of the Town of Discoveries and Creativity for children and youth, which operated in the Central Pavilion of VDNKh in the early "wild nineties".
To my great surprise, from this…
LLM in education
Sorry for this rather unoriginal topic, but it bothers me. I'm teaching my son programming. So far we have approached it in a classical way: variables, loops, algorithms. We spend hours discussing how to build stacks, queues, and heaps over arrays. We wrote quicksort and merge sort in Python, and now in C++.
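For context, the "nuts and bolts" we build look roughly like this minimal stack-over-a-list sketch (my illustration, not our actual lesson code):

```python
# A stack built over a plain Python list - the kind of exercise we do in lessons.
class Stack:
    def __init__(self):
        self._data = []          # the underlying "array"

    def push(self, x):
        self._data.append(x)     # grow at the end: amortized O(1)

    def pop(self):
        if not self._data:
            raise IndexError("pop from empty stack")
        return self._data.pop()  # shrink from the end: O(1)

    def __len__(self):
        return len(self._data)

s = Stack()
s.push(1)
s.push(2)
```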
Recently it has become almost impossible to avoid "vibe coding": Visual Studio Code pastes in something like ten lines of code the moment you define a function drawSun, and these are exactly the lines you wanted to discuss and carefully build word by word.
I'm trying to adapt. We accept these lines, discuss them, and I'm trying to urge my son to learn the ideas behind them.
But I'm unsure. Maybe it's better to become an LLM Luddite for the duration of the learning process? Or to separate "vibe coding" lessons from the usual ones: one time we try to build an application as fast as possible, and another time we train to write all these nuts and bolts from scratch?
Please, feel free to share your thoughts on my case in comments. Opinions and best practices are totally appreciated.
There is no "English only" policy here. I write posts in English purely to practice my writing skills, and auto-translate is turned on, so feel free to comment in English, Russian, or whatever works best for you.
Topics request
Now I have a list of topics for the upcoming posts.
* video for the first vibe coding project
* why binary features extrapolate badly (an explanation of why quality drops when interpolating features are included among the extra variables)
* import in TFWR
* relation of ROC and trees
* for in python and in c-group languages
* superformula
* [(0,0), (1,0), (2,0), (3,0), (9,0)]
* list of interesting Disney science works
* 3D model for Torricelli jets
* jets intersection problem for the viscous speed constraint
* left-hand maze traversal + Trémaux's algorithm
* temperature drop due to the salt in snow
* pay for google AI services
* viscosity-limited velocities for jets
* TFWR 300 reuses - random walk
* top works in Computer Science in last 30 years
* insertion sort with multidrones
* TFWR - companion 3x3 resource farming
* operator % - properties, tasks, problems
* operator ^ - properties, tasks, problems
* the circle of fifths in music
* transposition of melody using %
* recall that weird algorithm of sorting with decks
* pumpkins with queue
* 4 definitions of binomial coefficient
* Remainder by 2 for binomial coefficient
* SHAP, boosting features
* bridge from Logistic optimization to optimal Bayesian model
* compare logistic regression with optimal Bayesian model on Titanic
* derive neutral value for absent factor
* continue to work with Bayesian models on Titanic
* Prony's method
* Binomial coefficients as a crossroad
* two unpaired numbers in array with pairs
* fast iterator/slow iterator
* calculate a polynomial's value with data in an iterator
Feel free to post whatever bothers you in the comments. I'll consider your topics for future posts. Let's build an interesting place together. (Russian or English - either way works.)
English:
pole
Martin: break your fast
morphological intuition flails
The First Vibe Coding Project
In the chat about TFWR, one of the subscribers asked how to learn Python and HTML. I answered roughly: "Feel free to vibe-code a small application that uses Python, HTML, and CSS, and then use an LLM to talk through this application." One thing led to another, and I promised to show how to create such applications. And here we are.
Let's assume that we have a subscription to one of the contemporary LLM systems. I'll use Codex from OpenAI, but you can do the same with Claude or one of Google's models. They are approximately the same for simple tasks like this one.
Then I want to create a repository on GitHub. Basically, it's just the "Create a new repository" button. I like to have a README file generated right away, so don't forget to tick that option. Then clone the repository with the git clone command. All these steps are optional for you but mandatory for me, because I want to put a link to this project at the end of this post.
The main prerequisite for this tutorial is that you have an OpenAI subscription. Let's start our practice.
I can never memorise the command, so I asked Google "how to install Codex CLI" and got npm install -g @openai/codex. I ran it, launched codex in the clone of my repository, and... Attention! This is the most interesting part. I asked it:
Create an app where I can manage a simple todo list. This is a learning project; I mainly want to understand how to write programs in Python, CSS, and HTML. Use Python for the backend, and HTML, CSS, and JS for the frontend. The app must let me enter a new task for the list, mark tasks as completed, and delete a task from the list. The user interface should include an input field, an add button, and a delete button. Each task in the list should be markable as done.
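For reference, the core of what such a prompt produces usually boils down to logic like this minimal in-memory sketch (my illustration; the actual generated app wraps it in a web server and HTML):

```python
# Minimal in-memory todo logic (illustrative; the real app adds a backend and UI on top).
tasks = []  # each task is {"text": str, "done": bool}

def add_task(text):
    tasks.append({"text": text, "done": False})

def mark_done(index):
    tasks[index]["done"] = True   # the "mark as completed" action

def delete_task(index):
    tasks.pop(index)              # the "delete" button

add_task("write post")
add_task("walk the dog")
mark_done(0)
delete_task(1)
```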
And basically, that is it. You can check the result in the GitHub repo. A screenshot of the result is at the beginning of this post.
You can check both en and ru versions of my prompt.
TFWR. Drone coordination.
Last time we discussed how to create drones. Today let's discuss how to coordinate their activity.
In the game there is quite a limited number of tricks that allow you to coordinate drones. Let's use the approach with "the central coordinator drone" and use the number of drones as a synchronization tool.
I think one of the simplest tasks is planting sunflowers, so let's do it. The pool of tasks is the pool of columns. I want to keep the main drone we see on screen alive and ask it to spread the tasks over the other drones. Let's discuss the nuts and bolts of this approach.
Let's check the spawn_drone function. It takes as a parameter a function which operates the new drone. I have 32 drones and 32 columns, so we need a way to produce these parameterized drone functions. Let's do it with the "factory function" approach: a function that takes a column number as a parameter and returns a properly parameterized drone function.
def f(column):
    def g():
        one_drone_plant_job(column)
    return g
A simple function to plant all sunflowers in a given column:
def one_drone_plant_job(column):
    n = get_world_size()
    for y in range(n):
        nav(column, y)
        if can_harvest():
            harvest()
        if get_ground_type() == Grounds.GrassLand:
            till()
        plant(Entities.Sunflower)
And, last but not least, the code for the manager. First, let's create a list of tasks:
tasks = []
for t in range(get_world_size()):
    tasks.append(t)
Then dispatch them between the drones:
while True:
    tasks = []
    for t in range(get_world_size()):
        tasks.append(t)
    while tasks:
        t = tasks.pop()
        while num_drones() == max_drones():
            pass  # busy-wait until a drone finishes and frees a slot
        spawn_drone(f(t))
This "pass" loop is an "interprocess synchronization" tool that uses the number of drones as a synchronization object. When the "while tasks" loop finishes, the job is done - the field is full of sunflowers.
Code on github
Programming flood
To discuss programming tasks in comments