Algorithms. Physics. Mathematics. Machine Learning.

Friday shitposting. 1 May

Thirty-five years ago, in April 1991, I started this project.

The idea was quite simple: use the DRAW operator in QBASIC to draw a stylised “1 MAY” banner. The DRAW operator offered only a small palette of graphical possibilities, and my drawing skills were far outmatched by my ambitions — so the project stalled.

Now I think that actually completing a task is often more important than doing it flawlessly. So here we are: the First of May greeting is finally ready.

A few words about the DRAW operator language. It uses a state-machine-style drawing approach with a traced cursor position. Initially, the cursor is in the centre of the screen. Then there are movement commands: R, U, L, and D for right, up, left, and down, plus E, F, G, and H for the 45-degree directions.

For example, U10 moves the cursor 10 points up and draws a segment. BU10 moves the cursor up without drawing. P11,15 paints the current area, flooding it with colour 11 until it reaches areas coloured 15. There are a few more details in this tiny language, but you get the idea.

For me, 35 years ago, it was quite a revelation that there could be a programming language inside another programming language. Nowadays, DSLs — domain-specific languages — are a powerful tool that every professional engineer should have in their tool belt.

It is also worth mentioning that the languages used by 3D printers and similar machines are quite close in spirit. It would be totally possible to convert this program into a 3D-printer version and print out the title.

Maybe one day...

Feel free to check out my project online - It runs in QBJS, a browser-based QBASIC environment implemented in JavaScript.

P.S. Almost an LLM-free post. The program is hand-written, and the screenshot is honest. Only the text of the post was polished.

🔥1

437 viewsedited 12:56

Pi !

While I'm preparing a post on random nature of ROC, let's celebrate this point. I was wondering which next number would trigger me, tried to check perfect squares, cubes... Nothing clicked. But today I saw 314 and it was like meeting an old friend.

It's totally irrational. And I'm not talking about Pi. Though it is totally irrational. Now I'm talking about my feelings. I don't know why, but this 314 always catches my attention. In school I managed to recite the first 23 decimal digits.

One of my schoolmates once stunned me. He totally believed that Pi = 3.14 15 16 17 ...

The collection of facts and warm memories seems to be endless. Donald Knuth decided to version his famous TeX program by revealing more and more digits of Pi, because that's how path to the perfection works. There is a serious biological article where biologists discovered a stunning property of anthill have circumference "approximately three times" larger then the diameter of the heap.

👍1

391 views11:20

Algorithms. Physics. Mathematics. Machine Learning.

Friday shitposting. Old Moscow photos.

It seems that I’m stuck in a deep perfectionist loop, endlessly polishing my post about stochastic wandering. Meanwhile, let me share some trash photos I took about 20 years ago.

😁1

321 views12:42

Algorithms. Physics. Mathematics. Machine Learning.

Stochastic AUROC

« prev | content | next »

Five previous posts were a preparation for this one. We prepared age model which uses only age factor. The model gives us the Score with low cardinal number because there are only three values: -1, 0, 1. And we plotted ROC and calculated AUROC.

The question which has bugged me for eleven years, since the moment I heard the ROC and AUROC definitions in the School of Data Analysis, is: "How come there are only three points on ROC, when, by definition, there should be at least 891 points?" The magic number 891 is the number of passengers in the Titanic dataset, and three is the number of different model answers.

Let's start thinking from first principles. What is our ROC approach? To assign a score, to make a rank. But if we have only -1, 0, 1 values, the rank looks more like three unorganized crowds. The nature of these crowds shows itself when we are applying our algorithm: "move from the highest values of scores toward lower". When all scores are different, we have a strictly determined order. When a group of passengers has the same score, we can only pick randomly from this group. The target value for each passenger guides our path over the TPR-FPR plane. So, now we have random wandering.

Let's check the picture. Blue line - we were EXTREMELY lucky and picked passengers from the crowd in the right order: survived first. Orange - opposite extreme, deceased first. The lines close to the diagonal of the rectangle are random trajectories.

There are two points where all lines intersect. Their coordinates are determined by the number of survived and deceased passengers in each group. The number of survived passengers is the height of the rectangle, and the number of deceased passengers is the width.

Phew. Here we are. I hope it is clear now that, when we have a small number of different values from the model, we don't have enough information to rank objects - passengers - deterministically. The sequence which guides ROC becomes stochastic, and ROC becomes a random walk trajectory.

I think it is important because:

👉 This approach closes a serious gap in understanding ROC properties for a wide class of applications. For example, we have a binary feature and a binary target. What is ROC in this case? Now we know.

👉 The cardinality of the model's output is a property which is frequently ignored. But it can be either accidentally blown up by noisy calculations, or reduced by aggressive quantization. We have to be aware of these effects.

291 viewsedited 13:14

Algorithms. Physics. Mathematics. Machine Learning.

Hardcore. Stochastic wandering.

« prev | content

My own investigation

When I started all this commotion, my intuition was as simple as:
👉 tie in score → stochastic wandering on rectangular grid → 0s and 1s for up/right moves → binomial counting → we can calculate everything

Indeed, if we want to make p steps up and n steps right, we have C(p+n, p) routes - school program. To get distribution over AUC you have to write a function which calculates it from "001101" sequence. Suddenly Young (or Ferrers) diagrams appeared. Bars under the curve increase their height, therefore each zero-one sequence encodes a correct Young diagram. Height of column is number of ones to the left of our current position, we are to sum it up when hit zero. So, AUC numerator is sum of #(all ones to the left of each zero) (column based count) or sum of #(all zeroes to the right of each one) (row based count).

If you meditate long enough, you can understand that AUC numerator is exactly number of correctly ordered pairs (1 on the left side of 0). AUC itself is this number divided by p*n.

Here my mathematical skills left me alone and I dove into the topic with ChatGPT.

Vibe science

ChatGPT gave me closed-form expression for AUC numerator variance so quickly, that I decided that this result is trivial. Big mistake. I found myself in the middle of complex and beautiful mathematics. Vast collection of names and facts started to bury me. So I picked up the most bright and interesting one. Let the show start!

👉 There is a mathematical object which addresses exactly the question we stated: how many paths there are with n negative samples, p positive samples and AUC numerator = q.
👉 Its name - Gaussian (q-binomial) coefficient
👉 Math expression is on the picture
👉 Gaussian coefficient is a polynomial and numbers of paths are coefficients of this polynomial
👉 It's a q-analog of binomial coefficient: when the polynomial variable is set to 1, gaussian coefficient becomes ordinary binomial coefficient
👉 Distribution is unimodal, with maximum around p*n/2
👉 Center of distribution: q = p*n/2
👉 Variance of q: Var(q) = p*n*(p+n+1)/12
👉 Variance of AUC: Var(AUC) = (p+n+1)/(12*p*n)
👉 Mode of distribution corresponds to the central coefficient(s) of the Gaussian coefficient. Sometimes it is one coefficient, sometimes there is a plateau.

Practical implications

👹 There are some packages which can calculate confidence intervals for ROC AUC, they are not popular
👹 The approach can be useful for feature selection and experiments with low-cardinality-output models

That's almost it. I think it would be useful to collect all references I found in one more post.

300 viewsedited 18:59

Algorithms. Physics. Mathematics. Machine Learning.

Friday shitposting.

My own photos. The first one demonstrates that it is not always a good idea to show users a randomly generated sequence of letters. The second... Basically, I have no clue. Probably a secret training site for ninjas?

❤4😁2

270 views11:28

Algorithms. Physics. Mathematics. Machine Learning.

What should the next post be?

Anonymous Poll

29%

List of references for stochastic wandering: wiki, articles, schoolbooks

29%

3D-printed model which proves the formula for the sum of the first x squares

58%

My humble introduction of myself: creepy self-portrait plus creepy description of my (creepy?)person

31 voters145 views06:57

Algorithms. Physics. Mathematics. Machine Learning.

Self-portrait

To generate such a magnificent picture, say: "Redraw my photo in MS Paint style, with some likeness, but also off in a confusing way..." I'll try to write about myself under a similar prompt.

The beginning
Trushin Arseniy. Born in the USSR. Rotary phone, 3 Soviet kopecks per bun, 5 per metro ride, 48 for a big chunk of ice cream. Favorite reading - Soviet encyclopedia for youngsters, sister's schoolbooks. Tons of DIY stuff, small "Quantum" magazines. Father - radio amateur. Diodes, LEDs, batteries and lamps - favorite toys.

Science
Ordinary school, special class in the same school, phys-math brilliant macabre: school 54 between Frunzenskaya and Sportivnaya. First own computer - Soviet, 1992. Sura PK 8500. Now I wouldn't dare solve the problems I solved on that computer. Like prove that (1+e) creates a ring if e is a 17th root of 1. I vaguely recall this task, but I can recall that I solved it numerically, and the key point was a page from a student notebook, covered with digits and letters.

PhD in laser physics, 14 articles. Then - change of course. First interview - as a C++ developer. Then: Cadence, Yandex, Equifax, HeadHunter, App in the Air, Yandex again.

The best feedback on my educational activities: "Arseniy Sergeevich, when you explained physics to us, everything was unclear, but very interesting. With our current teacher it is unclear as well, but not interesting anymore."

9 students: MSU, MIPT, RAO. +1 article on ML.

MISC
English - my curse. Always B in school. This blog helps me maintain my current level. I write it from my head, then polish with ChatGPT.

This blog. You know, just working kills me. Even when I implemented my own idea, it was a fascinating year, but it was hard to push it 8x5. For me the panacea is to have something I'm thinking about just for fun. Because we can. This channel is a spaghetti of such thoughts. You can see that there are threads: about TFWR, GBDTE, agents... Some are moving, some are on pause.

Pelevin? Can't say that I'm a fan, but I read everything I have heard about. Of course, Strugatskie. "Monday...", "The Tale...", "A Billion Years..." - my moral DNA. "M&M" by Bulgakov. As the Master put it in "Nine Princes in Amber":

Goodbye and hello, as always.

👍5❤1

94 viewsedited 15:47

Algorithms. Physics. Mathematics. Machine Learning.

Algorithms. Physics. Mathematics. Machine Learning. pinned a photo

15:54

About

Blog

Apps

Platform