Algorithms. Physics. Mathematics. Machine Learning.

Models. EML in ML


Three weeks ago, hype erupted around the EML function. Three articles appeared almost at once. The first stated that all we need is a "two-button calculator": one button for "1" and one for "EML", and that is enough to express everything we need to describe the world around us. Here EML(x, y) = exp(x) - ln(y). The second applies this approach to modelling a battery charge cycle; the third mixes traditional neural-network elements with a new gate.

I have mixed feelings here. On the one hand, intuition says that to be useful, these trees have to be really large. Like, 10 or 20 nodes. Too much for brute force. On the other hand, the usual optimization methods don't work out of the box. Backpropagation on tree structures - no. Gradient descent over a correct bracket sequence - no. To be honest, even conversion between EML notation and the usual symbolic notation is a pain in the a... neck.

Then I had a thought. An EML expression is an expression like eml(eml(1, x), 1). First of all, we can see that every opening bracket is preceded by "eml", so we can drop the name and the commas and keep just "(". This way it becomes "((1x)1)". Looks interesting. In the one-variable case, we have only four token types to encode all EML expressions: (, ), 1, and x. Would it be beneficial for transformers to have only four tokens to predict? Interesting thought, but not for this post. Maybe later...
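
A minimal sketch of that rewriting, just to make the four-token idea concrete. The function names `to_compact` and `tokens` are made up for illustration, and the sketch assumes the one-variable case, where every leaf is a single character.

```python
def to_compact(expr: str) -> str:
    """Rewrite 'eml(a, b)' as '(ab)': drop the 'eml' name, commas and spaces."""
    return expr.replace("eml(", "(").replace(",", "").replace(" ", "")


def tokens(compact: str) -> list[str]:
    """Split a compact one-variable expression into single-character tokens."""
    return list(compact)


compact = to_compact("eml(eml(1, x), 1)")
print(compact)               # ((1x)1)
print(set(tokens(compact)))  # a subset of {'(', ')', '1', 'x'}
```

With several named variables, the leaves need a separator, which is why the longer expressions in this post are written with spaces.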

But now we are ready to understand the cryptic expression from the previous post. ((1 x) ((1 1) x)) is a function, encoded by EML notation. You can decipher it recursively. (1 1) is (exp(1) - ln(1)) = e. Then ((1 1) x) = (e x) = exp(e) - ln(x) = e**e - ln(x). And so on.
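
The recursive deciphering can be sketched in a few lines of Python. This is only an illustration, not the code from the repository: `parse` and `evaluate` are hypothetical names, and the sketch assumes leaves are separated by spaces, as in the expression above.

```python
import math


def parse(text):
    """Parse space-separated EML notation into nested 2-tuples; leaves stay strings."""
    toks = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        tok = toks[pos]
        pos += 1
        if tok == ")":
            raise ValueError("unexpected ')'")
        if tok != "(":
            return tok  # a leaf: "1" or a variable name
        left = node()
        right = node()
        if toks[pos] != ")":
            raise ValueError("expected ')'")
        pos += 1
        return (left, right)

    tree = node()
    if pos != len(toks):
        raise ValueError("trailing tokens")
    return tree


def evaluate(tree, env):
    """Apply (a b) = exp(a) - ln(b) recursively; env maps variable names to numbers."""
    if isinstance(tree, tuple):
        left, right = tree
        return math.exp(evaluate(left, env)) - math.log(evaluate(right, env))
    return 1.0 if tree == "1" else env[tree]


print(evaluate(parse("(1 1)"), {}))                        # e
print(evaluate(parse("((1 x) ((1 1) x))"), {"x": 2.0}))
```

Note that `math.log` raises `ValueError` for non-positive arguments, so not every tree is defined for every input.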

I conducted an experiment. I took the Titanic dataset and normalized all features to be in the range [0, 1]. Then, for pairs of features, I selected the 3-level EML trees with the best ROC AUC. On top of these candidates, I ran a genetic algorithm to find the best combination by ROC AUC and logloss. The result is an EML tree which solves the Titanic dataset almost as well as the specialized CatBoost package does.
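
To give a feel for the first step only, here is a rough sketch of an exhaustive search over small trees. It is not the code from the repository. Everything in it is an assumption made for illustration: the names `trees`, `auc` and `best_tree`, reading "3-level" as tree height, and scoring by a naive pairwise AUC. The genetic-algorithm stage is not shown.

```python
import math
from itertools import product


def trees(leaves, depth):
    """All EML trees of height <= depth, as nested 2-tuples over the given leaves."""
    if depth == 0:
        return list(leaves)
    smaller = trees(leaves, depth - 1)
    return list(leaves) + [(left, right) for left, right in product(smaller, repeat=2)]


def evaluate(tree, row):
    """(a b) = exp(a) - ln(b); row maps feature names to numbers."""
    if isinstance(tree, tuple):
        return math.exp(evaluate(tree[0], row)) - math.log(evaluate(tree[1], row))
    return 1.0 if tree == "1" else row[tree]


def auc(scores, labels):
    """Share of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_tree(rows, labels, leaves, depth):
    """Brute force: score every tree, skipping those undefined on the data."""
    best_score, best = -1.0, None
    for tree in trees(leaves, depth):
        try:
            scores = [evaluate(tree, row) for row in rows]
        except (ValueError, OverflowError):
            continue  # ln of a non-positive number, or exp overflow
        score = auc(scores, labels)
        if score > best_score:
            best_score, best = score, tree
    return best_score, best
```

With three leaves (1 and two features) there are 12 trees of height at most 1, 147 of height at most 2 and 21,612 of height at most 3, which shows how quickly brute force runs out of steam.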

This exercise is just for fun, because this algorithm is extremely computationally expensive and doesn't scale at all.

But I totally think that it was worth it. As a result, we have something as precious as a talking frog: an EML tree which solves the Titanic dataset, and the understanding that notation like (1(x (x (1 1)))) enumerates mathematical expressions and, therefore, ML models.

👉👉Know someone who likes math, ML, and weird experiments? Share this post with them!👈👈

To check:

🛸 EML tree for Titanic
(((ps_log ((sa_log sf_auc) ((ps_prob sf_prob) sx_fs_sx))) (sf_prob sa_log)) ((sf_prob ((ps_log (ps_log pc_ag_mx)) (ps_prob (pc_ag_mx pc_fr_mx)))) (sf_prob ((sf_prob (sa_log sx_fs_sx)) sx_fs_sx))))

🤿 GitHub - code of EML solving the Titanic


SCI BOT

Have you ever wanted to chat with an LLM that has access to a vast body of scientific knowledge? Now you can.

The famous archive that gave free access to tons of scientific papers now lets you chat with them via a bot.


Cyberpunk we deserved

👉 There was a problem during the RL stage of ChatGPT 5.5, and now the system prompt has to include an instruction suppressing talk about goblins and raccoons.
👉 There is a Factorio mod that turns the game into bureaucracy. Biters bring you complaints, and you have to process them.


AUROC clearly explained


The worst thing you can do is start explaining ML metrics to stakeholders. I did it a lot. Don’t ever do it. Seriously. But here, in our cozy space with physics, mathematics and ML, we can afford the luxury of discussing how to measure the quality of a model, as promised some time ago.

Our setup:

👉 target — the thing we are predicting: survival — binary — 0/1, false/true
👉 model — any function you can imagine; takes all features, returns the Score
👉 Score — a number; small ~ false (0) target, big ~ true (1) target

In our informal language, we calculate the score for each passenger using the chain: passenger → features → model → Score. The question is: does this Score work or not? How can we tell?

Let’s cheat. For a moment, let’s use survived as the Score. Then ask the passengers to line up in order of increasing score. Obviously, all the survivors would be at the right end of this line, and the deceased at the left end. There would be a position in this line where we can split it into two crowds. In the left crowd, target = 0 for each passenger, and in the right one, target = 1.

This gives us an idea. We ask our passengers to line up in order of increasing score. Then we move from right to left and count the survivors and the deceased. Just counting is not enough: at the end of the day, we would get only the number of passengers in each of the two classes. We need to record the dynamics of these two counts. But how?

A totally genius idea is to use these two counters as coordinates on the plane. A survivor means a step up, and a deceased passenger means a step to the right.

If we cheated, we move all the way up, then all the way to the right. If the passengers were randomly shuffled, we stay in a narrow band near the diagonal of the rectangle.

Last step — scale the axes to transform our bounding rectangle into a square. Here we are. The coordinates are TPR and FPR, our trajectory is ROC, and the area under it is AUROC — Area Under the Receiver Operating Characteristic Curve.
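
The whole walk fits in a few lines. Below is a minimal sketch, assuming labels are 0/1 and both classes are present. The names `roc_walk` and `area_under` are made up for illustration, and passengers with equal scores are simply visited in list order, which a production implementation would treat more carefully.

```python
def roc_walk(scores, labels):
    """Walk from the highest score to the lowest; return (FPR, TPR) points."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(labels)            # survivors: total height of the rectangle
    n_neg = len(labels) - n_pos    # deceased: total width of the rectangle
    up = right = 0
    points = [(0.0, 0.0)]
    for i in order:
        if labels[i] == 1:
            up += 1                # a survivor: step up
        else:
            right += 1             # a deceased passenger: step to the right
        points.append((right / n_neg, up / n_pos))  # scale to the unit square
    return points


def area_under(points):
    """Area under the trajectory by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


points = roc_walk([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
print(points)
print(area_under(points))  # 0.75
```

In the cheating case the trajectory goes straight up and then straight right, so the area is 1; a fully reversed ranking gives 0.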

I'm sure you have heard all this stuff before. But I need a solid base for the madness which follows.


Friday shitposting. 1 May

Thirty-five years ago, in April 1991, I started this project.

The idea was quite simple: use the DRAW operator in QBASIC to draw a stylised “1 MAY” banner. The DRAW operator offered only a small palette of graphical possibilities, and my drawing skills were far outmatched by my ambitions — so the project stalled.

Now I think that actually completing a task is often more important than doing it flawlessly. So here we are: the First of May greeting is finally ready.

A few words about the DRAW operator language. It uses a state-machine-style drawing approach with a traced cursor position. Initially, the cursor is in the centre of the screen. Then there are movement commands: R, U, L, and D for right, up, left, and down, plus E, F, G, and H for the 45-degree directions.

For example, U10 moves the cursor 10 points up and draws a segment. BU10 moves the cursor up without drawing. P11,15 paints the current area, flooding it with colour 11 until it reaches areas coloured 15. There are a few more details in this tiny language, but you get the idea.
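
To show how little machinery such a language needs, here is a toy tracer for the movement commands, written in Python rather than QBASIC. It is a sketch under assumptions: the name `trace` is made up, the start point (160, 100) assumes a 320x200 screen, E, F, G and H are taken as up-right, down-right, down-left and up-left, and everything except the eight movement letters and the B prefix (colours, P, scaling, rotation) is silently ignored.

```python
import re

# Screen coordinates: x grows to the right, y grows downwards.
STEPS = {
    "U": (0, -1), "D": (0, 1), "L": (-1, 0), "R": (1, 0),
    "E": (1, -1), "F": (1, 1), "G": (-1, 1), "H": (-1, -1),
}
COMMAND = re.compile(r"(B?)([UDLREFGH])(\d*)")


def trace(program, start=(160, 100)):
    """Return the drawn segments and the final cursor position."""
    x, y = start
    segments = []
    for blank, letter, digits in COMMAND.findall(program.upper()):
        distance = int(digits) if digits else 1  # a bare letter moves one unit
        dx, dy = STEPS[letter]
        nx, ny = x + dx * distance, y + dy * distance
        if not blank:                            # the B prefix moves without drawing
            segments.append(((x, y), (nx, ny)))
        x, y = nx, ny
    return segments, (x, y)


segments, cursor = trace("U10 BU10 R5")
print(segments)  # [((160, 100), (160, 90)), ((160, 80), (165, 80))]
print(cursor)    # (165, 80)
```

The state is just the cursor position, which is what makes it a state machine: each command reads the current position and replaces it.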

For me, 35 years ago, it was quite a revelation that there could be a programming language inside another programming language. Nowadays, DSLs — domain-specific languages — are a powerful tool that every professional engineer should have in their tool belt.

It is also worth mentioning that the languages used by 3D printers and similar machines are quite close in spirit. It would be totally possible to convert this program into a 3D-printer version and print out the title.

Maybe one day...

Feel free to check out my project online. It runs in QBJS, a browser-based QBASIC environment implemented in JavaScript.

P.S. Almost an LLM-free post. The program is hand-written, and the screenshot is honest. Only the text of the post was polished.


Pi !

While I'm preparing a post on the random nature of ROC, let's celebrate this point. I was wondering which number would trigger me next, and tried perfect squares, cubes... Nothing clicked. But today I saw 314 and it was like meeting an old friend.

It's totally irrational. And I'm not talking about Pi. Though it is totally irrational. Now I'm talking about my feelings. I don't know why, but this 314 always catches my attention. In school I managed to recite the first 23 decimal digits.

One of my schoolmates once stunned me. He totally believed that Pi = 3.14 15 16 17 ...

The collection of facts and warm memories seems to be endless. Donald Knuth decided to version his famous TeX program by revealing more and more digits of Pi, because that's how the path to perfection works. There is a serious biology article whose authors discovered a stunning property of anthills: the circumference is "approximately three times" larger than the diameter of the heap.


Friday shitposting. Old Moscow photos.

It seems that I’m stuck in a deep perfectionist loop, endlessly polishing my post about stochastic wandering. Meanwhile, let me share some trash photos I took about 20 years ago.
