Not boring, and a bit of a condescending prick

A random idea, but it's about time for me to emit it into the wild.

Project (startup?) idea: An instrumented UI-first testing tool based on computer vision.

Effectively, Selenium but for everything non-browser.

Cucumber-style test descriptions, plus a set of pre-trained models that understand commands such as:

‣ Press the red "OK" button in the lower right corner.
‣ Scroll the top-level pane down until "Solution" appears.
‣ Wait until "Build succeeded" or "Build failed" has appeared on the screen.
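The third command above is essentially a polling loop over whatever the vision model reads off the screen. A minimal sketch of it follows, assuming a hypothetical `read_screen_text` callable (OCR, a pre-trained model, anything that returns the visible text); the function name and its parameters are made up for illustration and are not part of any existing tool.

```python
import time
from typing import Callable, Iterable, Optional


def wait_until_any_text(
    read_screen_text: Callable[[], str],  # hypothetical: returns the text currently visible
    targets: Iterable[str],
    timeout_s: float = 60.0,
    poll_s: float = 0.5,
) -> Optional[str]:
    """Poll the screen until one of `targets` shows up; return it, or None on timeout."""
    targets = list(targets)
    deadline = time.monotonic() + timeout_s
    while True:
        text = read_screen_text()
        for target in targets:
            if target in text:
                return target
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_s)


# Usage with a fake "screen" that changes between polls.
frames = iter(["Building...", "Linking...", "Build succeeded in 12 s"])
outcome = wait_until_any_text(
    lambda: next(frames), ["Build succeeded", "Build failed"], timeout_s=5, poll_s=0
)
print(outcome)
```

Returning the matched string lets the test step branch on which of the two messages appeared; the elapsed time of this call is also what a load-time test would measure.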

Intended reference use case: DevEx teams testing the company-wide IDE setup, so that a new version of some PyCharm / VS Code plugin does not break some important flow.

"Page" load time testing: see how long an app, say, the IDE, takes to load until it is fully functional. I.e. until it can open the project, open a source file in it, and have code completion work with a trivial test case.

It also works well for performance testing. For example, to test the responsiveness of a team messenger app, open the same chat window (or a shared doc) from different VPNs around the globe, and have each agent type the next character only after the one added by the "previous step" agent has made it there.

Would be quite big these days, huh?

398 views00:17

Just clicked "Publish" on two posts that were in the works for well over a month. Behold:

• Distributed Stateful Workflows, and
• Stateful Orchestration Engines.

Hope you enjoy reading them as much as I enjoyed writing them. And spread the word!

Dima Korolev

Distributed Stateful Workflows

The theory and the concepts behind modern-day distributed software.

409 viewsedited 11:42

Today I learned that for quite a while I was using the terms "data center" (DC) and "availability zone" (AZ) wrong!

My (incorrect!) understanding sincerely was that there may well be more than one AZ in a single DC.

Heck, I would buy it that there could be more than one AZ in a single building of a DC, or even on a single floor of this building.

Because, in my book, if the reserve power supply for this building/floor is separate from the rest, and so is the building-level/floor-level network switch, this truly is its own, well, _availability_ zone, right?

Moreover, I sincerely believed that even within AWS it's perfectly feasible for the DevOps team to want to configure, say, a three-node ZooKeeper cluster such that each of these three nodes lives in its own "AZ", well within the same data center.

Because it just makes sense, doesn't it? If anything, I'd want these three nodes:

1) To be next to each other, for low latency between them, and yet
2) Not in the same "risk group", for their failures to be uncorrelated.

By uncorrelated failures I mean that in the case of top-level switch failure, or a local power outage, I would like my setup to be far more likely to lose just one of these three machines, compared to losing two, or all three at once.

Well, apparently, intuitive as it sounds, the above explanation is demonstrably wrong, at least in the case of Amazon Web Services. In AWS the hierarchy, traversed top to bottom, is: regions, then availability zones, and finally data centers. An AZ consists of one or more data centers, not the other way around.
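To make the corrected hierarchy concrete, here is a small sketch of it as data, plus a round-robin placement that puts each node of a three-node cluster into its own AZ. The class names, the `place_nodes` helper, and the data center labels are invented for illustration; this is not an AWS API.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Dict, List


@dataclass
class AvailabilityZone:
    name: str
    data_centers: List[str]  # an AZ is made of one or more data centers


@dataclass
class Region:
    name: str
    zones: List[AvailabilityZone]


def place_nodes(region: Region, nodes: List[str]) -> Dict[str, str]:
    """Spread nodes across the region's AZs round-robin, so failures are less correlated."""
    return {node: zone.name for node, zone in zip(nodes, cycle(region.zones))}


# Hypothetical region layout; the data center labels are placeholders.
region = Region(
    name="eu-west-1",
    zones=[
        AvailabilityZone("eu-west-1a", ["dc-1", "dc-2"]),
        AvailabilityZone("eu-west-1b", ["dc-3"]),
        AvailabilityZone("eu-west-1c", ["dc-4", "dc-5"]),
    ],
)
placement = place_nodes(region, ["zk-1", "zk-2", "zk-3"])
print(placement)
```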

328 views14:35

My Monday begins (or, rather, continues from Sunday) with PyPI not accepting a package.

How difficult is it to have a trivial tutorial of one-liners with no external dependencies? The last time I pushed my code to a registry (it was npm back then), the whole thing took just a few minutes.

Worst thing is, even when it works (and I assume it will some time soon), I can not fully trust it, since I have no clear picture of what all the underlying components are doing.

Why not have some standard solution with standard authentication and live with it? There's a password, an OTP code, an API token, OpenID Connect, and all of these are effectively security holes waiting to be exploited.

All this in a world where PGP and SSH have been widespread and well-established technologies for decades. Just ssh with your key. Get a fake shell where it prompts for the OTP. And you're done. Plain and simple, with no room for error. Much like one can run ssh -T git@github.com to test their key: beautifully clean and reliable.

Sigh. Gotta get used to the fact that my standards are higher than most. On the other hand, that's a hint as to what area I should be working in: improving such processes for other engineers!

245 viewsedited 11:59

https://x.com/tsoding/status/1759199337685389760

I can relate.

216 views21:16

My relationship with cmake was always a love-hate one, but it started to bounce back towards love recently.

After all, given its history and its legacy, it's still a surprisingly standardized and well-supported tool. If building on top of make in a clever way were my path of choice, I could not have done better.

Besides, CMake-based projects are supported by multiple IDEs today, including VS Code, which I personally believe is fantastic.

One thing that I found cmake lacking (and I hope the community of people smarter than me has thought about it before) is adding my library's include directories under some fixed "dir name", regardless of the name of the dir my code is cloned into.

Most projects I've seen have in their root directory the CMakeLists.txt file, and also src/ and possibly include/. Say I maintain my_project. Ideally, I want its users to #include stuff as "my_project/my_header.h", or "my_project/include/my_header.h". However, this would require my_project to exist in its own directory called, well, my_project.

This is too much to ask in 2024, I'd say. Symlinks can solve this problem, of course, but only up to a certain degree.

What do people do?

Imagine I have a large project and it does have a collision in header names of dependencies; some "common.h" in my_dep_1 and my_dep_2. What's the canonical way to use them so that:

• The setup uses the "vanilla" versions of "my_dep_1" and "my_dep_2",
• The writers and readers of the code can clearly disambiguate whether we're looking at "common.h" from one or the other dep,
• The overall setup is not fragile, i.e. it works under various IDEs on various platforms, and
• The solution is generic, so that no changes need to be made on the side of "my_dep_1" and "my_dep_2" to use these deps in a clever way.

I can write my own dev setup script, even inside CMakeLists.txt, which will make sure that whatever directory is passed as -I to my code contains "my_dep_1/" and "my_dep_2/", if only as symlinks. This is the cleanest way I can think of, but it's also overkill for my taste.
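For what it's worth, the symlink-based setup script I have in mind is roughly the following. This is a sketch, not a recommendation: `stage_include_root` is a name I made up, the directory names are placeholders, and it assumes a platform where creating symlinks is allowed.

```python
import tempfile
from pathlib import Path
from typing import Dict


def stage_include_root(staging_dir: Path, deps: Dict[str, Path]) -> Path:
    """Create staging_dir/<prefix> symlinks so code can #include "<prefix>/common.h"."""
    staging_dir.mkdir(parents=True, exist_ok=True)
    for prefix, checkout in deps.items():
        link = staging_dir / prefix
        target = Path(checkout).resolve()
        if link.is_symlink():
            if link.resolve() == target:
                continue  # already points at the right checkout
            link.unlink()  # stale link from an earlier setup
        elif link.exists():
            raise FileExistsError(f"{link} exists and is not a symlink")
        link.symlink_to(target, target_is_directory=True)
    return staging_dir


def demo() -> str:
    """Two deps cloned under arbitrary names, both shipping a "common.h"."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for clone_dir, text in (("clone-a", "// dep 1\n"), ("clone-b", "// dep 2\n")):
            (root / clone_dir).mkdir()
            (root / clone_dir / "common.h").write_text(text)
        include_root = stage_include_root(
            root / "include_root",
            {"my_dep_1": root / "clone-a", "my_dep_2": root / "clone-b"},
        )
        # The staged directory is what would be passed as -I to the compiler.
        return (include_root / "my_dep_2" / "common.h").read_text()


print(demo())
```

With the staged directory on the include path, "my_dep_1/common.h" and "my_dep_2/common.h" are unambiguous, and neither dependency has to change.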

Would love to hear from people more experienced than I am. Thank you!

268 views08:14

How do contributor license agreements work these days?

One thing I would particularly like to have is that it’s the existing community that manages the lineage of the project.

So that if someone (even the original author!) leaves, this person can not just say “I’m deleting all my code from this project”.

I mean they can. They can send such a pull request. They could even merge it in, perhaps even with some approvals they may well gather with their farewell message.

But then the community should just be able to put that code back in and continue to function as usual. We keep the name of the original contributor (or an altered name if they wish, no problem). But if they broke the functionality of the project by removing their code, I’d say keeping the project functional takes precedence over their desire to have their code removed.

Thus, I believe, it should be part of the contributor license agreement that one consents to their code being used in the project even if they decide to withdraw, are excluded, or are even removed from the project against their will. Effectively, they attest not only that they have the right to contribute, but also that they grant the community the right to keep using their code even if at some point later on they want to withdraw parts or all of it.

Is this a reasonable position to hold?

Can this position be incorporated into the CLA in a language that does not allow for ambiguity?

Are there real-life examples of such CLAs, ideally with proof of them holding up against someone’s desire to have their code removed in a breaking way?

269 views14:56

Heh, while I am not a huge fan, I used to praise Python for "sensible defaults". Consider a trivial if-condition:

if something:
    foo()

It treats a boolean that is False, a zero, an empty string, an empty list, an empty dict, etc. as "nothing there".

Lovely, isn't it?

Well, today I was writing a script that should behave differently depending on whether a config.json file is present in a directory or not. The code uses a Pythonic-clear construct of try-ing with open(...) as file and json.loads().

For a negative unit test (yes folks, death tests FTW!), I have, of course, used a definitely non-empty config.json, the contents of which were {}\n.

You can guess the rest.

Took me a couple dozen seconds to formulate the right hypothesis, and a few minutes to make sure it's isfile() that is responsible for the condition, not "just" if config_json:.
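For those who did not guess the rest, here is a reconstruction of the trap. It is a sketch from memory rather than the actual script; the function names are made up for this post.

```python
import json
import os
import tempfile


def read_config(directory):
    """Return the parsed config.json, or None if the file is not there."""
    try:
        with open(os.path.join(directory, "config.json")) as file:
            return json.loads(file.read())
    except FileNotFoundError:
        return None


def has_config_truthy(directory):
    # The trap: {} parses fine, but an empty dict is falsy.
    return bool(read_config(directory))


def has_config_isfile(directory):
    # Checks what was actually meant: is the file present?
    return os.path.isfile(os.path.join(directory, "config.json"))


def demo():
    with tempfile.TemporaryDirectory() as directory:
        with open(os.path.join(directory, "config.json"), "w") as file:
            file.write("{}\n")
        return has_config_truthy(directory), has_config_isfile(directory)


print(demo())  # (False, True)
```

Comparing the result of `read_config` against None with `is not None` would work just as well as `isfile()`; the point is to not let truthiness stand in for "the file exists".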

Obligatory pic attached.

243 views19:16

My new hobby is asking ChatGPT 4o to generate maps as pictures.

It requires some prompt engineering — it can even generate real maps with pins using Python modules now! — but I generally succeed by just asking nicely.

"Just generate the picture with that map please. You're good at generating images, and the map is an image, isn't it?"

The amount of hallucinations there is insane. I won't be comfortable sharing the Congolese wars picture it made for me, since that one was brutal — a good half of Africa was fighting.

Here are "Vietnam tourist destinations" though. Enjoy responsibly.

#nottraveladvice

258 views20:59

I don't really trust Telegram as the platform, and people often wonder why.

To me it's not about open source or not open source. Business models come first because it's the business model that keeps the product afloat, to begin with. And for some products going open source is just too bad to consider seriously. Signal is not, after all, open source.

And it's not about the lack of transparency. Telegram is quite transparent about what they do and how they do it.

What I do not understand, since it does not compute in my head, is how the platform can BOTH declare it wants to be resilient when it comes to privacy AND at the same time be so reluctant to enable external add-ons for encryption.

Say, you and I want to communicate 1:1, and we have exchanged our public keys before.

It doesn't take a genius for both of us to run a trivial, tiny, open-source daemon on a dedicated device to encrypt/decrypt messages sent specifically to us on the fly.

So that when I'm sending you a message it's my laptop that encrypts it with your public key, so only you with your private key can decrypt it. And vice versa.

If it's a multi-person chat, only the people whose public keys I trust to have obtained in an uncompromised way will be able to see messages sent by me. Because man-in-the-middle is real.

My point is, such a feature, with encryption/decryption of particular chats taken out to a dedicated device, would a) absolutely not compromise Telegram's position on the market, and b) unambiguously send the message that Telegram does indeed care about user privacy.

TL;DR: There should be a way to buy a Raspberry Pi, join the same home WiFi network, have some one-time QR code for this Pi to act as my encryption device for certain Telegram chats, and run a tiny, self-contained piece of open source software on this Pi specifically to stay locked into the Telegram ecosystem while having the confidence that my private chats are private.

Heck, this "tiny piece of open source software" is something an advanced Linux user can put together in half an hour in bash by just invoking openssl commands. Asymmetric encryption is literally the most trivial thing that exists in the world.
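To show how little machinery the "encrypt with your public key, decrypt with your private key" flow needs, here is textbook RSA on toy numbers. This is strictly an illustration: the primes are tiny, there is no padding, and each byte is encrypted separately, so it is NOT secure. A real daemon would call into openssl or another vetted library instead.

```python
# Toy textbook RSA. Illustration only: tiny primes, no padding, NOT secure.
P, Q = 61, 53
N = P * Q                  # public modulus
PHI = (P - 1) * (Q - 1)
E = 17                     # public exponent, coprime with PHI
D = pow(E, -1, PHI)        # private exponent (modular inverse, Python 3.8+)

PUBLIC_KEY = (E, N)
PRIVATE_KEY = (D, N)


def encrypt(message, public_key):
    """Encrypt each UTF-8 byte with the recipient's public key."""
    e, n = public_key
    return [pow(byte, e, n) for byte in message.encode("utf-8")]


def decrypt(ciphertext, private_key):
    """Only the holder of the private key can turn the numbers back into text."""
    d, n = private_key
    return bytes(pow(c, d, n) for c in ciphertext).decode("utf-8")


ciphertext = encrypt("hello", PUBLIC_KEY)
print(ciphertext)
print(decrypt(ciphertext, PRIVATE_KEY))
```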

If Telegram truly wants to compete with Signal on the privacy front, the solution I am outlining, both from the product side and from the tech side, is literally single-digit days to build and ship for the engineers. So, single-digit weeks to build and ship for Telegram as a company. But, somehow, everything but this solution is being developed and shipped.

~ ~ ~

I'll go further and postulate that the approach I am outlining should be the future.

Some 15 years ago I was wondering why there isn't a "login solution" where I scan some QR code with my phone to open a one-time session on some untrusted or semi-trusted device. Fast forward to today, and that's exactly how WhatsApp and Telegram are used in the browser.

Same with OTP codes. There are open source apps fully compatible with Google Authenticator. The protocol is open. Exporting a certain code from one device to another is about scanning a QR code. Safe enough for most people's standards.
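The open protocol in question is TOTP (RFC 6238), built on top of HOTP (RFC 4226), and it fits in a dozen lines of standard-library Python. A minimal sketch, assuming the default parameters most authenticator apps use (HMAC-SHA1, 30-second steps, 6 digits); the secret below is the published RFC test key, not anything real.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226: HMAC-SHA1 over the counter, then dynamic truncation."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, now=None, step: int = 30, digits: int = 6) -> str:
    """RFC 6238: HOTP where the counter is the number of time steps since the epoch."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now) // step, digits)


# RFC test key; at Unix time 59 the 6-digit code is 287082.
print(totp(b"12345678901234567890", now=59))
```

The shared secret is exactly what the QR code carries when a code is exported from one device to another.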

Nothing prevents us from building the same solution for chats and messaging. Scan the QR code from my device to get my public key. Type my OTP on your device (I'll share one code) to prevent man-in-the-middle attacks. You now know my public key, and your device can encrypt messages sent to my device. Your device may become compromised, but this we have no control over.

But it's more than enough for me to know that your device can be a Raspberry Pi and there exist plenty of open source implementations for the open standard for the above protocol.

197 views19:44

Corporations and other enterprise companies may even purchase and issue hardware tokens to their employees. This hardware token can be USB, or WiFi, or Bluetooth, or even NFC. People can then use their own devices, since this token has limited throughput by design. And your "corporate Slack" is super secure, since your users can not physically extract the decryption key from this hardware token. Traveling through unsafe countries? Leave this token at home; we'll issue you a duplicate one on the premises of this company in your ultimate country of destination.

Such a trivial problem. Such a trivial solution. And Telegram is so well-positioned to build it.

In fact, Telegram may even offer this as a premium / enterprise option. Everything, including supporting those hardware tokens. Give people the option to buy those tokens from other vendors, since the protocol is open. And charge some $10 per user per month, far less than what Slack wants. And have these users remain loyal Telegram users, while the company that is paying Telegram has 100% confidence that the messages sent through Telegram are safe and secure.

Everyone I speak with about this in depth agrees with the above train of thought within single-digit minutes. OpenSSL is, well, open. But there is no product that connects the above dots. Even though the "custom device" cryptographic boundary is an extremely trivial idea that also plays well with the concept of open source encryption protocol plus 100% vendor lock-in with Telegram as the app of choice.

The only reason I used to believe can explain why Telegram is not doing the above is App Store regulations. The big players (Apple & Google & Samsung) may well decide to not allow such an app, for whatever BS reason such as "we can not ensure our users do not view adult content there". But since I've learned Telegram has special versions of its own app that are not distributed via the stores, this argument no longer holds.

So: Either my first-principles prediction will begin happening soon, hopefully with Telegram, or I'd be willing to buy into various conspiracy theories about the rules and the means to support "privacy" on the Internet.

243 views19:44

What I'm about to say might sound completely absurd, but I believe it has a chance of touching upon some deeper wisdom. I can't seem to get it out of my head, so here it is.

In English, when someone hears a word like "bakery", I think it tends to carry a personal, or at least to a certain degree personified connotation.

For example, the phrase "we went to the bakery the other day" is akin to saying "we went to the baker's place", implying that we visited the spot where the baker practices their craft. While it's likely we bought something at this bakery, it also suggests that we exchanged our money for the goods and services provided by the baker or by someone representing the baker.

Even though it's subtle, there's a sense in which the bakery is seen as something the baker owns or is closely tied to. The baker might run the business with their family, be involved in a partnership, or still be paying off a loan for the bakery. Perhaps the baker is working there to gain experience, save money, and build a customer base before opening their own bakery. The point is, "the bakery" is personified by "the baker". When you buy from the bakery, you're supporting this baker, who, in turn, contributes to your family, the neighborhood, and the community.

Looking at it another way, if two hundred years ago you were to learn, say, that your neighbor's teenage child was going to work at the bakery next door, you'd likely assume they'd be working directly with the baker. If this teenager is bright and/or comes from a family with a history of being business-minded, you might even assume they're considering a career in baking. In five to ten years, they might want to become a partner in the bakery, and in ten to twenty years, they could potentially open their own bakery, buy out the one they are working in now, or even be chosen by the current owner to continue running his or her family baking business.

Based on my understanding of the language and culture, I believe this is what goes through the mind of an English speaker when they hear the word "bakery". I would even go as far as to say that an American conservative, for example, might hear all the right things about supporting small businesses when they hear of "the bakery." To them, "the bakery" might be seen as the opposite of a large corporation, such as Walmart.

I also think that the German understanding of the term follows similar lines. The bakery is seen as the baker's craft, and one's craft is viewed as the business that they manage for themselves and their family.

Now, hear me out. In Russian, and I assume in other Eastern European languages, the personal connection between "the bakery" and "the baker" just doesn't exist. There's none. Zero. "The bakery" is just a term for a place where baked goods can be bought. That's it.

For an English speaker, there's a sharp difference between buying strawberries at a grocery store versus a farmer's market. There's a personal touch to a farmer's market; the strawberries you buy there have a certain quality. There's even a sense of individuality—for example, you might know that Aunt Jane grows, picks, and stores her strawberries in a particular way. In contrast, when you buy strawberries at a grocery store, the personal touch is lost, and you're lucky if there's a recognizable brand name like Driscoll's.

For a Russian speaker, both linguistically and culturally, there's no person and no individual traits associated with "the bakery". The owner feels infinitely distant, and a Russian speaker might refer to them as "they". If there are two bakeries or two grocery stores to choose from, a Russian speaker might think along the lines of "they know how to select their suppliers better at A than at B". There wouldn't be any personal touch involved. If a teenager were to work at a bakery, learning from "the baker" would likely be the last thing on their mind. In fact, the concept of "the baker" as a figure has long disappeared.

208 views17:11

I'm not sure what to make of this argument, and I don't want to frame it in terms of "state-provided" or "state-supplied" bakeries. But, in a way, this is exactly what these places have become in the "collective consciousness", at least linguistically.

Such a situation is doubly unfortunate. First, it removes the spirit from the craft, depriving people of something to dedicate themselves to. Second, it eliminates any notion of voting with one's money. Instead, people's decision-making becomes solely about getting a good deal and enjoying the process and the results, with no thought about supporting the right kind of businesses or thinking about the community's future, which is shaped by the collective actions of those who vote with their feet and their wallets.

Please help me refine this thought. First, am I entirely off-base, or does this resonate with you? Second, how relevant do you think this idea is in today's world? Finally, does this perspective relate to the modern approach to high-tech software businesses?

246 views17:11

20 years ago.

236 views17:28

In a sad coincidence, of my last three posts, one is about The Pirate Bay and one is about Telegram and its encryption.

And now Pavel Durov, the founder of Telegram, has been arrested in France. Presumably, because France believes he's not cooperating with the authorities with respect to "allowing" "bad things" to happen on Telegram.

So, had end-to-end encryption, esp. with user-owned tokens, been enabled in Telegram by design long ago, this very accusation could not possibly have emerged in the first place. Ah, and especially if the protocols are open source, with canonical implementations, reviewed, or perhaps even implemented, by third-party experts.

Sigh. I hope this situation resolves itself quickly enough in the most positive way possible.

207 views21:22

Encryption is not a human right. Encryption is something far bigger.

I can explain this with an absurd analogy. Imagine a country making it illegal for a human being to be strong enough to lift more than a hundred pounds.

Depending on how much sci-fi you read and how vivid your imagination is, Orwellian scenarios of various depths and shapes are likely emerging in your mind right now. You might envision licensed lifting equipment with weight caps, raids on illegal gyms, and various traps to expose people ("suspects") who are physically capable of lifting. There might even be mandatory medical testing to forcibly drug those whose muscles are strong enough.

Undoubtedly, significant pressure can be applied top-down, and the percentage of people capable of lifting over a hundred pounds can be substantially reduced.

However, it's absurd to believe this percentage can be reduced to zero. Many people would still want to be strong, if only to maintain their health or to attract better partners. Furthermore, I'd argue that if a country were to pursue this route, its days would be numbered. Within two generations, three max, it would turn into a failed state. Neighboring countries would likely welcome free-thinking individuals who realize their desire to control their bodies trumps their desire to conform to some "civic conduct" they don't agree with.

The term "human right" is a social construct. We, the humans, decide what human rights are and how we protect them. Human rights are not static, and it's up to us to allow the term to evolve as our civilization grows and matures.

There will always be disagreements about what exactly constitutes a human right, what the acceptable means to defend these rights are, and what liberties we should collectively consent to give up, and to what extent, to protect our human rights.

But much like the absurd analogy of capping the weight one can lift, it's foolish to try to cap one's right to private communication.

Totalitarian governments can make their citizens believe certain historical events never happened. We have evidence that this can be done quite effectively — hundreds of millions of people can be made to act in a way that provides them with sufficient plausible deniability about not knowing that certain events occurred.

But encryption is different.

The Diffie-Hellman key exchange handshake was discovered, not invented. Number theory works regardless of whether one believes in it. Strong algorithms exist that can literally be evaluated on a napkin, which can then be burned. People from all age groups and demographics are increasingly aware of asymmetric encryption these days — some from SSH-ing into servers, and an ever-growing number from interacting with crypto wallets.
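The napkin claim is easy to check. Below is the Diffie-Hellman exchange on toy numbers: two parties each keep a private exponent, exchange only public values, and still arrive at the same secret. The prime 23 and generator 5 are the usual classroom example and are far too small for real use; the helper names are made up for this sketch.

```python
import secrets


def dh_keypair(p, g):
    """Pick a private exponent in [2, p - 2] and derive the public value g^x mod p."""
    private = secrets.randbelow(p - 3) + 2
    return private, pow(g, private, p)


def dh_shared(their_public, my_private, p):
    """Both sides compute g^(a*b) mod p without ever sending a or b."""
    return pow(their_public, my_private, p)


P, G = 23, 5  # toy parameters; real deployments use primes of 2048 bits or more

alice_private, alice_public = dh_keypair(P, G)
bob_private, bob_public = dh_keypair(P, G)

# Only the public values travel over the wire.
alice_secret = dh_shared(bob_public, alice_private, P)
bob_secret = dh_shared(alice_public, bob_private, P)
print(alice_secret == bob_secret)  # True
```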

This is why encryption is not a human right. It is far more significant than a human right. It's akin to gravity: a concept that exists in the universe, which no sane ruler's normative act can turn off at will.

214 views13:45

I repeat: do not ask ChatGPT for illustrations where the concept of correctness exists.

This is just ... a student should be expelled from a university if they present this picture.

What's confusing is that the explanation ChatGPT gave first, to which I answered "yes" when it offered to generate the picture, was quite decent.

239 views20:40

My experiments-turned-tutorial post on C++20 coroutines: https://dimakorolev.substack.com/p/c20-coroutines.

245 views13:42

Could not resist running this experiment, and then could not resist turning it into a post: "futuristic" posters generated by ChatGPT 4o.

I'd say we are indeed losing the capacity to dream big.

Prove me wrong. Please.

Dima Korolev

The Future as Envisioned by the AI

"We interrupt our scheduled programming for this important message."

174 views21:29
