Not boring, and a bit of a condescending prick
Semi-digested observations about our world right after they are phrased well enough in my head to be shared broader.
How do contributor license agreements work these days?

One thing I would particularly like to have is that it’s the existing community that manages the lineage of the project.

So that if someone (even the original author!) leaves, this person can not just say “I’m deleting all my code from this project”.

I mean they can. They can send such a pull request. They could even merge it in, perhaps even with some approvals they may well gather with their farewell message.

But then the community should just be able to put that code back in and continue to function as usual. We keep the name of the original contributor (or an altered name if they wish, no problem). But if they broke the functionality of the project by removing their code, I’d say keeping the project functional takes precedence over their desire to have their code removed.

Thus, I believe, it should be part of the contribution license agreement that one consents to their code being used in the project even if they decide to withdraw themselves, are excluded, or are even removed from the project against their will. Effectively, they attest that not only do they have the right to contribute, but they also grant the community the right to use their code even if at some point later on they may want to withdraw parts or all of their code.

Is this a reasonable position to hold?

Can this position be incorporated into the CLA in a language that does not allow for ambiguity?

Are there real-life examples of such CLAs, ideally with proof of them holding up against someone’s desire to have their code removed in a breaking way?
Heh, while I am not a huge fan, I used to praise Python for "sensible defaults". Consider a trivial if-condition:

if something:
    foo()


It checks something for a boolean that is False, for a zero, for an empty string, empty array, etc.

Lovely, isn't it?

Well, today I was writing a script that should behave differently depending on whether a config.json file is present in a directory or not. The code uses a Pythonic-clear construct of try-ing with open(...) as file and json.loads().

For a negative unit test (yes folks, death tests FTW!), I have, of course, used a definitively non-empty config.json. The contents of which were {}\n.

You can guess the rest.

Took me a couple dozen seconds to formulate the right hypothesis, and a few minutes to make sure it's isfile() that is responsible for the condition, not "just" if config_json:.
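
A minimal sketch of the pitfall, assuming the script looked roughly like this. The helper name load_config and the temporary directory are made up for illustration; the original code is not shown here. The point is only that a file containing {} parses to an empty dict, which is falsy, so "if config_json:" cannot tell it apart from a missing file.

```python
import json
import os
import tempfile


def load_config(path):
    """Hypothetical helper: return the parsed config, or None if there is no file."""
    if not os.path.isfile(path):
        return None
    with open(path) as file:
        return json.loads(file.read())


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "config.json")
    with open(path, "w") as file:
        file.write("{}\n")  # a non-empty file that parses to an empty dict

    config_json = load_config(path)

    # The truthiness check conflates "no file" with "a file holding {}".
    present_by_truthiness = bool(config_json)
    # Comparing against None, or asking isfile() directly, does not.
    present_by_identity = config_json is not None
    present_by_isfile = os.path.isfile(path)

print(present_by_truthiness, present_by_identity, present_by_isfile)
```

So the "sensible default" is sensible only as long as an empty container and an absent one mean the same thing to the caller.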

Obligatory pic attached.
My new hobby is asking ChatGPT 4o to generate maps as pictures.

It requires some prompt engineering — it can even generate real maps with pins using Python modules now! — but I generally succeed by just asking nicely.

"Just generate the picture with that map please. You're good at generating images, and the map is an image, isn't it?"

The number of hallucinations there is insane. I won't be comfortable sharing the Congolese wars picture it made for me, since that one was brutal: a good half of Africa was fighting.

Here are "Vietnam tourist destinations" though. Enjoy responsibly.

#nottraveladvice
I don't really trust Telegram as the platform, and people often wonder why.

To me it's not about open source or not open source. Business models come first because it's the business model that keeps the product afloat, to begin with. And for some products going open source is just too bad to consider seriously. Signal is not, after all, open source.

And it's not about the lack of transparency. Telegram is quite transparent about what they do and how they do it.

What I do not understand (since it does not compute in my head) is how the platform can BOTH declare it wants to be resistant with respect to privacy AND at the same time be so reluctant to enable external add-ons for encryption.

Say, you and I want to communicate 1:1, and we have exchanged our public keys before.

It doesn't take a genius for both of us to run a trivial, tiny, open-source daemon on a dedicated device to encrypt/decrypt messages sent specifically to us on the fly.

So that when I'm sending you a message it's my laptop that encrypts it with your public key, so only you with your private key can decrypt it. And vice versa.
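
To make the "encrypt with your public key, decrypt with your private key" step concrete, here is a toy sketch of textbook RSA in plain Python. Everything in it is an assumption made for illustration: the tiny primes, the byte-by-byte encryption, and the function names. It is not secure and not how a real daemon should do it; a real one would lean on a vetted library such as OpenSSL and proper padding.

```python
# Toy textbook RSA with tiny primes: for illustration only, NOT secure.
P, Q = 61, 53
N = P * Q                  # public modulus
PHI = (P - 1) * (Q - 1)
E = 17                     # public exponent, coprime with PHI
D = pow(E, -1, PHI)        # private exponent (modular inverse, Python 3.8+)


def encrypt(message: bytes, public_key):
    """Encrypt each byte separately with the recipient's public key."""
    e, n = public_key
    return [pow(byte, e, n) for byte in message]


def decrypt(ciphertext, private_key):
    """Recover the bytes with the matching private key."""
    d, n = private_key
    return bytes(pow(block, d, n) for block in ciphertext)


recipient_public = (E, N)    # safe to hand out
recipient_private = (D, N)   # never leaves the recipient's device

ciphertext = encrypt(b"hello", recipient_public)
plaintext = decrypt(ciphertext, recipient_private)
print(plaintext)
```

The sender only ever touches recipient_public, which is the whole point of the scheme.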

If it's a multi-person chat, only the people whose public keys I trust to be uncompromised will be able to see messages sent by me. Because man-in-the-middle is real.

My point is, such a feature, with encryption/decryption of particular chats taken out to a dedicated device, would a) absolutely not compromise Telegram's position on the market, and b) unambiguously send the message that Telegram does indeed care about user privacy.

TL;DR: There should be a way to buy a Raspberry Pi, join the same home WiFi network, have some one-time QR code for this Pi to act as my encryption device for certain Telegram chats, and run a tiny, self-contained piece of open source software on this Pi, specifically so that I can stay locked into the Telegram ecosystem while having the confidence that my private chats are private.

Heck, this "tiny piece of open source software" is something an advanced Linux user can put together in half an hour in bash by just invoking openssl commands. Asymmetric encryption is literally the most trivial thing that exists in the world.

If Telegram truly wants to compete with Signal on the privacy front, the solution I am outlining, both from the product side and from the tech side, is literally single-digit days to build and ship for the engineers. So, single-digit weeks to build and ship for Telegram as a company. But, somehow, everything but this solution is being developed and shipped.

~ ~ ~

I'll go further and postulate that the approach I am outlining should be the future.

Some 15 years ago I was wondering why there wasn't a "login solution" where I scan some QR code with my phone to open a one-time session on some untrusted or semi-trusted device. Fast forward to today, and that's exactly how WhatsApp and Telegram are used in the browser.

Same with OTP codes. There are open source apps fully compatible with Google Authenticator. The protocol is open. Exporting a certain code from one device to another is about scanning a QR code. Safe enough for most people's standards.
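
For reference, the open protocol behind Google Authenticator-compatible apps is TOTP (RFC 6238), built on HOTP (RFC 4226). A minimal sketch using only the standard library follows; the secret is the one from the RFC test vectors, whereas a real app would receive its secret from the QR code.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226): HMAC-SHA1 over the counter, then dynamic truncation."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """TOTP (RFC 6238): HOTP with the counter derived from the clock."""
    return hotp(secret, unix_time // step, digits)


# Shared secret taken from the RFC 6238 test vectors.
secret = b"12345678901234567890"
code = totp(secret, unix_time=59, digits=8)
print(code)
```

Both devices compute the same code from the same secret and roughly the same clock, which is why exporting the secret via a QR code is enough to clone the generator.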

Nothing prevents us from building the same solution for chats and messaging. Scan the QR code from my device to get my public key. Type my OTP on your device (I'll share one code) to prevent man-in-the-middle attacks. You now know my public key, and your device can encrypt messages sent to my device. Your device may become compromised, but this we have no control over.

But it's more than enough for me to know that your device can be a Raspberry Pi and there exist plenty of open source implementations for the open standard for the above protocol.
Corporations and other enterprise companies may even purchase and issue hardware tokens to their employees. This hardware token can be USB, or WiFi, or Bluetooth, or even NFC. So that the people can use their own devices, since this token has limited throughput by design. And your "corporate Slack" is super secure, since your users can not physically extract the decryption key from this hardware token. Traveling through unsafe countries? Leave this token at home; we'll issue you a duplicate one on the premises of this company in your ultimate country of destination.

Such a trivial problem. Such a trivial solution. And Telegram is so well-positioned to build it.

In fact, Telegram may even offer this as a premium / enterprise option. Everything, including supporting those hardware tokens. Give people the option to buy those tokens from other vendors, since the protocol is open. And charge some $10 per user per month, far less than what Slack wants. And have these users remain loyal Telegram users, while the company that is paying Telegram has 100% confidence that the messages sent through Telegram are safe and secure.

Everyone I speak with about this in depth agrees with the above train of thought within single-digit minutes. OpenSSL is, well, open. But there is no product that connects the above dots. Even though the "custom device" cryptographic boundary is an extremely trivial idea that also plays well with the concept of open source encryption protocol plus 100% vendor lock-in with Telegram as the app of choice.

The only explanation I used to find plausible for why Telegram is not doing the above is App Store regulations. The big players (Apple & Google & Samsung) may well decide to not allow such an app, for whatever BS reason such as "we can not ensure our users do not view adult content there". But since I've learned Telegram has special versions of its own app that are not distributed via the stores, this argument no longer holds.

So: either my first-principle prediction will begin happening soon, hopefully with Telegram, or I'd be willing to buy into various conspiracy theories about the rules and the means to support "privacy" on the Internet.
What I'm about to say might sound completely absurd, but I believe it has a chance of touching upon some deeper wisdom. I can't seem to get it out of my head, so here it is.

In English, when someone hears a word like "bakery", I think it tends to carry a personal, or at least to a certain degree personified connotation.

For example, the phrase "we went to the bakery the other day" is akin to saying "we went to the baker's place", implying that we visited the spot where the baker practices their craft. While it's likely we bought something at this bakery, it also suggests that we exchanged our money for the goods and services provided by the baker or by someone representing the baker.

Even though it's subtle, there's a sense in which the bakery is seen as something the baker owns or is closely tied to. The baker might run the business with their family, be involved in a partnership, or still be paying off a loan for the bakery. Perhaps the baker is working there to gain experience, save money, and build a customer base before opening their own bakery. The point is, "the bakery" is personified by "the baker". When you buy from the bakery, you're supporting this baker, who, in turn, contributes to your family, the neighborhood, and the community.

Looking at it another way, if two hundred years ago you were to learn, say, that your neighbor's teenage child was going to work at the bakery next door, you'd likely assume they'd be working directly with the baker. If this teenager is bright and/or comes from a family with a history of being business-minded, you might even assume they're considering a career in baking. In five to ten years, they might want to become a partner in the bakery, and in ten to twenty years, they could potentially open their own bakery, buy out the one they are working in now, or even be chosen by the current owner to continue running his or her family baking business.

Based on my understanding of the language and culture, I believe this is what goes through the mind of an English speaker when they hear the word "bakery". I would even go as far as to say that an American conservative, for example, might hear all the right things about supporting small businesses when they hear of "the bakery." To them, "the bakery" might be seen as the opposite of a large corporation, such as Walmart.

I also think that the German understanding of the term follows similar lines. The bakery is seen as the baker's craft, and one's craft is viewed as the business that they manage for themselves and their family.

Now, hear me out. In Russian, and I assume in other Eastern European languages, the personal connection between "the bakery" and "the baker" just doesn't exist. There's none. Zero. "The bakery" is just a term for a place where baked goods can be bought. That's it.

For an English speaker, there's a sharp difference between buying strawberries at a grocery store versus a farmer's market. There's a personal touch to a farmer's market; the strawberries you buy there have a certain quality. There's even a sense of individuality—for example, you might know that Aunt Jane grows, picks, and stores her strawberries in a particular way. In contrast, when you buy strawberries at a grocery store, the personal touch is lost, and you're lucky if there's a recognizable brand name like Driscoll's.

For a Russian speaker, both linguistically and culturally, there's no person and no individual traits associated with "the bakery". The owner feels infinitely distant, and a Russian speaker might refer to them as "they". If there are two bakeries or two grocery stores to choose from, a Russian speaker might think along the lines of "they know how to select their suppliers better at A than at B". There wouldn't be any personal touch involved. If a teenager were to work at a bakery, learning from "the baker" would likely be the last thing on their mind. In fact, the concept of "the baker" as a figure has long disappeared.
I'm not sure what to make of this argument, and I don't want to frame it in terms of "state-provided" or "state-supplied" bakeries. But, in a way, this is exactly what these places have become in the "collective consciousness", at least linguistically.

Such a situation is doubly unfortunate. First, it removes the spirit from the craft, depriving people of something to dedicate themselves to. Second, it eliminates any notion of voting with one's money. Instead, people's decision-making becomes solely about getting a good deal and enjoying the process and the results, with no thought about supporting the right kind of businesses or thinking about the community's future, which is shaped by the collective actions of those who vote with their feet and their wallets.

Please help me refine this thought. First, am I entirely off-base, or does this resonate with you? Second, how relevant do you think this idea is in today's world? Finally, does this perspective relate to the modern approach to high-tech software businesses?
In a sad coincidence, of my last three posts, one is about The Pirate Bay and one is about Telegram and its encryption.

And now Pavel Durov, the founder of Telegram, has been arrested in France. Presumably, because France believes he's not cooperating with the authorities with respect to "allowing" "bad things" to happen on Telegram.

So, had end-to-end encryption, esp. with user-owned tokens, been enabled in Telegram by design since a long time ago, this very accusation could not have possibly emerged in the first place. Ah, and especially if the protocols are open source, with canonical implementations, reviewed, or perhaps even implemented, by third-party experts.

Sigh. I hope this situation resolves itself quickly enough in the most positive way possible.
Encryption is not a human right. Encryption is something far bigger.

I can explain this with an absurd analogy. Imagine a country making it illegal for a human being to be strong enough to lift more than a hundred pounds.

Depending on how much sci-fi you read and how vivid your imagination is, Orwellian scenarios of various depths and shapes are likely emerging in your mind right now. You might envision licensed lifting equipment with weight caps, raids on illegal gyms, and various traps to expose people ("suspects") who are physically capable of lifting. There might even be mandatory medical testing to forcibly drug those whose muscles are strong enough.

Undoubtedly, significant pressure can be applied top-down, and the percentage of people capable of lifting over a hundred pounds can be substantially reduced.

However, it's absurd to believe this percentage can be reduced to zero. Many people would still want to be strong, if only to maintain their health or to attract better partners. Furthermore, I'd argue that if a country were to pursue this route, its days would be numbered. Within two generations, three max, it would turn into a failed state. Neighboring countries would likely welcome free-thinking individuals who realize their desire to control their bodies trumps their desire to conform to some "civic conduct" they don't agree with.

The term "human right" is a social construct. We, the humans, decide what human rights are and how we protect them. Human rights are not static, and it's up to us to allow the term to evolve as our civilization grows and matures.

There will always be disagreements about what exactly constitutes a human right, what the acceptable means to defend these rights are, and what liberties we should collectively consent to give up, and to what extent, to protect our human rights.

But much like the absurd analogy of capping the weight one can lift, it's foolish to try to cap one's right to private communication.

Totalitarian governments can make their citizens believe certain historical events never happened. We have evidence that this can be done quite effectively — hundreds of millions of people can be made to act in a way that provides them with sufficient plausible deniability about not knowing that certain events occurred.

But encryption is different.

The Diffie-Hellman key exchange handshake was discovered, not invented. Number theory works regardless of whether one believes in it. Strong algorithms exist that can literally be evaluated on a napkin, which can then be burned. People from all age groups and demographics are increasingly aware of asymmetric encryption these days — some from SSH-ing into servers, and an ever-growing number from interacting with crypto wallets.
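
To show how napkin-sized this really is, here is a toy Diffie-Hellman exchange with deliberately tiny numbers. The prime, the generator, and both private values are picked purely for illustration; real deployments use very large parameters or elliptic curves.

```python
# Toy Diffie-Hellman with napkin-sized numbers: illustrative, NOT secure.
P = 23   # public prime modulus
G = 5    # public generator

alice_private = 4   # never leaves Alice
bob_private = 3     # never leaves Bob

# Each side publishes G^private mod P.
alice_public = pow(G, alice_private, P)
bob_public = pow(G, bob_private, P)

# Each side raises the other's public value to its own private exponent.
alice_shared = pow(bob_public, alice_private, P)
bob_shared = pow(alice_public, bob_private, P)

print(alice_public, bob_public, alice_shared, bob_shared)
```

Both sides arrive at the same shared value while only the two public values ever travel over the wire, and no decree changes how the modular arithmetic works out.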

This is why encryption is not a human right. It is far more significant than a human right. It's akin to gravity: a concept that exists in the universe, which no sane ruler's normative act can turn off at will.
I repeat: do not ask ChatGPT for illustrations where the concept of correctness exists.

This is just ... a student should be expelled from a university if they present this picture.

What's confusing is that the explanation ChatGPT gave first, to which I answered "yes" as it suggested to generate the picture, was quite decent.
My experiments-turned-tutorial post on C++20 coroutines: https://dimakorolev.substack.com/p/c20-coroutines.
Could not resist running this experiment, and then could not resist turning it into a post: "futuristic" posters generated by ChatGPT 4o.

I'd say we are indeed losing the capacity to dream big.

Prove me wrong. Please.
Lazy Web, I could use some advice.

For context, my thoughts keep circling around three interconnected ideas: OLTP for ACID-level data consistency guarantees, OLAP tied to this OLTP for batch-mode analytics, and "enterprise-friendly decentralization" where one can either leverage a swarm of publicly available nodes for a nominal fee or dedicate several instances in their own data center to these tasks.

OLTP: Picture Firebase, DynamoDB, or even Spanner running in a decentralized environment. Zero configuration, tight latency guarantees, and through-the-roof availability and consistency guarantees. The cost per request is in micro-cents. Additionally, the system is DDoS-resistant by design, as each request must be signed in a relatively costly way, making attacks prohibitively expensive.
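
One possible reading of "signed in a relatively costly way" is a hashcash-style proof of work attached to each request. This is purely an assumption for the sake of illustration, not a description of any existing system; the function names, the request format, and the difficulty value are all hypothetical.

```python
import hashlib


def leading_zero_bits(digest: bytes) -> int:
    """Count how many leading bits of the digest are zero."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def stamp(request: bytes, difficulty_bits: int) -> int:
    """Client side: brute-force a nonce, which is costly by design."""
    nonce = 0
    while True:
        digest = hashlib.sha256(request + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce
        nonce += 1


def verify(request: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Server side: a single hash, which is cheap."""
    digest = hashlib.sha256(request + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty_bits


request = b"GET /users/42"   # hypothetical request payload
nonce = stamp(request, difficulty_bits=12)
print(verify(request, nonce, 12))
```

The asymmetry is the design choice: producing a stamp takes thousands of hashes on average at this difficulty, checking it takes one, so flooding the system costs the attacker far more than it costs the nodes.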

OLAP: Now, imagine adding Databricks, Datadog, and Snowflake on top. The same principles apply: zero configuration and pay-as-you-go. An added benefit is the built-in pub/sub bus between the OLTP and OLAP layers, which works out of the box.

Enterprise-friendly decentralization, or what I prefer to call semi-decentralization, applies this idea to small-to-medium businesses. Once you need to tighten access control to some of your data, you can spin up a local cluster and run everything on it. Although the system is designed for decentralization first, it also adheres to data segregation and residency regulations out of the box, if and when needed. "Taking your data private" would also likely be more cost-effective as the business scales. You could even lend your unused storage, compute, and network resources to others during off-peak hours, to support a cause like cancer research or simply to improve your bottom line.

I'll be very vague for the sake of discretion, but there's a team we are considering joining forces with. While their product's long-term vision aligns with mine, their initial market entry point focuses more on data storage and serving rather than the OLTP+OLAP combo as I envision it. They also have over five years of market presence.

Simply put, while my vision revolves around Firebase + Databricks + Snowflake, they are starting with some S3 + Cloudflare, eyeing some Huggingface moving forward.

Lazy web, I'd appreciate your help in evaluating this opportunity. Specifically, I have three questions:

1) Is the "decentralized storage + CDN" problem challenging enough to attack?
2) Is the market large for such a decentralized CDN product?
3) How closely related is the S3 + Cloudflare problem to the Firebase + DynamoDB one, when thinking about future expansion?

I have my own thoughts and answers to these questions but would prefer not to share them yet to avoid biasing your responses. (Feel free to DM me privately if you're curious.)

Thanks in advance!
Pieter Levels wrote a deep tweet. And I am seconding him, plus:

This is extremely well put.

I'd add a path to European citizenship.

A "Blue EU Passport", which lets one live and work in any EU country and be subject to top-level EU taxes, but without access to EU states' healthcare and pensions, should be granted after a year of working in the EU.


https://x.com/UniqueDima/status/1833584882871828540
In case it's news to you: electric cars in the US prior to 1920.

https://chatgpt.com/share/c3d0ab93-f0ed-498c-bce3-d147605f167a

* In 1912, there were approximately 30,000 electric cars on U.S. roads.
* Top Speed: Around 15-20 mph (24-32 km/h).
* Range: Typically 30-50 miles (48-80 km) per charge, depending on the model and battery.

I feel the urge to spread this data, since far too many people these days believe we've "just started trying out electric vehicles" a couple years ago. This story is 100+ years old, and, compared to gasoline vehicles during that era, electric cars were not really inferior in their characteristics.
There's something I'm remarkably proud of, both seriously and as a joke. And last night I realized I am not proud of it enough 😊 so I need to write about it!

TL;DR: I take pride and responsibility for coining the term "it's just a hash map" in our (extended) team.

Not sure what the right verb is. I don't think I've trained or mentored/coached anyone in particular. Some people called my approach Socratic, and that's also music to my ears. Realistically though, we've just worked on something big together.

The trend is super clear though.

When I was joining the team, there was often talk about caches, durable vs. non-durable caches, partitioning/sharding, TTL, various 3rd party services and/or libraries to implement these ideas, you name it.

The first year I was just gently explaining my thought process. Because stuff often fits one machine, especially when LONG_AND_USELESS_ENUM_VALUES are bitmasks. A pub-sub bus enables keeping the cache up to date as long as the contract with the publisher is carefully spec'ed out. A beefy machine with enough RAM is quite cheap these days. And implementing some LRU eviction logic is an exercise that should take no more than half a day for a senior++ developer; maybe 1.5 days with tests, of course, but no more than a week in total including fighting with Kubernetes to stress-test it for real.
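
For a sense of scale, here is a minimal sketch of LRU eviction on top of a hash map, using Python's OrderedDict. The class name and capacity are made up for illustration, and a production version would still need the locking, metrics, and stress tests mentioned above.

```python
from collections import OrderedDict


class LRUCache:
    """A hash map with least-recently-used eviction."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key, default=None):
        if key not in self.items:
            return default
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" is now fresher than "b"
cache.put("c", 3)    # over capacity, so "b" gets evicted
print(list(cache.items))
```

It really is just a hash map plus an ordering of keys by recency.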

After about nine months the tide started to turn a bit. Instead of solving everything with ElastiCache or DAX or what not, we began experimenting with Redis+Lua to prove certain points. People began to buy into this. Not necessarily Erlang/Rust-level buy-in, but the jokes about "Mongo is Web Scale" faded away.

And when I was leaving the team after 2+ years, during the farewell chat, we all agreed that "it's just a hash map!" is a signature phrase in the team that would outlive me.

Seriously. A lot of things are just hash maps. Most caches do not have to be durable. Don't worry about the cold start. Heck, don't even worry about consistency: you'll be paged for high replication lag regardless, and if things already went wrong it's the speed of recovery that matters most. Just design the system that does its job and recovers quickly. If this system is three ~96GB RAM boxes that store the full copy of the working set of data each — that very often is the best solution.

I wish for each and every senior++ engineer to have worked both in teams where the above is the default mindset and in teams where the above is worth evangelizing and fighting for.