Not boring, and a bit of a condescending prick
Semi-digested observations about our world, right after they are phrased well enough in my head to be shared more broadly.
So, in a company I'm helping with the design, we couldn't come up with a memorable name for the core component of our service.

Partly out of desperation, I looked into Sci-Fi for inspiration, and suggested we call it Arrakis. Somehow, the folks liked it.

That was late last week. Now, over the weekend, I'm thinking that the piece that grabs data from all other places and feeds it into our core should be called a [Spice] Harvester.
Regulation-heavy folks, I have a question about how the modern-day data residency rules affect core services such as authorization.

Say, your service is the one that ultimately decides if a user can only view a Google Doc, leave comments there, or edit it directly.

Now, you're a European user, subject to GDPR, of course, and you are finding yourself in the US.

Surely it is not the case that every single time you attempt to change something in this Google Doc, a request goes out to the European server (where "your data" "resides") so that that European server may confirm you are indeed granted the permission?

~ ~ ~

My limited understanding of data residency regulations is that they only affect the storage of PII (personally identifiable information). In other words:

• If you need a cache of permissions for a particular user token, for example, you are absolutely allowed to keep it anywhere, as long as that user token does not contain this user's PII, right?

• If your data is ephemeral (you just store it, in memory, on your own server), and you never respond back with it, you are absolutely allowed to serve the requests that use this data, right?
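To make the first bullet concrete, here is a minimal sketch of what such a cache could look like — all names are made up, and the token is assumed to be an opaque, PII-free identifier:

```typescript
// Sketch: a permissions cache keyed by an opaque (PII-free) user token.
// The TTL bounds how stale a cached authorization decision may become.
type Permissions = { canView: boolean; canComment: boolean; canEdit: boolean };

class PermissionCache {
  private entries = new Map<string, { perms: Permissions; expiresAt: number }>();

  constructor(private ttlMs: number) {}

  // Returns cached permissions, or undefined if absent or expired.
  get(token: string, now: number = Date.now()): Permissions | undefined {
    const entry = this.entries.get(token);
    if (!entry || entry.expiresAt <= now) {
      this.entries.delete(token);
      return undefined;
    }
    return entry.perms;
  }

  put(token: string, perms: Permissions, now: number = Date.now()): void {
    this.entries.set(token, { perms, expiresAt: now + this.ttlMs });
  }
}
```

Whether such a cache may legally live outside the EU is exactly my question — the sketch only illustrates that no PII needs to be stored in order to serve from it.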

~ ~ ~

For example, your legal name is PII, so it shouldn't be stored anywhere outside the servers in Europe. Still, if you're opening your Google account, and then its settings, from America, it's an American "frontend server" that is rendering this page, right?

Thus, once you, as the user, have requested this data, it is allowed, in some form, to pass through a Google data center in America. There might be constraints on not storing this data in any caches, or any web server logs, for instance, or constraints on keeping it encrypted while in transit, but it's not that this data can't cross the country border. Or am I wrong?

~ ~ ~

On the other hand, the tricky case is age. Say, you are a European user, and you have consciously consented to share your age with YouTube in Europe. Now you are in the US, and are about to watch some videos.

Say, the publisher of a video can make it restricted to an arbitrary age. I am making it up, just bear with me. In order to answer the question whether you can watch a certain video, the age restriction on the very video would have to be compared to your age.

Thus, a "malicious" actor can create a hundred videos, 1+, 2+, 3+, ..., 99+, 100+, and then issue requests of the kind "can this user watch this video?" The authorization service would never disclose the age, but, as all computer scientists understand, it would take just seven such requests (a binary search) to learn this user's age down to a year.
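The seven-request arithmetic is just a binary search over the hundred thresholds. A sketch, where `canWatch` stands in for the hypothetical "can this user watch this video?" authorization call:

```typescript
// Hypothetical oracle: "can this user watch a video restricted to `minAge`+?"
// Returns true iff the user's age >= minAge.
type CanWatch = (minAge: number) => boolean;

// Recover the user's exact age, 1..100, in at most ceil(log2(100)) = 7 queries.
function probeAge(canWatch: CanWatch): number {
  let lo = 1;
  let hi = 100; // the attacker's hundred videos: 1+, 2+, ..., 100+
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2);
    if (canWatch(mid)) {
      lo = mid; // the user is at least `mid` years old
    } else {
      hi = mid - 1; // the user is younger than `mid`
    }
  }
  return lo;
}
```

The authorization service leaks one bit per answer, and seven bits are enough to tell apart a hundred possible ages.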

~ ~ ~

What does the regulation say here?

If a user from Europe is opening YouTube in the US, on which side of the Atlantic should the very check happen?

For how long can this result be cached?

And is the service liable if, say, a change in important data did not propagate to the other side of the ocean quickly enough, causing the user to mistakenly be allowed to access, or be prevented from accessing, certain content?
I recall how many years ago Microsoft was made to divorce Internet Explorer from, well, the components of Windows that are integral to the OS. And that was good.

Now -- yes, I do reboot into Windows every now and then -- when I click on the text to expand what that beautiful login screen picture is about ...

... I get to choose between "Edge" and "Edge", with the third option being "Choose from Microsoft Store". And this choice begins with the "search query" of, drum rolls, microsoft-edge.

Just so that we are clear: I have Brave, Firefox, and Chrome installed. And I need to open a URL. But Windows would not show me that URL unless I "agree" to use Edge to open it. For real, just a "copy URL" option would work for me -- but it is an "essential OS feature", apparently, to tell me in which exact part of the Indian Ocean that photo was taken.

I don't even know what to make of it; I can't even convince myself to dislike Microsoft due to this. It's the rules of the game that companies play by these days. Somehow we're back to square one, where these rules are nowhere near being pro Us, The People, and they are, again, almost entirely pro corporations.

Seriously, in the US, the only "recent" example of an IT regulation I can think of that made life better for an average Joe or Jane is that one can keep their phone number when changing carriers. That's about it.

Back to Microsoft, I don't use macOS, but I do use an iPhone. When that iPhone wants to install its update, I trust Apple to do it. When my Windows 10 "recommends" upgrading to Windows 11, I find myself automatically looking for an option to never be offered it again.

And here we are, with the same old company, employing the same old tricks, at the top of the world capitalization-wise. It's a really well-run company. It cares about its employees' diversity and inclusion. It cares about climate. It cares about privacy. The only thing that seems to be missing is caring about the simple things that ruin the experience for some end users.

I'm going to go all in though and postulate that it's the right strategy for the company. Because the people who truly demand these features from an OS settle for Linux or for macOS, and are not the target audience of Microsoft Windows. Personally, I wouldn't boot into Windows at all if my Razer camera worked as well on Ubuntu as it does on Windows -- 60 FPS Full HD with proper lighting correction and background blur.
Our next #SystemDesignMeetup episode should be a really good one, especially if you are into stream crunching and data processing: comparing #Kafka and #RabbitMQ.

This actually is the first half of what I have to share about the Distributed Task Queue problem. We just decided to split this into two, as there's quite some content on the topic that is not specific to the very problem, but rather talks about message queues used out there in prod.
I just merged in a pull request from dependabot that bumps the versions of some npm modules. And then five more emerged, as if this bot were a hydra. And I've merged in all of them.

This, most likely, is a good thing in itself. But I can smell a lot of possible bad things coming out of this. I mean, this approach is not security through obscurity, but it creates a lot of room for it in the future.

Just imagine developers all over the world learning to blindly click "merge" on three-line pull requests to their package-lock.json files. What if a package is hijacked, and updating it results in running malicious code on your production box, or on your home+work laptop? What if a zero-day vulnerability, in node or in npm, or elsewhere, is found, and this kind of automatic pull request is [ab]used to trigger an avalanche of people updating, by accident or through second-order malicious intent?

Finally, what percentage of people would blindly accept a +3 lines change from a user that is not @dependabot, but some barely noticeable alteration of the name? I'd bet at least a few percent. And how many bad things can be squeezed into three lines? An unbounded number of them. And what are a few percent of developers? 10K+ repositories? 100K+?
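One cheap mitigation I can imagine — a sketch, not a real tool: before blindly merging, machine-check that a lockfile change only bumps versions, and never adds or removes packages. The field names below are npm's v2/v3 lockfile keys; treat this as a heuristic, not a security guarantee:

```typescript
// Sketch: verify that a package-lock.json change is "only version bumps":
// the same set of packages, with only version/resolved/integrity differing.
const BUMP_FIELDS = new Set(["version", "resolved", "integrity"]);

function onlyVersionBumps(before: any, after: any): boolean {
  const a = before.packages ?? {};
  const b = after.packages ?? {};
  const names = new Set([...Object.keys(a), ...Object.keys(b)]);
  for (const name of names) {
    if (!(name in a) || !(name in b)) return false; // package added or removed
    const fields = new Set([...Object.keys(a[name]), ...Object.keys(b[name])]);
    for (const field of fields) {
      const changed =
        JSON.stringify(a[name][field]) !== JSON.stringify(b[name][field]);
      if (changed && !BUMP_FIELDS.has(field)) return false; // suspicious change
    }
  }
  return true;
}
```

It would not catch a hijacked tarball behind an honest-looking version bump, of course, but it would at least flag the pull requests that smuggle in more than a bump.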

If anything, this human factor of potential (and real) over-reliance on semi-automated "improvers" of our code is the strongest argument I have today in favor of using several repositories. Monorepo FTW is my motto in many situations, but, at least from a purely damage control perspective, multiple repos help keep some boundaries tight.

Yesterday I could argue that if some TypeScript code and some C++ code are sharing the same data schema, then it is quite beneficial to keep this code, and the relevant schema, in the same repository.

Today I would think twice before allowing package[-lock].json anywhere in a repository where the main language is C++, and where various pre-submit scripts and post-merge actions do build and run this code on the servers that I care for security-wise.

Or maybe this post just signifies that I've bought into the religion of GitHub actions at scale. Yes, they are quite wasteful, true that. Yes, they often introduce extra complexity. But in our unstable world they do seem to be the safest technique of loose coupling in environments where zero trust is quickly becoming the norm, especially when it comes to popular, widely depended upon, and somewhat fragile open source projects.
For a while, my understanding of what WebAssembly (wasm) is good for fit roughly the following pattern:

• There is an isolated, mostly algorithmic, task.
• This task is best solved on the client side, in the browser.
• One could think of writing an npm module that solves this task.
• But an implementation in another language already exists, on the backend (BE).
• So just take this implementation, wrap it into wasm, "ship" the very function to the frontend (FE), and you're done.

One big benefit of this approach is that the FE itself becomes fully decoupled from the implementation of this magic function. There is no dependency on the FE developer, on the build process of the FE repo, or the like. Local iterations are fast and local, just as I like them.

Plus, the tests for the existing code effectively test the wasm-transpiled code as well. In my case, when the code is C++, tested with googletest, and the transpiler is Emscripten (emsdk), everything just works out of the box. I only needed to add a build target so that, among other things, my freshly-built BE also exposes the very transpiled wasm function.

On the team dynamics level, it is awesome that the FE people can focus on making the product look & feel great to the user, while the BE people can do the data heavy lifting. And this way to use wasm does the job indeed.

But the above is Level One of wasm. I'm enlightened now, and can fast track you through the full journey.

Level Two: Ship the "demo" of your SPA with 100% of its BE code served by browser-run wasm.

Chances are, your BE is already implemented in a language that can be transpiled into wasm (node.js, Java, C++, Go, all fit). So, it's a one-weekend hackathon for your BE team to make sure 100% of the necessary code is shipped as wasm, and then another one-weekend hackathon for the FE team to integrate it.

The win of this approach might look minor, but, trust me, it is oh so important for a great demo. It is that your web app becomes extremely, insanely responsive.

Chances are, especially if you are a startup, that you have taken quite a few shortcuts here and there. Extra network round-trips are being made, and/or some operations that could be cleverly combined into synchronous updates (ref. BFF) are now asynchronous. Well, with wasm — assuming the amount of data you are crunching is small, of course — consider these problems non-existent. If all you need in order to serve the response from the BE is a few quick for-loops, trust me, your wasm engine will likely do it in single-digit milliseconds; indistinguishable from instant for most Web App users these days.

Level Three: Automagically add "offline mode" to any SPA ("single-page [web] app"), with no code change. It is surprisingly easy once this very SPA already contains an initializer for the wasm wrapper. Once it is there, literally all one needs to do to make an SPA offline-friendly is:

• Build the whole BE, including the data it needs for the demo, into a single wasm bundle,
• Add the logic to intercept XMLHttpRequest calls during the very SPA initialization of the wasm part, and then
• Whenever the FE code is calling the API, don't make an outgoing HTTP call, but call the now-local wasm function instead.
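A minimal sketch of that interception step — using fetch instead of XMLHttpRequest for brevity, and assuming the wasm bundle exposes a synchronous `handleRequest` function (both names are made up):

```typescript
// Sketch: route the SPA's own API calls to a browser-local wasm "backend".
// `handleRequest` is an assumed export of the wasm bundle.
type WasmBackend = { handleRequest: (path: string, body: string) => string };

function installOfflineMode(wasm: WasmBackend, apiPrefix: string = "/api/"): void {
  const realFetch = globalThis.fetch;
  globalThis.fetch = (async (input: any, init?: any): Promise<Response> => {
    const url = typeof input === "string" ? input : input.url;
    if (url.startsWith(apiPrefix)) {
      // Serve the request locally, from the wasm bundle — no network involved.
      const body = wasm.handleRequest(url, String(init?.body ?? ""));
      return new Response(body, { status: 200 });
    }
    return realFetch(input, init); // everything else still goes out
  }) as typeof fetch;
}
```

A real implementation would also need to patch XMLHttpRequest for older FE code, and to propagate status codes and headers out of the wasm handler; the sketch only shows the shape of the trick.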

I have not done this myself; and I would probably not go this far with automating the process. But the idea is just too brilliant for me to not share it.

Who knows, maybe soon we would learn to launch full Docker containers via wasm. There is a fully functioning wasm Linux distro, after all, so the path is paved. And then someone could ship an npm module that would just take a Docker container, wrap it into wasm, and then surgically intercept the calls to the BE to now be served within the browser, probably running this Docker container in a separate Web Worker.

This someone might be me, but I've got more things to build now. So, the idea is up for grabs — feel free to!
The whole story is some 5x longer, but today I had the most fascinating conversation with a road policeman of my life. After about fifteen minutes of communicating via Google Translate, it ended the following way:

— You made a left turn where the left turn is prohibited.
— Sir, there were no prohibiting signs along the route I took.
— There was a sign. You made an illegal turn.
— There's no sign there! I'm always careful, Swiss driving school. Why would I violate the rules if there was a sign?
— There was a sign.
— Nah, there was no sign at all. Wanna go together and check?
— Hmm ... how about this: we go there, and IF THERE IS A SIGN YOU PAY DOUBLE THE AMOUNT?

There was no sign. Also, I didn't like how this guy handled his machine gun; I had to press it against his body with my own knee, while sitting behind him on his ATV, so that it wouldn't point at my calf.
I'm looking into setting up open source reproducible blueprints / performance tests. So that they can be run on hardware paid for by the company, but the code remains open & reusable. Is this a crazy idea?
I was just interviewing a staff level candidate, and, during the reverse interview, they asked me what levels there are in the company for a staff engineer to aspire to.

While clarifying the question, it became clear that what they were looking to explore included the process of standardizing decisions across the company. About microservices, patterns, languages, storage, etc.

The current job title of this candidate is a senior engineer. I happened to know this before the interview because they looked me up on LinkedIn yesterday.

So I gave my most sincere answer:

• As a senior person, you are encouraged to clean up the mess that is affecting your and your team's performance.

• As a staff person, you are expected to prioritize the mess, and even, at times, embrace the mess.

In a way, I was brutally honest, as I like to be. Because the observation I shared holds true in all the companies I have been part of. But it's somehow not something interviewers are proud of, and hence not something they would generally share when asked.

Then I turned the question around. I shared the big initiatives I am aware of and/or am part of, such as microservice interface standardization. I explained why grassroots, bottom-up initiatives rarely work when it comes to company-wide standards. And I reassured the candidate that, while we are working on reducing the mess where it affects our daily duties negatively, they had better be fully prepared to experience a certain degree of messiness should they join.

And that they will not be expected to clean up every single mess they encounter. On the contrary, from a staff level person, the company would expect them to exercise their judgement on which messes are best to be taken care of, and which are not worthy of it strategically.

The candidate was happy. So am I!
Frontend and full stack folks, how do you deal with those huge diffs in package-lock.json and with all the */__snapshots__/* stuff?

I'm aware that package-lock.json is recommended to be put under source control. But at least in a separate commit, right?

Maybe there exists a GitHub setting to exclude certain files (or certain paths) from diffs? And from pull request line-count deltas? Or maybe put two pairs of numbers there: +added, -removed, +added_boilerplate, -removed_boilerplate?
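To the best of my knowledge, one such knob does exist on GitHub itself: paths marked as generated via `.gitattributes` get their diffs collapsed by default in pull requests, and, as far as I understand, are excluded from the diff statistics. A sketch (the exact paths are just examples):

```
# .gitattributes — mark noisy, machine-written files as generated
package-lock.json       linguist-generated=true
**/__snapshots__/**     linguist-generated=true
```

Worth verifying against the current GitHub docs before relying on it, but it matches what I'd want from a `.gitignore-boilerplate`.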

Maybe I could put some .gitignore-boilerplate file and the problem would be gone?

Maybe there's a browser extension that would emulate this behavior? (Maybe I should build one or invest in a team that is building one?)

Maybe there's something a frontend developer should know that reduces or eliminates the bloat described above?
So I was doing a "sales call" interview today.

That's the first conversation the candidate has after speaking with the recruiter. That is, the first technical conversation. I am not even expected to provide feedback, and I share this with the candidate first thing, to keep them relaxed.

The objective of the sales call is to make sure the candidate is sufficiently interested in the role, and to gather some info on what teams and roles they could be a good fit for. It is not completely impossible to "fail" this "interview", but, unless the candidate truly wants to bail out, it's hard to imagine a conversation after which the company would want to pass on them.

After a routine past experience chit-chat, this candidate asks me up front: What do you do? And then: What would I do if I join?

Well, he likes designing systems. And his resume has all the relevant bits and pieces. And he has signed the NDA.

So, you're asking what I do? "Hold my beer", I say figuratively, while closing all confidential tabs. I then zoomed in on that Miro board sufficiently to hide all the people and project names, and screen-shared our design diagram. "Look, here's what we do. That's about what you're looking for, right?"

Guess what happened next?

He criticizes our design!

The candidate, on a sales call that is not even an interview, is telling me that the design we've spent months and months on is suboptimal, and that he knows how to make it better.

Now pause reading for a moment and note your emotion. Those of you who know me well can already predict what would happen next. For the rest of you: enjoy responsibly.

Clearly, I bit the bullet. I agreed with all his comments. Provided a few extra real-life constraints. And one more time, and then one more time. Until he agreed it's not that clear-cut, and that he would probably have made the same decisions as we did, had he had all the same context.

I thanked him sincerely for the conversation, and assured him I've gathered enough signal about what the best role for him in our company looks like. I've told him I'm putting my feedback straight in, and that the recruiter would be back in touch shortly.

Then I took a deep breath and wrote my feedback. Plus the recruiter's notes. Along the lines of:

• This is a great candidate.
• We need more people like him.
• I did not interview him technically, but would bet he's strong.
• And we need to do our best to make sure that, provided he passes the tech screen, he receives a great offer and joins us.

In the industry, we need more people who ignore all the rules & conventions, and just open up with what they believe is the point of maximum impact. I love working with them. I hope to be one of them. And, in my experience, this is paying off greatly, both professionally and personally.
Incredible that we still live in such a world.

There's data of major importance. It is clearly affecting many people's short-term life choices. This data is about high-level decisions made inside public companies. A large number of people within each respective company know exactly what those decisions are.

Moreover, those decisions can mostly be communicated via a yes/no answer to a trivial question: Is there a hiring freeze?

To add to this, we have an enormous amount of information shared between us all. Take Blind, for example, where everything leaks regardless.

And yet, we have to rely on error-prone and noisy mediums, such as — surprise! — good old polls, to have a glimpse into what's really going on. Seriously, "Don't look up", we have this covered.

To this day, I don't understand how public companies are under no obligation to disclose what roles they are hiring for, and the state of their funnel, on any given day. This reporting would do no harm to any company if done by everyone. It's like financial reports, only even less transparent. (My second question, by the way, is what recruiters do during hiring freezes.)

Kudos to Aline for being a ray of light in this conversation:
I'm wholly unqualified to be a game developer. I mean, this many glitches to use and abuse? Even the mere possibility of one of them emerging would keep me up late at night trying to find a clean solution that makes the glitch impossible, or, at least, unattainable through regular gameplay mechanics.

(Inspired by jetlag, 5am, and watching the "Brood Lords range 20" video, where Korean pros finally found a way to abuse attacking their own broodlings to keep their Brood Lords unreachable for Thors. However, a friend points out, this trick would only be handy within a narrow opportunity window, while your opponent does not yet have a chance to transition into air; or, much like in the very replay, if it's already lategame with the Zerg's opponent staying mostly on land.)