Any OS / kernel experts here?
Could you fill me in on
io_uring and [e]BPF when it comes to application-level networking?
On the one hand, I'm inclined to say that gRPC/protobuf would be notoriously slow here, while, if trivial batching + deserialization happens at the kernel level, we'll get far higher throughput, together with lower latency.
On the other hand, I'd assume problems that are "RPC-bound" do not use gRPC these days already, or would that be a wrong assumption?
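To make the batching point concrete, here is a toy sketch (my own framing, not io_uring or any real kernel facility): pack many small length-prefixed messages into one contiguous buffer, so a single submission replaces N per-message syscalls, and the receiver deserializes the whole batch in one pass.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Frame many small messages into one contiguous buffer, so that one
// write()/send() (or one ring submission) replaces N per-message syscalls.
// Each message is prefixed with a 4-byte host-order length; a real wire
// format would pin endianness.
std::vector<uint8_t> BatchMessages(const std::vector<std::string>& msgs) {
  std::vector<uint8_t> buf;
  for (const auto& m : msgs) {
    const uint32_t len = static_cast<uint32_t>(m.size());
    const uint8_t* p = reinterpret_cast<const uint8_t*>(&len);
    buf.insert(buf.end(), p, p + 4);
    buf.insert(buf.end(), m.begin(), m.end());
  }
  return buf;
}

// The receiving side walks the buffer once, recovering every message.
std::vector<std::string> UnbatchMessages(const std::vector<uint8_t>& buf) {
  std::vector<std::string> msgs;
  size_t i = 0;
  while (i + 4 <= buf.size()) {
    uint32_t len;
    std::memcpy(&len, buf.data() + i, 4);
    i += 4;
    msgs.emplace_back(buf.begin() + i, buf.begin() + i + len);
    i += len;
  }
  return msgs;
}
```

The per-call overhead (syscall entry, locking, wakeups) is amortized over the whole batch, which is the core of the throughput argument.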
As someone who, to a decent degree, identifies as a [high-performance] infrastructure engineer, in addition to being a [realtime] data architect, I'm very curious what the modern-day bleeding-edge problems in the field are.
Also, maybe there are use cases for the above in the space of blockchains?
👍1
Benchmarking Production Systems.
A new @SysDesignMeetup episode is out.
There is much more to load testing than just figuring out how many requests a service can handle per second.
Video: https://tinyurl.com/sdm-benchmarking
Slides: https://tinyurl.com/sdm-benchmarking-slides
https://www.youtube.com/c/dimakorolev
https://twitter.com/sysdesignmeetup
Also @BoreMeNo
👍4
Our guide is out in full: https://interviewing.io/guides/system-design-interview/part-four
Part four is probably more advanced than the others (i.e., more for seniors++ than for mid-levels), but it's a good piece overall, imho; likely the best prep material available online.
Enjoy responsibly :-)
👍12
Riding the wave further, the SysDesign Meetup event, dedicated to interviewing.io's SysDesign guide is happening soon!
Pls join #iio-sd-guide on our Slack if you're interested, and please ask your questions in Slido, regardless of whether you plan to be there live. And here's a Google Calendar link.
Very much looking forward to it!
San Francisco, USA Fri, 7 Apr 2023 at 11:00 PDT
New York, USA Fri, 7 Apr 2023 at 14:00 EDT
London, United Kingdom Fri, 7 Apr 2023 at 19:00 BST
Amsterdam, Netherlands Fri, 7 Apr 2023 at 20:00 CEST
Tbilisi, Georgia Fri, 7 Apr 2023 at 22:00 GET
Kuala Lumpur, Malaysia Sat, 8 Apr 2023 at 02:00 MYT
👍10
A semi-crazy idea: a decentralized mesh blockchain run on the hardware of pet lovers, powered by their pets.
Say you have a hamster, or a few, to officially indulge the use of the pronoun of the year. You feed them well and treat them well; please stop reading here if that's not the case.
Now, you'd love your hamsters to play and you get them a running wheel.
Here's the thing: at this point, modern technology can run a proper blockchain that requires less computational power than your hamster generates.
First of all, this already is a wonderful plot, because we're saving trees and caring for the environment overall.
Second, there are tons of cases for blockchain that are general-purpose, and yet widely under-utilized.
My favorite example would be a blockchain Twitter. Everybody can post a short message, and everybody else can read this message. This short message can be encrypted, so that only those with the key can decipher it. And the original author can publish the key later, granting the public access to whatever they have written before, with accurate timestamps in the past.
Imagine a journalist reporting from a dangerous area. The area is so dangerous that they can be lost in action there, while their material may well be of supreme importance. So, it would be nice to be able to report right from that area, so that only the select few people could read their report live, but, later on, these notes could be made public.
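A toy commit-then-reveal sketch of the above (all names are mine, and a real system would use authenticated encryption, not a repeating-key XOR): the ciphertext sits on the immutable, timestamped log from day one, and publishing the key later makes it readable to everyone.

```cpp
#include <string>

// Toy "publish now, reveal the key later" cipher. XOR is symmetric, so the
// same function both encrypts and decrypts. This is a demonstration of the
// flow only: post XorWithKey(report, key) to the immutable log, and release
// `key` whenever it is safe to do so.
std::string XorWithKey(const std::string& text, const std::string& key) {
  std::string out = text;
  for (size_t i = 0; i < out.size(); ++i) {
    out[i] = static_cast<char>(out[i] ^ key[i % key.size()]);
  }
  return out;
}
```

The timestamps are trustworthy precisely because the ciphertext was on the chain before the key was revealed; nobody can backdate a post.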
And all of this may well happen over a mesh network powered by hamster lovers in that area and across the globe. With no external power supply, and with no global Internet available.
And everyone can use this Twitter. For free. The "price" of posting one Tweet is measured in the computational power you have contributed to running this mesh network, and its cost is roughly the amount of energy a few hamsters generate by spinning their wheel for several minutes.
Ah, and this Twitter would be immutable. So nobody, including Elon Musk himself, would be able to delete content from there.
The very idea is infeasible, of course. For far too many reasons, and for most intents and purposes. But it's a cute idea regardless, isn't it?
Sometimes, if I catch myself having sad thoughts about the state of IT, I remind myself that for literally every historical period of human existence I would be worse off.
I love employing critical thinking to solve problems, and I was born into an ordinary family. With these inputs, thus, looking around, nothing to complain about, really.
Then the thought goes on, to the point where I have to admit that I do believe, and almost assume, that in the future life would be even better. So that if a Dima born in 2100 would look back to 2023, he would be quite upset about what the 2023 Dima has to put up with.
This is, of course, normal; that is what is called progress, and this, too, is nothing to be unhappy about.
When I look into the past I feel terrible just thinking about how much crap I would have to put up with in order to just not be burned alive, regardless of how much evidence I could present to support my views.
The next thought is to wonder whether this trend can, and/or will, at a certain point, stop or reverse.
And here it gets really sad, because it seriously looks like it may well be true that the 90s and the 2000s were a better time for us, first-principled technologists.
Of course, we did not have powerful mobile devices / mobile internet, and the concept of working from home was largely non-existent. And, of course, we had corporate structures that are a lot more "traditional", in a sense that acquiring real-life professional skills and growing one's career was substantially more difficult back then than it is today.
But disrupting it was also much easier. Or at least so it seems.
Now, looking back at the risk of being burned alive, this facet of problems was virtually non-existent in the 90s and the 2000s, but today -- just look around. There's no shortage of topics one is supposed to either not have the "wrong" opinion on, or explicitly not touch in any conversation, unless with close friends.
And this trend, that has been brewing for some fifteen years by now, begins to touch on the topics that have direct impact on what the companies declare to be their values. To the point that major executives literally admit that they have to employ Orwellian double-think in order to stay sane while being effective at their jobs.
Just to make it clear: I'm fine, I got used to this, I can see the value of PC when it comes to merely extending the idea of politeness so that a bunch of people of very diverse sets of backgrounds and values can be effective at getting things done together.
But looking at companies that claimed to be "a big family" with their employees just a few years ago, and are now too shy to pre-announce upcoming layoffs internally, effectively keeping everyone in the guessing mode day after day ... that's just wrong. And this openly contradicts the declared values of most of these companies, and yet this contradiction is supposed to be something that all of us just turn a blind eye to.
Kind of surreal to watch, if you ask me.
👍11
A lot of ideas that float around with respect to AGI Alignment seem to converge to a rather trivial dichotomy:
Do you want to help people better achieve their goals / live happier lives as they themselves would like to? (1)
Or:
Do you believe people, en masse, are not good enough at making decisions for themselves, so that, to a certain degree, some form of "positive propaganda" would be a push for the greater good? (2)
If you had asked me ~ten years ago, I would most definitely have argued that (2) is a recipe for disaster, while (1), although quite hard, is the way to go.
Looking at the world today though, I am far less convinced.
Because even if — even if! — we believe human beings are good at deciding for themselves, and even if — even if! — we believe in some form of a well-functioning democracy, so that the will of the majority of people is what drives the society forward ... I have major questions about the sustainability of such a model in the long run.
To put it bluntly: I am not positive I would definitively pick such a society, however much of a utopia it may sound like, over a different society, one more similar to the 1990s / 2000s USA.
Because I am legitimately afraid such a society, however much of a utopia it might be, would not legitimately enable me to execute on all my desires. Despite the fact that, objectively, none of my desires can abuse or in any way harm the other citizens / inhabitants of such a seemingly perfect society.
Not sure what to make of the above thought. But to me it does, at the very least, serve as a good reminder that we are nowhere near being able to postulate the AGI Alignment problem well enough. Because, with a perfect AI or with no AI, we humans are having major trouble deciding for ourselves what our lives should be in the long run.
Go figure ¯\_(ツ)_/¯
👍3
Forwarded from SysDesign Meetup
Blockchain and Web3: just released this new episode!
• https://tinyurl.com/sdm-blockchain
• https://tinyurl.com/sdm-blockchain-slides
The history and the state of the art of blockchain and Web3 from first principles.
YouTube
Blockchain and Web3 :: SysDesign Meetup :: 2023-May-20
The Blockchain technology and the world of Web3 as seen through the lens of "conventional" SysDesign
Slides: https://tinyurl.com/sdm-blockchain-slides
The links to the amazing blockchain demo by Anders Brownworth:
• https://www.youtube.com/watch?v=_160oMzblY8…
https://medium.com/miro-engineering/how-miro-leverages-open-policy-agent-to-implement-authorization-as-a-service-763f08469e5 — Peter, our Staff Engineer, wrote an in-depth blog post about what we are working on at Miro these days.
👍4
I’ve used this argument a lot, but have never seen it employed seriously before. So, as I’m not likely to be a genius, I must have missed something major in tech and/or sci-fi, so do enlighten me pls.
The argument is: in the context of sustainable human colonies (underwater, the orbit, Mars, anywhere), why do we need large populations to retain large gene pools?
We can print the desired DNA molecules today, and we have plenty of DNA sequences scanned and stored. This data is virtually impossible to lose and/or to erase.
And we can control the DNA of an embryo. I.e. we can have a woman conceive and carry to birth a human child of a desired DNA.
Thus, even with no artificial wombs whatsoever, the human race as we know it today may well, in theory, proliferate in conditions as harsh as a spacecraft traveling for thousands of years with a very scarce food supply.
Because we literally only need a single-digit number of women to be alive at a time. Plus a hard drive with plenty of further DNAs to choose from. Plus that very technology and the means to operate it, which also only takes a single-digit number of humans, and they don’t even have to be women.
Not hundreds of thousands. Nowhere near this huge number.
This, or what am I missing?
🤔2
Forwarded from SysDesign Meetup
One of the ideas we are flirting with is to sketch out some “open source design” of a nontrivial architectural component.
If done well, not only could it serve as a reference document on how to prepare such designs, but also as a blueprint based on which we could implement the prototype.
I think there’s room for such a design out there today, from contracts and protocol specifications (HTTP JSON to gRPC to binary) all the way to components’ implementation and unit/integration/load testing.
As an example problem I keep coming back to either the Total Order Broadcast engine or a simple De-Duplicator.
The former can be viewed as an extension of a service mesh proxy, which inserts a sequence ID into every request, guaranteeing that each and every ID is unique.
The latter can be viewed as something similar, but less so on the transport level, focusing on respecting the logic of idempotency tokens instead.
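For a sense of scale, minimal sketches of both components (the interfaces are my assumption, not a spec): the sequencer hands out strictly increasing IDs, and the de-duplicator lets the first occurrence of an idempotency token through while swallowing replays.

```cpp
#include <atomic>
#include <cstdint>
#include <string>
#include <unordered_set>

// Total Order Broadcast core: a single source of strictly increasing
// sequence IDs, safe to call from multiple proxy threads.
class Sequencer {
 public:
  uint64_t NextID() { return next_.fetch_add(1); }

 private:
  std::atomic<uint64_t> next_{1};
};

// De-Duplicator core: returns true iff this idempotency token has not been
// seen before. A production version would persist and expire tokens.
class DeDuplicator {
 public:
  bool ShouldProcess(const std::string& idempotency_token) {
    return seen_.insert(idempotency_token).second;
  }

 private:
  std::unordered_set<std::string> seen_;
};
```

The interesting part of the "open design", of course, is everything around these ten lines: persistence, failover, contracts, and testing.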
Do you think this would be a good idea? Would you be interested in participating in parts of such “open designs” one way or another? Let me know pls.
👍2❤1
Continuing with the "crowdsourced open source design" idea, after a few review iterations, a set of principles on how we can develop and publish them is now public: https://github.com/SysDesignMeetup/sdm
I promise to not spam this channel much, but I want to get the word out more broadly: this thought has now graduated from a mere idea to something that we're executing upon.
My dream of some Architect's Corner born out of the @SysDesignMeetup and our community has taken a major step towards becoming reality!
👍2
Publishing Enriched Data Update Events
"When caching and 'poor man's replication' play well together."
I've put together quite a few thoughts, and would appreciate external wisdom on how to approach them from the larger eng org standpoint.
https://dimakorolev.substack.com/p/publishing-enriched-data-update-events
👍4
Shared some thoughts on LinkedIn about being more explicit in how the APIs are best to be structured.
My take:
• Vague names are indeed bad,
• Use type system to prevent mistakes,
• Make state machines explicit, and
• Keep data mutations as atomic as possible!
Vague names are indeed bad: It is always a good heuristic to make sure the readers of the code are not surprised by naming conventions. Writers too, but the code is read far more often than written.
Use type system to prevent mistakes: Let's talk about programmatically pushing code to a Git repository, or about programmatically starting Docker containers. There is a natural flow of events: commit, then push, for Git; create an image, then a container, then start it, for Docker. A good API will make it very clear to the reader at which stage of the process a particular entity is, and a good API will also make sure it is physically impossible to use it the wrong way. First create the commit, then call .push(), but only on the object returned by .commit(); the same logic applies to Docker containers and the order of their startup / teardown.
Make state machines explicit: The example in the previous paragraph illustrates a trivial logic of A -> B -> C. This approach can be generalized. For instance, if an object can be transformed into one of two incompatible states, with a different set of further legal actions on them, it would be best for these incompatible states to be of their own types. This way both the compiler/linter and the IDE would be quick to highlight what exactly is being performed, as well as to throw an error that what is being attempted is not possible in the first place. Static typing FTW, and this applies not only to general types, but also to dataflows and logical invariants.
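A minimal type-state sketch of the Git example above (a hypothetical API of my own; the point is that "push before commit" simply does not compile):

```cpp
#include <string>
#include <utility>
#include <vector>

// The only way to obtain a PushableCommit is via Repo::Commit(), so the
// compiler enforces the A -> B ordering: there is no Push() on Repo itself.
class PushableCommit {
 public:
  explicit PushableCommit(std::string message) : message_(std::move(message)) {}
  // Pretend this performs the network push; returns what was pushed.
  const std::string& Push() const { return message_; }

 private:
  std::string message_;
};

class Repo {
 public:
  PushableCommit Commit(const std::string& message) {
    log_.push_back(message);
    return PushableCommit(message);
  }

 private:
  std::vector<std::string> log_;
};
```

`Repo{}.Push()` is not merely a runtime error; it is not a program. That is the whole point of making the state machine explicit in the types.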
Keep data mutations as atomic as possible! This reminds me of the difference between `adduser` and `useradd`. Admins know it by heart, but for mere mortals it's nothing but an extra source of confusion. If a student can be present or absent in some group or class or whatnot, .Add() and .Remove() are perfectly fine names to my taste.
Personally, if I see .Enroll() instead of .Add(), I will right away begin thinking of what the omitted side effects of this operation are. Maybe some other data structure (or some other database!) now has a record that in order to graduate this student must pass this test or take this exam? Or maybe this student now has extra prerequisites to complete before being eligible to begin the next year? My experience, as the reader of the code, will be suboptimal, because a seemingly better word makes me wonder if it means something other than what I think it may mean.
Sure, .Enroll() is, at times, better than .Add(). But an even better way, IMHO, would be to pack the act of enrollment into a single CQRS command, or into a single domain event, if you wish.
The code would then read — bear in mind, I'm a C++ guy — along the lines of:
university_db.ApplyMutation(
    StudentEnrollmentRequest(some_student)
        .IntoClass(some_class)
        .RequireExam(true)
        .NoPrerequisites());
This way, on the type system level, for certain types of classes, for example, forgetting to add `.RequireExam()` or `.NoPrerequisites()` would result in a compile-time error.
Or in a deprecation warning, which is also useful. The deprecation warning at compile time, I mean.
Moreover, once the company is ready to introduce business events, this particular StudentEnrollmentRequest is an atomic (!), transactional (!!), and ready-to-be-replayed (!!!) mutation of the data. The response to it, some StudentEnrollmentResult, is also a strongly typed message, which can be journaled into an adjacent stream with the same offset ID, and these two streams — of commands and of responses — form a 100% event-sourcing-friendly design.
👍6❤1
My new public post: Miro & Styra: Battle-testing Enterprise OPA.
Learn how Miro extensively benchmarked Styra’s Enterprise OPA and found it was 10,000 times more performant than the open source container.
#miro #opa #rego #styra
Crazy thought, but I can't shake it off.
Imagine "Independence Day". Some Jeff Goldblum, with the help of some Will Smith, needs to find a way to reverse-engineer, hack, and then penetrate some alien spacecraft, some older version of which crashed a long time ago in what we now know as some Area 51.
What tools will the software people assigned to this project use?
Linux? Most likely.
Python? Possibly.
Hadoop? Maybe.
Rust / C++ for high-volume data crunching and SQL / Spark for real-time querying? Sounds legit.
But will there be Kafka? I seriously doubt it.
Because if there's a large volume of data that needs to be crunched in real time, we would need a custom-built solution. If the data needs to be crunched in near real time -- even more so. If no real-time updates are needed, just save snapshot files every five seconds, aggregated into every minute, then hour, then day.
So, let me get this straight. Isn't it true that Kafka is solving a problem that is not fundamental to software architecture? In other words, isn't it true that had we had properly designed databases and event buses before Kafka emerged, there would simply be no market for Kafka to enter?
And, if so, isn't Kafka today a liability for future generations? As in, could it be that we are already past the point where it makes more sense to try to work around Kafka as opposed to trying to fit into the system if our goal is to build long-lived software?
Just curious.
Is Event Sourcing about historic data or historical data?
ChatGPT
Event sourcing is a design pattern used in software architecture that deals with persisting data as a sequence of events. Instead of storing the current state of the data in a database, it stores the sequence of actions or events that led to that state.
This might seem like a subtle difference, but it has profound implications. Storing the current state only allows you to answer the question "what is the current state?" However, by storing events, you can also answer questions like "how did we get to this state?" and "what is the history of changes to this state?" This ability to navigate through time and understand the history and progression of changes can be very powerful.
So, to answer your question, event sourcing is about historic data in the sense that it records the series of events or changes over time. It doesn't necessarily deal with "historical data" in the sense of historical facts or information, unless the system it's being used on happens to deal with historical information. In general, the term "historic data" refers to a record of past events, which fits the idea of event sourcing quite well.
Looks like GPT-4 is wrong here though. Event logs, and event sourcing in general, are about historical data, much like past stock quotes are.
"Historic" moments are the special ones, somewhat synonymous with epic or majestic. If it's the stream that matters, where the order is crucial to maintain and the events are important, as removing any of them breaks the whole thing, then it's historical data.
Interestingly, GPT-4 is making an even more strange statement:
Dima:
Would the Bitcoin ledger be the one of historic data or of historical data?
GPT-4:
The Bitcoin ledger, also known as the blockchain, is essentially a record of historic data ...
"Historical data," in a broader context, often refers to information about the past, often of significant or impactful events, which doesn't quite fit how we think about the Bitcoin ledger, despite the significant impact Bitcoin has had. That said, there might be cases where "historical" could be used to refer to the Bitcoin ledger, especially if looking at it from the perspective of its role in the history of digital currency or blockchain technology.
Personally, I've settled this question for myself by looking at the number of Google hits for {Bitcoin, $TICKER, Dow Jones} "historic data" vs. "historical data". Historical it is for event logs.
Not sure who needs to see this, but as a vim user — RIP Bram! — I very much dislike bulky IDEs and heavy build tools. So, here's a small Makefile repo that can run Kotlin code with make.
If you need modules, one-module-per-line definitions of them in deps.txt do the job. A make install target is available to do what npm i or pip3 install -r requirements.txt do. (Hmm, maybe I should rename this target into some make modules or make deps?)
I used mvn behind the scenes, just because it was cleaner to get to work in this "ad-hoc" manner of downloading packages one by one, into a specific directory.
There are also GitHub actions in the repo to confirm that the stuff does indeed do what it should. And a few examples are there as well, including typed and untyped JSON operations, as well as an HTTP server.
PS: No, gradle is not minimalistic to my taste. The amount of boilerplate code to import one dependency is just absurd. It's probably the largest crap-to-value ratio I've seen in a while.
PS2: This kotlin-stdlib.jar file not being found by default just sucks. Why there's no kotlinc --print-kotlin-stdlib-jar-path is beyond me. I just ended up searching for it in a few places as part of the make process. If you are on a Mac, pointers appreciated.
PS3: No, I'm not learning Kotlin. It's a compromise, and the lesser of several evils. Besides, one has to exercise their coding muscle in new languages sometimes, right? I played with Golang several months ago, and am playing with Kotlin now. Rust next, of course, just looking forward for an opportunity to use it and be paid for it to arise =)
PS4: It's actually quite fun to learn something as terrible as Kotlin with ChatGPT. Far better than with the Google + StackOverflow combo. Might be the novelty effect, but I think it's actually real, and, for once, I am looking forward to doing more of this in the months to come.
GitHub: dkorolev/kt-makefile — Simplified Kotlin builds for the "no-IDE" setup.
I have come to the point when I universally admit the value of GPT models for myself.
(For the record: I have seen their value as a business, as well as their value for humankind, from day one. But I am well aware my personal needs are often far from standard, so the very statement I am making is, in a way, even stronger!)
For queries that are either evergreen ("Locke's views", "Prove asymptotic complexity of comparison sort") or have to do with a well-established domain ("Traverse a JSON object recursively in Kotlin", "Dockerfile to switch away from the root user"), its answers are very much spot on.
I have yet to form an opinion on whether GPT models are useful at replacing humans. But as a savvy colleague who's well educated and always available for quick advice, it's irreplaceable for me at the moment. For each of such queries, it literally saves me at least minutes, and, more likely, dozens of minutes of collating data from various online sources.
So, yes, GPT models can and should kill search engines. In fact, they can do it "as is" already, as long as they are kept up to date.
The very "bias towards popular, and often untrue, content" objection is pure BS. That's how the world works. Whether or not GPT models can spark a positive anti-fake-news, pro-true-statements revolution, and/or whether they will be able to correctly cite sources and properly rank them by their own biases and track records -- that's beside the point to me.
As a human being interacting with such a system, I will always use my own judgement regardless; and my own judgement will be biased all the time, which is normal, healthy, and human nature overall. The right question to ask is not whether GPT models can be trained in a neutral way; the right question is whether they can be trained to be less biased than what modern-day search engines tell us, not in a vacuum but to a conscious human being who is exercising their judgement while interpreting the results. And, evidently, the bar of beating, say, Google, at being unbiased is quite low, and I'd comfortably bet on GPT models clearing this bar already.
Last but not least: While I am not a huge fan of regulation, it does worry me that we may live in a world where such technology is only available to a select few, and/or is severely restricted or censored for the majority of the human population. The modern trends of open-sourcing more and more models are, of course, amazing, and I do hope this is a false alarm.
Still, in the back of my mind, I am terrified by the prospect that some twenty years from now, building one's own LLM at home would be a matter of single thousands of dollars and single days or weeks of investment, and yet in many parts of the world doing so might be against the law. Much like, for instance, it was and remains against the law in certain places to possess certain books. Or to keep an ML model trained to remove clothing.
As a classical liberal, I see absolutely no harm that can possibly be dealt to a member of a healthy society if they get access to a certain book; the same goes for them having access to a certain ML model. Yet, the practice of banning books is real. Well-trained GPT models may well (and absolutely will!) make banning books per se ineffective. Thus the urge of the strong & powerful of our species to try to control these models is, at least, not groundless.
Of course, Generative Pre-trained Transformer models are just the tip of the iceberg. What I am saying above applies to future kinds of models as well.
Also, note that I am saying nothing of consciousness, or of the "motivation" of those models. That's a totally separate topic. All I am saying is that every single human being having access to a somewhat accurately restored author or historical figure, so that instead of speculating about what they may have meant we may well just ask it to outline an argument on any topic -- that is ultimately a Good Thing. And the "danger" that may stem from people realizing what real people used to think about real issues is only a
"danger" in a dysfunctional society, whereas the society I would like to help to push forward is pretty much the opposite of a dysfunctional one.
PS: "Summarize as a table" is awesome, and "Summarize as a JSON" is even more useful for data collation purposes if later on this dataset is to be used programmatically. You're welcome.
For decades now, I keep talking about the pain of not being able to perform atomic "transactions" with the contents of the file system.
Something as trivial as "as long as these files exist, and as long as this pattern matches exactly this number of files, perform this atomic rename of these files" would be awesome to have.
If this were supported by RAIDs, database design would become far more straightforward. (Okay, we'd need to standardize what fsync actually does, but that's another can of worms.)
Now, in the world of S3, the ask of transactionality is even more valid. A lot of SysDesigns become very clean if the cloud filesystem could be used in a transactional way, albeit with a very limited scope of what these transactions could offer.
Maybe it's a worthy standalone project? Build a wrapper around S3 to enable simple "transactions" and to guarantee their atomicity, isolation, and linearizability? 🤔