I've published a longread about Event Sourcing, in the "Gentle Introduction" way.

The introduction is gentle both from the engineering standpoint and from the standpoint of convincing the business to take the first steps.

If you struggle to justify to stakeholders that the company needs to upgrade its data stack, or if you already have a soft buy-in but are unsure how best to proceed -- I hope you'll find this post helpful and insightful.
https://open.substack.com/pub/dimakorolev/p/the-oltp-language -- collected my thoughts on how we can accelerate the retirement of SQL as the means to implement OLTP transactions.
👍2
Just clicked "Publish" on two posts that were in the works for well over a month. Behold:

Distributed Stateful Workflows, and
Stateful Orchestration Engines.

Hope you enjoy reading them as much as I enjoyed writing them. And spread the word!
4
The episode on Stateful Orchestration is out!

In reality, this episode sets the stage for a proper introduction to the domain of stateful orchestration. I talk in detail about the many SysDesign problems that lead up to formulating it, from the idea of serialization and vector clocks, through the decoupling of producers and consumers, all the way to Cadence and Temporal as generic mechanisms to enable SAGAs and more sophisticated patterns at scale.
👍6
Durable Execution, a.k.a. Stateful Orchestration 2.0.

The closing meetup episode of 2024 is now released.

https://tinyurl.com/sdm-durable-execution
https://tinyurl.com/sdm-durable-execution-slides

Would be great to build this together in 2025 and beyond!
🔥3❤‍🔥1
#Milestone

I guess ... thank you — and looking forward to much more!
🔥10🎉1
😁41🤗1
Folks, Maxim Fateev, the founder of Temporal, has a super discount code — CSDM80 — for those of us who want to attend Replay 2025, https://replay.temporal.io/.

It's literally 80% off. Definitely good enough in my book to share with the community. Hope to see you there!
🔥1
Realistically, it looks like we'd do well to just build an open-source comparison tool for a few of the most popular cluster (i.e. decentralized) databases.

The list so far is quite large and growing:

⒈ FoundationDB
⒉ Yugabyte
⒊ CockroachDB
⒋ YDB

The test may well be the trivial "Alice pays Bob, no over-spending" app.

The cluster DB should run on three/five/seven nodes, an odd number.

Client nodes would be sending traffic either to all nodes at random, or to just one of them.

Failing nodes should also be tested, as well as slow/deteriorating network.

All can be done with docker-compose.

Invaluable experience, useful insights, open source code, a meetup episode or two of making this happen. All the good stuff.

Best way to build this, I think, would be to have thin wrappers for each DB to a) start it, and b) provide a unified API layer for the above operations, so that each DB node is paired one-to-one with a thin API node.

This way the tester code can be independent of which DB is being used — including sending the requests, verifying the correctness of responses, measuring throughput/latency, and turning off nodes and/or slowing down traffic between them at random.
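As a sketch of what that unified API layer could look like — all names here are hypothetical; each real database would get its own thin wrapper implementing this interface on top of its native client, and an in-memory stand-in lets the tester be developed before any real DB is wired in:

```python
# Hypothetical DB-agnostic interface for the "Alice pays Bob" tester.
# Each real database (FoundationDB, Yugabyte, ...) would get a thin
# wrapper implementing this API; the tester only ever talks to LedgerAPI.
from abc import ABC, abstractmethod


class LedgerAPI(ABC):
    """Unified API exposed by every thin wrapper node."""

    @abstractmethod
    def balance(self, account: str) -> int: ...

    @abstractmethod
    def transfer(self, src: str, dst: str, amount: int) -> bool:
        """Atomically move `amount`; must fail instead of over-spending."""


class InMemoryLedger(LedgerAPI):
    """Stand-in wrapper used to develop the tester before any real DB exists."""

    def __init__(self, balances: dict):
        self._balances = dict(balances)

    def balance(self, account: str) -> int:
        return self._balances[account]

    def transfer(self, src: str, dst: str, amount: int) -> bool:
        if self._balances[src] < amount:
            return False  # over-spending rejected, never a negative balance
        self._balances[src] -= amount
        self._balances[dst] += amount
        return True


def total_money(db: LedgerAPI, accounts: list) -> int:
    # The invariant the tester checks: money is conserved, no matter
    # which wrapper backs `db`, and no matter which nodes we killed.
    return sum(db.balance(a) for a in accounts)
```

The point of the abstraction is exactly the one above: the failure-injection and throughput-measurement code never learns which database it is talking to.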

Who's excited?
👍3
The lightning talks event from this week's SysDesignMeetup is out: https://tinyurl.com/sdm-lightningtalks-feb2025

It has three lightning talks.

• By David Archuleta, on Wasm and performant code in both the browser and node.js,

• By me, on Source-of-Truth, and how it relates to "learning to count", and

• By Alex Kantsevoi, on using etcd to perform leader elections and achieve consensus.

I'd say it's good content, although I'm still getting used to using the Mac and recording the screen from it. Enjoy!
👍5🔥2
I normally don't share upcoming off-the-record events of the SysDesignMeetup, but we had a break of a few months, and thus some extra visibility is warranted.

This Saturday. If you'd like to join, vote for the best time slot with a Slack reacc here.

To get to Slack, https://tinyurl.com/sdm-slack-invite should still work.

More details and links at http://github.com/sysdesignmeetup/sdm.

Hope to see you soon!
2
We're trying to make the Tail at Scale guest talk happen on August the 14th, which is a Thursday.

Morning US time, evening Europe time.

This is the paper, by Jeff Dean and Luiz André Barroso. It covers many topics dear to my heart, such as how to measure latencies properly, and what is request hedging.
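For a taste of what request hedging means: fire the request, and if no reply arrives within a short hedge delay, send a backup copy and take whichever answer comes first. A minimal sketch (the names and delays are illustrative, not from the paper):

```python
# A minimal sketch of request hedging: issue the request, and if it has
# not completed within `hedge_delay`, issue a backup copy and return the
# first result to arrive. This trims the latency tail caused by one
# occasionally-slow replica.
import asyncio


async def hedged_request(call, hedge_delay: float):
    """`call` is a zero-arg coroutine factory; returns the first result."""
    primary = asyncio.ensure_future(call())
    try:
        # shield() so the timeout does not cancel the primary request.
        return await asyncio.wait_for(asyncio.shield(primary), hedge_delay)
    except asyncio.TimeoutError:
        backup = asyncio.ensure_future(call())
        done, pending = await asyncio.wait(
            {primary, backup}, return_when=asyncio.FIRST_COMPLETED
        )
        for task in pending:
            task.cancel()  # the race is over; drop the slower copy
        return done.pop().result()
```

Here the primary keeps running after the hedge fires, so whichever of the two copies finishes first wins.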

If you want to make it, please vote on the best time slot in SysDesignMeetup's #general Slack channel, and use this link to join if you're not there yet.
1👍1
And here comes the official #announcement.

We have a talk planned on The Tail at Scale paper by Jeff Dean and Luiz André Barroso!

Our speaker is Jordan West, an Engineering Manager at Netflix and a Cassandra committer. He’s worked on large-scale deployments of Cassandra for the past 7 years and on distributed databases for the past 10. He is passionate about growing the Cassandra community and working on large-scale distributed systems.

Save the date, Thursday, August the 14th:

San Francisco, USA 10:00 am PDT
New York, USA 1:00 pm EDT
London, United Kingdom 6:00 pm BST
Amsterdam, Netherlands 7:00 pm CEST
Istanbul, Turkey 8:00 pm EEST

Google Calendar event: link.
See you at http://zoom.dima.ai!
👍43
Published my thoughts on the very idea of durability scaling far beyond the "eleven nines" guarantee people expect these days: https://dimakorolev.substack.com/p/durability-of-distributed-systems
👍2
I stopped myself from writing a long post on Docker, but here's the most interesting part.

First, docker leaks containers.

Consider this inc.sh:

#!/bin/sh
# Read the current counter (a missing file counts as 0), add 1,
# print the new value, and move it into place.
echo $(( $(cat /tmp/n.txt 2>/dev/null) + 1 )) | tee /tmp/m.txt && mv /tmp/m.txt /tmp/n.txt


If you run it locally multiple times, it will print 1, 2, 3, and so on.

Now consider this Dockerfile:

FROM alpine
COPY ./inc.sh /inc.sh
ENTRYPOINT ["/inc.sh"]

If you run this command multiple times, it will always print 1:
docker run $(docker build -q .)

It will also always print 1 if you do docker build -t dima . once, followed by docker run dima repeatedly.

Each of these runs yields a new container! It will not show in docker ps or docker container ls, but it will in docker container ls -a.

Alas, the universes of images and containers are easy to confuse.

Behold: docker run --name dima dima

This runs a new container and names it dima. Now there's dima the image and dima the container.

You can't do docker run --name dima dima again, because the container called dima already exists, even though it terminated long ago.

You can re-run it, though: just docker start dima.

Second, docker leaks volumes.

Now add VOLUME /tmp to the end of the Dockerfile, and re-create the container:

docker container rm dima; docker run --name dima dima

Now run docker start dima several times. And say docker logs dima. Or just run docker start -i dima.

The number will keep increasing.

Because for the very container called dima there now exists a volume!

And if instead of docker start dima you run docker run dima, it will always print 1. And now we know why: because for each of these runs, a new volume is created. And leaked.

The takeaway from this point is that the universe of running-and-stopped containers exists separately from the universe of built-and-possibly-tagged images.

And then it's "trivial" to wrap one's head around: docker run takes an image, while docker start takes a container.

Third, docker compose silently re-uses containers.

Consider this docker-compose.yml:

services:
  dima:
    build: .

The third line might as well read image: dima.

Now run docker compose up several times. The number will keep going up!

Because while docker run creates a new container every time, docker compose will create containers once.

The "universe of docker compose container names" also exists. It is the same as the universe of docker containers, but with "tricky" naming. The default is the parent directory of docker-compose.yml, followed by a hyphen, followed by the name of the service, followed by a hyphen, followed by the index, starting from 1.
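The naming rule above can be captured in a tiny helper — a hypothetical illustration of the convention, not a Compose API:

```python
# Illustration of Compose's default "<project>-<service>-<index>" container
# naming, where the project defaults to the directory holding
# docker-compose.yml. This is a sketch of the convention, not a real API.
def compose_container_name(project_dir: str, service: str, index: int = 1) -> str:
    # Compose normalizes the project name to lowercase.
    return f"{project_dir.lower()}-{service}-{index}"
```

So a service `dima` in a directory `MyPipeline` gets the container `mypipeline-dima-1`.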

Running docker compose down will reset the counter: the container is removed, and the next up creates a fresh one with a fresh anonymous volume (the old volume is leaked, not deleted, unless you pass -v). But who does docker compose down for one-off pipelines, right?

You could also do docker compose run dima. But you would not if your compose topology consists of several containers, because up is the way to go.

Fourth, and this is bizarre, volumes are not pruned.

Try this:

docker compose up && docker volume prune -f && docker compose up

The command to prune volumes does not prune them! That's because prune only removes volumes not attached to any container — and a stopped container still counts as attached.

And there exists no simple way to prune all containers tied to a volume. Here's the "shortest" way:

for i in $(docker ps -a -q --filter volume=$VOLUME); do docker container stop $i; docker container rm -f $i; done; docker volume rm $VOLUME

This "one-liner" is literally at the beginning of my scripts that are meant to be fast, self-contained, and reproducible.

PS: docker compose up does not rebuild images by default. So, unless you truly want to run the older version, docker compose up --build is a safer default.

PS2: Yes, this is why the use of VOLUME is discouraged in Dockerfile-s. But quite a few images do have VOLUME-s — for instance, the postgres image. So it keeps data between runs; worse, it keeps table schemas too. What a wonderful footgun: your app's DB init code is broken, but you're blissfully unaware!
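If you do want postgres data to survive between runs on purpose rather than by accident, one option is to declare a named volume in docker-compose.yml (the names here are illustrative):

```yaml
services:
  db:
    image: postgres
    volumes:
      # Explicit and named: easy to find, easy to wipe with `docker volume rm`.
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata:
```

A named volume makes the persistence visible in one place instead of hiding behind an anonymous VOLUME.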

If you've learned something today, my half hour of typing this was not wasted. You're welcome.