How Kubernetes Runs Containers: A Practical Deep Dive
https://blog.esc.sh/kubernetes-containers-linux-processes
Taking a deep dive into how Kubernetes runs containers as Linux processes
How Ahrefs Saved US$400M in 3 Years by NOT Going to the Cloud
https://tech.ahrefs.com/how-ahrefs-saved-us-400m-in-3-years-by-not-going-to-the-cloud-8939dd930af8
tigrisfs
https://www.tigrisdata.com/blog/tigrisfs
We're proud to announce the immediate availability of tigrisfs, the native filesystem interface for Tigris. This lets you mount Tigris buckets on your laptops, desktops, and servers so you can use data in your buckets as if it were local. This bridges the gap between the cloud and your machine.
octelium
https://github.com/octelium/octelium
Octelium is a free and open source, self-hosted, unified platform for zero trust resource access that is primarily meant to be a modern alternative to remote access VPNs and similar tools.
Breaking up a monolith: How we’re unwinding a shared database at scale
https://www.datadoghq.com/blog/engineering/unwinding-shared-database
Taming Complexity: HelloFresh’s Playbook for Managing Large-Scale Change
P1: https://engineering.hellofresh.com/taming-complexity-hellofreshs-playbook-for-managing-large-scale-programs-part-1-3-cdf06c5a6ed9
P2: https://engineering.hellofresh.com/taming-complexity-hellofreshs-playbook-for-managing-large-scale-change-part-2-3-516dc3961e26
P3: https://engineering.hellofresh.com/taming-complexity-hellofreshs-playbook-for-managing-large-scale-change-part-3-3-ec0fd8bc6cd9
Kubernetes List API performance and reliability
https://ahmet.im/blog/kubernetes-list-performance
At my current employer, we use Kubernetes to run hundreds of thousands of bare metal servers, spread over hundreds of Kubernetes clusters. We use Kubernetes beyond officially supported/tested scale limits by running more than 5,000 nodes and over a hundred thousand pods in a single cluster. In these large-scale setups, expensive “list” calls on the Kubernetes API are the Achilles' heel of control plane reliability and scalability. In this article, I’ll explain which list call patterns pose the most risk, and how recent and upcoming Kubernetes versions are improving list API performance.
ktea
https://github.com/jonas-grgt/ktea
ktea is a tool designed to simplify and accelerate interactions with Kafka clusters.
GitOps: View from a security perspective
https://medium.com/@TechInternals/gitops-view-from-a-security-perspective-a120795b2f17
"Best practices" aren't always best for you
https://thefridaydeploy.substack.com/p/best-practices-arent-always-best
SLA vs SLO
https://blog.alexewerlof.com/p/sla-vs-slo
Demystifying the most common misconception in Service Level jargon
tfautomv
https://github.com/busser/tfautomv
Generate Terraform moved blocks automatically for painless refactoring
When SIGTERM Does Nothing: A Postgres Mystery
https://clickhouse.com/blog/sigterm-postgres-mystery
The ClickPipes team had encountered a bug with logical replication slot creation on Postgres read replicas—specifically, an issue where a query that was already taking hours rather than the few seconds it usually took couldn’t be terminated by any of the usual methods in Postgres, causing customer frustration and risking the stability of production databases. In this blog post, I’ll walk through how I investigated the problem and ultimately discovered it was due to a Postgres bug. We’ll also share how we fixed it and our experience working with the Postgres community.
Mastering Postgres Replication Slots: Preventing WAL Bloat and Other Production Issues
https://www.morling.dev/blog/mastering-postgres-replication-slots
Life Altering Postgresql Patterns
https://mccue.dev/pages/3-11-25-life-altering-postgresql-patterns
There is a set of things you can do when working with a Postgres database that I have found make my and my coworkers' lives much more pleasant. Each one is small by itself, but in aggregate they have a noticeable effect.
Fix a top cause of slow queries in PostgreSQL (no slow query log needed)
https://render.com/blog/postgresql-top-cause-slow-queries
Postgres query plan visualization tools
https://www.pgmustard.com/blog/postgres-query-plan-visualization-tools
OpenAI: Scaling PostgreSQL to the Next Level
https://www.pixelstech.net/article/1747708863-openai%3a-scaling-postgresql-to-the-next-level
At the PGConf.dev 2025 Global Developer Conference, Bohan Zhang from OpenAI shared OpenAI’s best practices with PostgreSQL, offering a glimpse into the database usage of one of the most prominent unicorn companies.