Continuous Integration
https://martinfowler.com/articles/continuousIntegration.html
Continuous Integration is a software development practice where each member of a team merges their changes into a codebase together with their colleagues' changes at least daily. Each of these integrations is verified by an automated build (including test) to detect integration errors as quickly as possible. Teams find that this approach reduces the risk of delivery delays, reduces the effort of integration, and enables practices that foster a healthy codebase for rapid enhancement with new features.
prodzilla
https://github.com/prodzilla/prodzilla
Prodzilla is a modern synthetic monitoring tool built in Rust. It's focused on surfacing whether existing behaviour in production is as expected in a human-readable format, so that stakeholders, or even customers, can contribute to system verification.
(Almost) Every infrastructure decision I endorse or regret after 4 years running infrastructure at a startup
https://cep.dev/posts/every-infrastructure-decision-i-endorse-or-regret-after-4-years-running-infrastructure-at-a-startup
I’ve led infrastructure at a startup for the past 4 years that has had to scale quickly. From the beginning I made some core decisions that the company has had to stick to, for better or worse, these past four years. This post will list some of the major decisions made and if I endorse them for your startup, or if I regret them and advise you to pick something else.
Infrastructure Pipeline
https://medium.com/@tusharmurudkar/devops-infrastructure-pipeline-beab47e7b876
How to have multiple Terraform deployments with the same GitHub Action
https://medium.com/@robbiedouglas/how-to-have-multiple-terraform-deployments-with-the-same-github-action-043f082f76e2
Fallback
https://blog.alexewerlof.com/p/fallback
What is it? How does it work? When to use it and when not to use it?
How to reduce expenses on monitoring: Swapping in VictoriaMetrics for Prometheus
https://victoriametrics.com/blog/reducing-costs-p1/index.html
Monitoring can get expensive due to the huge quantities of data that need to be processed. In this blog post, you’ll learn the best ways to store and process monitoring metrics to reduce your costs, and how VictoriaMetrics can help.
This blog post will only cover open-source solutions. VictoriaMetrics is proudly open source. You’ll get the most out of this blog post if you are familiar with Prometheus, Thanos, Mimir or VictoriaMetrics.
Reliability Engineering Mindset
https://blog.alexewerlof.com/p/book-intro-reliability-engineering
Subtitle: Concepts, Patterns, and Tools for building, maintaining, and evolving reliable software products
m3
https://github.com/m3db/m3
Distributed TSDB, Aggregator and Query Engine, Prometheus Sidecar, Graphite Compatible, Metrics Platform
victorialogs-datasource
https://github.com/VictoriaMetrics/victorialogs-datasource
Grafana datasource for VictoriaLogs
Slack’s Migration to a Cellular Architecture
https://slack.engineering/slacks-migration-to-a-cellular-architecture
In recent years, cellular architectures have become increasingly popular for large online services as a way to increase redundancy and limit the blast radius of site failures. In pursuit of these goals, we have migrated the most critical user-facing services at Slack from a monolithic to a cell-based architecture over the last 1.5 years. In this series of blog posts, we’ll discuss our reasons for embarking on this massive migration, illustrate the design of our cellular topology along with the engineering trade-offs we made along the way, and talk about our strategies for successfully shipping deep changes across many connected services.
Everything I know about SSDs
https://kcall.co.uk/ssd/index.html
Solid State Devices using NAND Flash, how they differ from Hard Drives, and how they affect file deletion and recovery
Software releases notification system - don't waste your time checking if some software is updated
https://newreleases.io
Cassandra Unleashed: How We Enhanced Cassandra Fleet’s Efficiency and Performance
https://doordash.engineering/2024/01/30/cassandra-unleashed-how-we-enhanced-cassandra-fleets-efficiency-and-performance
In this blog post, we walk through DoorDash’s Cassandra optimization journey. I will share what we learned as we made our fleet much more performant and cost-effective. Through analyzing our use cases, we hope to share universal lessons that you might find useful. Before we dive into those details, let’s briefly talk about the basics of Cassandra and its pros and cons as a distributed NoSQL database.
Devpod: Improving Developer Productivity at Uber with Remote Development
https://www.uber.com/en-MX/blog/devpod-improving-developer-productivity-at-uber
In this blog post, we share how we improved the daily edit-build-run developer experience using Devpod, our remote development environment. We will start with some of the initial challenges, the pain points we addressed with Devpod, our architecture, and some of our recent successes in terms of adoption and cost reduction. We will finally leave you with some thoughts around the future of remote development at Uber.
Rock-Solid K3s on Oracle Cloud Infrastructure
P1: https://medium.com/oracledevs/rock-solid-k3s-on-oci-part-1-dbfeaa69d670
P2: https://medium.com/oracledevs/rock-solid-k3s-on-oci-part-2-4f7b95faca88
P3: https://medium.com/oracledevs/rock-solid-k3s-on-oci-part-3-129efce08b81
P4: https://medium.com/oracledevs/rock-solid-k3s-on-oci-part-4-bc47b20e38a6
Manage Multiple Kubernetes Clusters with ArgoCD
https://piotrminkowski.com/2022/12/09/manage-multiple-kubernetes-clusters-with-argocd