DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN (Roskomnadzor) registration: https://knd.gov.ru/license?id=67704b536aa9672b963777b3&registryType=bloggersPermission
Series "Kubernetes Capacity Management"

Evolution of Capacity Management: From Bare Metal to Kubernetes:
https://mkdev.me/posts/evolution-of-capacity-management-from-bare-metal-to-kubernetes

Kubernetes Capacity and Resource Management: It's Not What You Think It Is:
https://mkdev.me/posts/kubernetes-capacity-and-resource-management-it-s-not-what-you-think-it-is

Kubernetes Is Not an Orchestrator: The Jump to Universality for Infrastructure Abstractions:
https://mkdev.me/posts/kubernetes-is-not-an-orchestrator-the-jump-to-universality-for-infrastructure-abstractions
Unreadable Metrics: Why You Can’t Find Anything in Your Monitoring Dashboards

A Guide to Effective Dashboard Design for DevOps and SRE

https://horovits.medium.com/unreadable-metrics-why-you-cant-find-anything-in-your-monitoring-dashboards-12fcc23d34c8
Open-Source Tracing Tools: Jaeger Vs. Zipkin Vs. Grafana Tempo

Distributed tracing is crucial for monitoring complex systems. This article covers the three most popular open-source tracing tools: Jaeger, Zipkin, and Grafana Tempo.

https://codersociety.com/blog/articles/jaeger-vs-zipkin-vs-tempo
Refactoring CI/CD for a Moderately Large C++ Code Base

Fundamentally, Dagger is a really awesome set of language bindings on top of BuildKit, the library that tools like Docker and Buildah use to build container images. What this means is that there is an easy way to build code in an isolated fashion with great caching, easy-to-write parallelism, remote execution, the ability to spin up on-demand sidecar services, and portable abstractions.

https://robertu94.github.io/2023/04/24/refactoring-ci/cd-for-a-moderately-large-c-code-base.html
milvus

A cloud-native vector database, storage for next generation AI applications

https://github.com/milvus-io/milvus
Real-World GitOps with Flux, Flagger, and Linkerd

https://linkerd.io/2023/05/15/real-world-gitops
How to create a cron job docker container using AWS ECS, Fargate, fully automated with Terraform

https://noiselesstech.net/2023/05/04/how-to-create-a-cron-job-docker-container-using-aws-ecs-fargate-fully-automated-with-terraform
Terraform automation for teams. Purpose-built for GitHub

Start running Terraform with cost estimation, security alerts, drift detection, access controls, and OPA policy testing. All within GitHub's UI.

https://terrateam.io
terrap-cli

Terrap - a powerful CLI tool that scans your infrastructure and identifies any required changes.

https://github.com/sirrend/terrap-cli
tfc-workflows-github

This repo includes prescriptive workflows that implement best practices when interacting with Terraform Cloud. These starter workflow templates provide an entry point to integrate your CI/CD pipelines with Terraform Cloud.

https://github.com/hashicorp/tfc-workflows-github
IaC & GitOps with EKS blueprints

TL;DR: Need a cluster up and running fast? Take a close look at eks-blueprints. I got started in minutes and have been working with it for almost 2 years now.

https://medium.com/everything-full-stack/iac-gitops-with-eks-blueprints-7a28ad1f702a
Percentiles don’t work: Analyzing the distribution of response times for web services

https://adrianco.medium.com/percentiles-dont-work-analyzing-the-distribution-of-response-times-for-web-services-ace36a6a2a19
etcd: getting 30% more write/s

My team at Zendesk look after around 30 Kubernetes clusters. These are all self-managed, meaning we maintain the API servers and, as you may guess, etcd.

Recently, I had a task to do some performance analysis on our etcd clusters. It had been a while since we ran any sort of benchmarking. Plus I wanted to get my hands dirty as I haven’t got much experience in tuning databases.

While I ended up getting about a 30% increase in performance, I learnt a lot about how databases and, by extension, disks work together.

https://zendesk.engineering/etcd-getting-30-more-write-s-318bcdbf7774
How to Monitor CoreDNS

CoreDNS is a DNS add-on for Kubernetes environments. It is one of the components running in the control plane nodes, and having it fully operational and responsive is key for the proper functioning of Kubernetes clusters. Learning how to monitor CoreDNS, and what its most important metrics are, is a must for operations teams.

https://sysdig.com/blog/how-to-monitor-coredns
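
As a rough illustration of what watching CoreDNS metrics can look like (this is not taken from the linked article), the sketch below parses Prometheus text-format output and computes the share of SERVFAIL responses. It assumes the `coredns_dns_responses_total` counter with an `rcode` label; the sample values and the helper names `responses_by_rcode` and `servfail_ratio` are made up for the example.

```python
import re

# Hypothetical sample of Prometheus text-format output, roughly as a
# CoreDNS metrics endpoint might expose it. All values are invented.
SAMPLE_METRICS = """\
# TYPE coredns_dns_responses_total counter
coredns_dns_responses_total{rcode="NOERROR",server="dns://:53",zone="."} 9500
coredns_dns_responses_total{rcode="NXDOMAIN",server="dns://:53",zone="."} 400
coredns_dns_responses_total{rcode="SERVFAIL",server="dns://:53",zone="."} 100
"""

# Matches lines of the form: metric_name{label="value",...} number
LINE_RE = re.compile(r'^(?P<name>\w+)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def responses_by_rcode(text):
    """Sum coredns_dns_responses_total samples per DNS response code."""
    totals = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match or match.group("name") != "coredns_dns_responses_total":
            continue  # skip comments and unrelated metrics
        labels = dict(LABEL_RE.findall(match.group("labels")))
        rcode = labels.get("rcode", "UNKNOWN")
        totals[rcode] = totals.get(rcode, 0.0) + float(match.group("value"))
    return totals


def servfail_ratio(totals):
    """Fraction of all responses that were SERVFAIL (0.0 if no data)."""
    total = sum(totals.values())
    return totals.get("SERVFAIL", 0.0) / total if total else 0.0


totals = responses_by_rcode(SAMPLE_METRICS)
print(totals)
print(servfail_ratio(totals))
```

In practice a ratio like this would more likely be computed in Prometheus over a rate window than from raw counters; the script only shows the idea.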
Managing Grafana Dashboards With Terraform

We’ve all done it: deleted a graph from a dashboard, realised we still need it, but have forgotten the query. Use Terraform to go back in time and save yourself the headache.

https://betterprogramming.pub/managing-grafana-dashboards-with-terraform-ad49ff6bb552
The DevOps Hangover

The greatest irony is that DevOps aimed to help developers talk with operations, and vice versa, yet developers are even further away from understanding how their code operates at runtime.

https://www.linkedin.com/pulse/devops-hangover-pete-cheslock