DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN (Roskomnadzor): https://knd.gov.ru/license?id=67704b536aa9672b963777b3&registryType=bloggersPermission
terraform-target-autocompletion

Press tab after --target and get suggestions for your resources and modules.

terraform-target-autocompletion is a Go program that relies on terraform-config-inspect for the heavy lifting, so it should work with any Terraform version. You don't need anything other than the binary and the completion scripts provided. However, you currently need Go 1.21.0 installed to build it yourself.

https://github.com/shellwhale/terraform-target-autocompletion
Scaling Kafka to Support PayPal’s Data Growth

Today, our Kafka fleet consists of over 1,500 brokers that host over 20,000 topics and close to 2,000 Mirror Maker nodes which are used to mirror the data among the clusters, offering 99.99% availability for our Kafka clusters. During the 2022 Retail Friday, Kafka traffic volume peaked at about 1.3 trillion messages per day! At present, we have 85+ Kafka clusters, and every holiday season we flex up our Kafka infrastructure to handle the traffic surge. The Kafka platform continues to seamlessly scale to support this traffic growth without any impact to our business.

https://medium.com/paypal-tech/scaling-kafka-to-support-paypals-data-growth-a0b4da420fab
harden-runner

Harden-Runner provides runtime security for GitHub-hosted and self-hosted environments

https://github.com/step-security/harden-runner
How Cloudflare runs Prometheus at scale

At the moment of writing this post we run 916 Prometheus instances with a total of around 4.9 billion time series.

https://blog.cloudflare.com/how-cloudflare-runs-prometheus-at-scale
cf-terraforming

cf-terraforming is a command line utility to facilitate terraforming your existing Cloudflare resources. It does this by using your account credentials to retrieve your configurations from the Cloudflare API and converting them to Terraform configurations that can be used with the Terraform Cloudflare provider.

This tool is ideal if you already have Cloudflare resources defined but want to start managing them via Terraform, and don't want to spend the time to manually write the Terraform configuration to describe them.

https://github.com/cloudflare/cf-terraforming
Understanding Kubernetes Limits and Requests

When working with containers in Kubernetes, it's important to know which resources are involved and how much of each is needed. Some processes will require more CPU or memory than others. Some are critical and should never be starved.

Knowing that, we should configure our containers and Pods properly in order to get the best of both.
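
As a rough illustration of the idea, the Python sketch below parses a container's CPU and memory quantities and flags a request that exceeds its limit, which the Kubernetes API rejects. The function names (`parse_cpu`, `parse_memory`, `check_resources`) are hypothetical and are not taken from the linked article, and only a subset of quantity suffixes is handled.

```python
# Subset of the suffixes Kubernetes accepts for memory quantities.
MEMORY_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,  # binary
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,     # decimal
}


def parse_cpu(quantity):
    """Convert a CPU quantity such as '500m' or '2' to cores."""
    if quantity.endswith("m"):  # millicores: 1000m == 1 core
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity):
    """Convert a memory quantity such as '128Mi' or '1G' to bytes."""
    for suffix, factor in MEMORY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # a bare number is already in bytes


def check_resources(resources):
    """Return a list of problems found in a container's resources block."""
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    problems = []
    for name, parse in (("cpu", parse_cpu), ("memory", parse_memory)):
        request, limit = requests.get(name), limits.get(name)
        if request is None and limit is None:
            problems.append(f"{name}: neither request nor limit is set")
        elif request is not None and limit is not None:
            if parse(request) > parse(limit):
                problems.append(f"{name}: request exceeds limit")
    return problems


# Hypothetical container spec: the CPU request (1 core) is above its limit (0.5 core).
spec = {
    "requests": {"cpu": "1", "memory": "128Mi"},
    "limits": {"cpu": "500m", "memory": "256Mi"},
}
print(check_resources(spec))  # ['cpu: request exceeds limit']
```

The check only compares values inside one container; whether the requests actually fit on a node is a separate scheduling question that this sketch does not cover.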

https://sysdig.com/blog/kubernetes-limits-requests
Kubernetes OOM and CPU Throttling

Troubleshooting Memory and CPU problems

https://sysdig.com/blog/troubleshoot-kubernetes-oom
Exit Codes In Containers & Kubernetes – The Complete Guide
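
As a small illustration of the topic, and not something taken from the linked guide, the Python sketch below interprets a container exit code using the common convention that a value above 128 means the process was terminated by signal number (code - 128). The function name `describe_exit_code` is hypothetical, and the wording of the returned messages is an assumption.

```python
import signal


def describe_exit_code(code):
    """Give a rough interpretation of a container exit code."""
    if code == 0:
        return "exited successfully"
    if 128 < code < 256:
        number = code - 128
        try:
            # Look up the platform's name for this signal number.
            name = signal.Signals(number).name
        except ValueError:
            # Signal number not defined on this platform.
            name = f"signal {number}"
        return f"terminated by {name}"
    return f"exited with application error code {code}"


for code in (0, 1, 137, 143):
    print(code, describe_exit_code(code))
```

On Linux, 137 maps to SIGKILL (often seen when a container is OOM-killed) and 143 to SIGTERM. Codes below 128 are defined by the application or the runtime, so the sketch reports them as-is.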

https://komodor.com/learn/exit-codes-in-containers-and-kubernetes-the-complete-guide
Deployment previews on Kubernetes

Deployment previews, made popular by platforms like Vercel and Netlify, are not commonplace in microservice architectures. At Blueground, we brought deployment previews to K8s using ArgoCD. It turned out so well that it is worth sharing.

https://engineering.theblueground.com/deployment-previews
sre-roadmap

An opinionated roadmap to becoming an SRE

https://github.com/teivah/sre-roadmap