DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN (Roskomnadzor) registration: https://knd.gov.ru/license?id=67704b536aa9672b963777b3&registryType=bloggersPermission
capsule

Capsule implements a multi-tenant, policy-based environment in your Kubernetes cluster. It is designed as a microservices-based ecosystem with a minimalist approach, relying only on upstream Kubernetes.


https://github.com/projectcapsule/capsule
mailpit

Mailpit is a small, fast, low memory, zero-dependency, multi-platform email testing tool & API for developers.

It acts as an SMTP server, provides a modern web interface to view & test captured emails, and includes an API for automated integration testing.
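
As a rough illustration of the SMTP side described above, the sketch below builds a plain-text message and hands it to a locally running Mailpit instance using only the Python standard library. The host and port (`localhost:1025`) are assumed local defaults, and the function names `build_test_message` and `send_to_mailpit` are hypothetical helpers, not part of Mailpit.

```python
import smtplib
from email.message import EmailMessage

# Assumed local settings for a Mailpit instance; adjust to your setup.
MAILPIT_HOST = "localhost"
MAILPIT_SMTP_PORT = 1025


def build_test_message(sender, recipient, subject, body):
    """Build a plain-text email to hand to the SMTP server."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_to_mailpit(msg, host=MAILPIT_HOST, port=MAILPIT_SMTP_PORT):
    """Deliver the message over plain SMTP (no TLS or auth assumed)."""
    with smtplib.SMTP(host, port, timeout=5) as smtp:
        smtp.send_message(msg)
```

With an instance running, calling `send_to_mailpit(build_test_message("app@example.test", "user@example.test", "Welcome", "Hello"))` should make the message show up in the web interface, where a test can then inspect it.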


https://github.com/axllent/mailpit
kaytu

The Kaytu CLI improves the efficiency of cloud workloads by analyzing historical usage and providing tailored recommendations, such as changing instance sizes. This ensures you only pay for the resources you actually need without compromising stability.


https://github.com/kaytu-io/kaytu
terraform-plan-comment

GitHub Action to post the output of "terraform plan" to a pull request comment.


https://github.com/borchero/terraform-plan-comment
Optimize Kubernetes Pods’ Startup Time Using VolumeSnapshots

In this blog post, you will learn how we used VolumeSnapshots to significantly reduce the startup time of applications that depend on static data sources, specifically in AWS environments.


https://medium.com/riskified-technology/optimize-kubernetes-pods-startup-time-using-volumesnapshots-c0a2b7d39a29
Behind the scenes of Vercel's infrastructure: Achieving optimal scalability and performance

Learn how Vercel builds and deploys serverless applications.


https://vercel.com/blog/behind-the-scenes-of-vercels-infrastructure
unleash

Unleash is a powerful open source solution for feature management. It streamlines your development workflow, accelerates software delivery, and empowers teams to control how and when they roll out new features to end users. With Unleash, you can deploy code to production in smaller, more manageable releases at your own pace.


https://github.com/Unleash/unleash
Don’t Get Lost in the Metrics Maze: A Practical Guide to SLOs, SLIs, Error Budgets, and Toil

https://medium.com/@lokesh12/dont-get-lost-in-the-metrics-maze-a-practical-guide-to-slos-slis-error-budgets-and-toil-939ecd0181eb
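
The post above is listed by title only. As a general illustration of the error-budget arithmetic its title refers to (not taken from the linked article), here is a minimal sketch. The function names and the 30-day window are assumptions chosen for the example.

```python
def error_budget_fraction(slo_percent):
    """Fraction of events (or time) allowed to fail under the SLO."""
    return 1.0 - slo_percent / 100.0


def allowed_downtime_minutes(slo_percent, window_days=30):
    """Downtime the SLO tolerates over the window, in minutes."""
    return error_budget_fraction(slo_percent) * window_days * 24 * 60


def budget_remaining(slo_percent, total_requests, failed_requests):
    """Fraction of the error budget still unspent (negative if overspent)."""
    allowed_failures = error_budget_fraction(slo_percent) * total_requests
    if allowed_failures == 0:
        raise ValueError("a 100% SLO leaves no error budget")
    return 1.0 - failed_requests / allowed_failures
```

For a 99.9% SLO over 30 days, `allowed_downtime_minutes(99.9)` comes out at roughly 43.2 minutes. With 1,000,000 requests and 250 failures, about three quarters of the budget is still unspent.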
SLA vs SLO vs SLI: What’s the Difference?

https://www.checklyhq.com/blog/sla-slo-sli
BPFAgent: eBPF for Monitoring at DoorDash

As DoorDash experienced rapid growth over the last few years, we began to see the limits of our traditional methods of monitoring. Metrics, logs, and traces provide vital information about our service ecosystem. But these signals almost entirely rely on application-level instrumentation, which can leave gaps or conflicting semantics across different systems. We decided to seek potential solutions that could provide a more complete and unified picture of our networking topology.

One of these solutions has been monitoring with eBPF, which allows developers to write programs that are injected directly into the kernel and can trace kernel operations. These programs, designed to provide lightweight access to most components of the kernel, are sandboxed and validated for safety by the kernel before execution. DoorDash was particularly interested in tracing network traffic via hooks called kprobes (kernel dynamic tracing) and tracepoints. With these hooks, we can intercept and understand TCP and UDP connections across our multiple Kubernetes clusters.

By building at the kernel level, we can monitor network traffic at the infrastructure level, which gives us new insights into DoorDash’s backend ecosystem that are independent of the service workflow.

To run these eBPF probes, we have developed a Golang application called BPFAgent, which we run as a daemonset in all of our Kubernetes clusters. Here we will take a look at how we built BPFAgent, the process of building and maintaining its probes, and how various DoorDash teams have used the data collected.


https://doordash.engineering/2023/08/15/bpfagent-ebpf-for-monitoring-at-doordash
Terraform - Understanding Count and For_Each Loops

https://dev.to/pwd9000/terraform-understanding-count-and-foreach-loops-c6i
symphony

Symphony is a framework and a set of patterns and best practices for developing, testing, and deploying infrastructure on Azure using Infrastructure as Code (IaC). It includes modern DevOps practices for IaC such as main and pull request workflows, IaC code validation and linting, automated testing, security scanning, multi-environment deployments, module dependencies, and more.


https://github.com/microsoft/symphony
mlinfra

mlinfra is the Swiss Army knife for deploying scalable MLOps infrastructure. It aims to make MLOps infrastructure deployment easy and accessible to all ML teams by freeing the IaC logic for creating MLOps stacks, which is usually tied to other frameworks.


https://github.com/mlinfra-io/mlinfra
Presenting to Engineering Leadership

A 5 slide formula with some advice.


https://hross.substack.com/p/presenting-to-engineering-leadership
It’s always TCP_NODELAY. Every damn time.

https://brooker.co.za/blog/2024/05/09/nagle.html
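
The post above is listed by title only. TCP_NODELAY is the socket option that disables Nagle's algorithm, so small writes are sent immediately instead of being held back to coalesce. As a minimal sketch of what setting it looks like with Python's standard `socket` module (the helper names are hypothetical):

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # A non-zero value turns TCP_NODELAY on for this socket.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nodelay_enabled(sock):
    """Report whether TCP_NODELAY is currently set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
```

The option is set per socket, so it has to be applied to every connection a latency-sensitive client or server opens. Some runtimes and libraries already enable it by default.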
When Kubernetes and Go don't work well together

Go is not aware of the limits set for its container, which can cause issues that are hard to track down. This is a story about how I stumbled into one of them.
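
One commonly cited instance of this class of problem is the CPU limit: a runtime that sizes its worker pool from the host's CPU count can end up throttled inside a container with a much smaller quota. Whether the linked article covers exactly this case is an assumption here. The sketch below shows the arithmetic a container-aware process would do with the contents of the cgroup v2 `cpu.max` file (`<quota> <period>`, or `max <period>` when unlimited). The function names and the round-down policy with a floor of 1 are choices made for this example.

```python
import math


def cpu_limit_from_cpu_max(text):
    """Parse cgroup v2 cpu.max content; return CPUs as a float, or None if unlimited."""
    parts = text.split()
    if len(parts) != 2:
        raise ValueError("expected '<quota> <period>', got %r" % text)
    quota, period = parts
    if quota == "max":
        return None
    return int(quota) / int(period)


def suggested_gomaxprocs(cpu_max_text, host_cpus):
    """Pick a worker count that respects the container's CPU quota.

    Assumption of this sketch: round the quota down, never below 1,
    and never above the number of CPUs the host actually has.
    """
    limit = cpu_limit_from_cpu_max(cpu_max_text)
    if limit is None:
        return host_cpus
    return max(1, min(host_cpus, math.floor(limit)))
```

On a 64-core node, a container limited to two CPUs (`200000 100000`) would get a suggestion of 2 rather than 64. In a real deployment the text would be read from `/sys/fs/cgroup/cpu.max` and the result exported through the `GOMAXPROCS` environment variable.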


https://lalatron.hashnode.dev/when-kubernetes-and-go-dont-work-well-together
asdf

asdf is a CLI tool that can manage multiple language runtime versions on a per-project basis. It is like gvm, nvm, rbenv & pyenv (and more) all in one! Simply install your language's plugin!


https://github.com/asdf-vm/asdf
superfile

A pretty fancy and modern terminal file manager.


https://github.com/yorukot/superfile
openpanel

Openpanel is a simple analytics tool for logging events on the web, in apps, and on the backend.


https://github.com/Openpanel-dev/openpanel