DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN (Roskomnadzor) registration: https://knd.gov.ru/license?id=67704b536aa9672b963777b3&registryType=bloggersPermission
Kubernetes NGINX Ingress Controller: 10+ Complementary Configurations for Web Applications

https://dev.to/zenika/kubernetes-nginx-ingress-controller-10-complementary-configurations-for-web-applications-ken
Kubernetes Custom Controllers Recipes for Beginners

Explaining the most common Kubernetes custom controllers development scenarios that can frustrate you as a beginner.


https://itnext.io/kubernetes-custom-controllers-recipes-for-beginners-bbc286c05ef8
smallab-k8s-pve-guide

A guide series explaining how to set up a personal small homelab running a Kubernetes cluster with VMs on a standalone Proxmox VE server node.


https://github.com/ehlesp/smallab-k8s-pve-guide
snapscheduler

SnapScheduler provides scheduled snapshots for Kubernetes CSI-based volumes.


https://github.com/backube/snapscheduler
k8s-device-plugin

The NVIDIA device plugin for Kubernetes is a DaemonSet that allows you to automatically:

- Expose the number of GPUs on each node of your cluster
- Keep track of the health of your GPUs
- Run GPU-enabled containers in your Kubernetes cluster


https://github.com/NVIDIA/k8s-device-plugin
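
As a rough illustration of the last point in the NVIDIA device plugin post above, the sketch below builds a Pod manifest that asks for GPUs through the `nvidia.com/gpu` resource. The helper name `gpu_pod_manifest`, the Pod name, and the image are hypothetical; the sketch assumes the plugin is already running on the cluster.

```python
def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a Pod manifest whose container requests `gpus` NVIDIA GPUs."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,  # hypothetical image, replace with your own
                    # GPUs are an extended resource and are requested under limits.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }
```

The returned dict can be serialized with `json.dumps` and passed to `kubectl apply -f`, which accepts JSON manifests as well as YAML.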
kluctl

Kluctl is the missing glue that puts together your (and any third-party) deployments into one large declarative Kubernetes deployment, while making it fully manageable (deploy, diff, prune, delete, ...) via one unified command line interface.


https://github.com/kluctl/kluctl
k8e

Kubernetes Easy Engine (k8e) 🚀 is a lightweight, scalable enterprise-grade Kubernetes distribution that allows users to manage, protect and obtain out-of-the-box Kubernetes clusters in a unified manner. It is suitable for enterprise environments.


https://github.com/xiaods/k8e
zed

Code at the speed of thought – Zed is a high-performance, multiplayer code editor from the creators of Atom and Tree-sitter.


https://github.com/zed-industries/zed
heynote

Heynote is a dedicated scratchpad for developers. It functions as a large persistent text buffer where you can write down anything you like. Works great for that Slack message you don't want to accidentally send, a JSON response from an API you're working with, notes from a meeting, your daily to-do list, etc.


https://github.com/heyman/heynote
The Bun Shell

The Bun Shell is a new experimental embedded language and interpreter in Bun that allows you to run cross-platform shell scripts in JavaScript & TypeScript.


https://bun.sh/blog/the-bun-shell
wal-listener

A service that helps implement an event-driven architecture.

To keep the data in the system consistent, it relies on transactional messaging: events are published in the same transaction as the domain model change.

The service allows you to subscribe to changes in the PostgreSQL database using its logical decoding capability and publish them to the NATS Streaming server.


https://github.com/ihippik/wal-listener
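
The transactional messaging idea from the wal-listener post above can be sketched in a few lines. This is not wal-listener's code: it swaps PostgreSQL logical decoding for an explicit outbox table in Python's built-in `sqlite3`, and the `orders` and `outbox` tables and the `place_order` helper are hypothetical. It only shows the domain change and its event being committed or rolled back together.

```python
import json
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create the hypothetical domain table and the outbox table."""
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )


def place_order(conn: sqlite3.Connection, order_id: int, item: str) -> None:
    """Write the domain change and its event in one transaction."""
    # `with conn` commits on success and rolls back if an exception is raised.
    with conn:
        conn.execute("INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item))
        event = {"type": "order_placed", "order_id": order_id, "item": item}
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def pending_events(conn: sqlite3.Connection) -> list[dict]:
    """Return the events a relay process would publish to a message broker."""
    rows = conn.execute("SELECT payload FROM outbox ORDER BY id").fetchall()
    return [json.loads(payload) for (payload,) in rows]
```

With logical decoding, as wal-listener uses it, the separate outbox table becomes unnecessary because the changes are read from the database's write-ahead log instead.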
The state of Kubernetes jobs in 2023 Q4

Kubernetes job market trends for Q4 2023


https://kube.careers/state-of-kubernetes-jobs-2023-q4
42 things I learned from building a production database

https://maheshba.bitbucket.io/blog/2021/10/19/42Things.html
12 Factor CLI Apps

At Heroku, we’ve come up with a methodology called the 12 factor app. It’s a set of principles designed to make great web applications that are easy to maintain. In that spirit, here are 12 CLI factors to keep in mind when building your next CLI application. Following these principles will offer CLI UX that users will love.


https://medium.com/@jdxcode/12-factor-cli-apps-dd3c227a0e46
Viacheslav Biriukov - SRE deep dive into Linux Page Cache

In this series of articles, I would like to talk about Linux Page Cache. I believe that knowing the theory and the tools is essential for every SRE. This understanding helps both in routine DevOps-like tasks and in emergency debugging and firefighting.


https://biriukov.dev/docs/page-cache/0-linux-page-cache-for-sre
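
As a small companion to the page cache series above, here is a sketch that parses `/proc/meminfo`-style text and estimates the page cache size. The helper names are hypothetical, and treating `Cached` plus `Buffers` as the page cache is a rough approximation made for this sketch, not a claim from the articles.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: numeric value}."""
    values: dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue  # skip blank or malformed lines
        # Most fields look like "Cached:  123456 kB"; keep the numeric part.
        values[name.strip()] = int(parts[0])
    return values


def page_cache_kb(meminfo: dict[str, int]) -> int:
    """Rough page cache estimate in kB: Cached plus Buffers."""
    return meminfo.get("Cached", 0) + meminfo.get("Buffers", 0)
```

On a Linux host, `page_cache_kb(parse_meminfo(open("/proc/meminfo").read()))` gives the estimate for the running system.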
uptrace

Open source APM: OpenTelemetry traces, metrics, and logs


https://github.com/uptrace/uptrace
kubernetes-image-puller

Kubernetes Image Puller is used for caching images on a cluster. It creates a DaemonSet downloading and running the relevant container images on each node.


https://github.com/che-incubator/kubernetes-image-puller
Why Distributed Systems Fail?

Distributed systems are tricky: it's easy to make wrong assumptions that lead to problems down the road. Back in the 1990s, computer scientist L. Peter Deutsch identified several common misconceptions, or "fallacies," that trip up engineers working on distributed systems. Surprisingly, these fallacies are still relevant today:

1. The Network is Reliable: It's risky to assume networks are 100% reliable. Networks can and do fail in various ways.
2. Latency is Zero: While we might wish our networks had no latency, that is physically impossible, since even light takes time to travel a distance. Ignoring the inevitable delay in data transmission can lead to unrealistic expectations of system performance.
3. Bandwidth is Infinite: This overlooks the physical and practical limitations on data transfer rates.
4. The Network is Secure: It is no wonder that security is a growing industry. Assuming inherent security can lead to vulnerabilities and oversights in protective measures.
5. Topology Doesn't Change: This neglects the dynamic nature of network configurations.
6. There is One Administrator: A simplification that fails to consider the complexity of managing distributed systems.
7. Transport Cost is Zero: Overlooking the resources required for data movement.
8. The Network is Homogeneous: Ignoring the diversity in network systems and standards.

These fallacies, if not recognized and addressed, can lead to design flaws, performance issues, and security vulnerabilities in distributed systems. In the following sections, we will break down each of these misconceptions, exploring their implications and how to mitigate the risks they pose in real-world applications.


P1: https://www.codereliant.io/why-distributed-systems-fail-1

P2: https://www.codereliant.io/why-distributed-systems-fail-2
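
One common way to handle the first fallacy in the post above ("The Network is Reliable") is to retry failed calls with capped exponential backoff and jitter. The sketch below is a minimal illustration and is not taken from the linked articles; the `call_with_retries` helper and its default values are hypothetical.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.1,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying on ConnectionError or TimeoutError."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            # Exponential backoff capped at max_delay, with full jitter so
            # that many clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
    raise AssertionError("unreachable")
```

Retries like this are only safe for idempotent operations; the `sleep` parameter exists so the delay can be replaced in tests.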