DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

Roskomnadzor (RKN): https://knd.gov.ru/license?id=67704b536aa9672b963777b3&registryType=bloggersPermission
Amazon: NOT OK - why we had to change Elastic licensing

https://www.elastic.co/blog/why-license-change-AWS
A Vim Guide for Intermediate Users

https://thevaluable.dev/vim-intermediate
Provisioning Kubernetes clusters on GCP with Terraform and GKE

TL;DR: In this article you will learn how to create clusters on the GCP Google Kubernetes Engine (GKE) with the gcloud CLI and Terraform. By the end of the tutorial, you will automate creating three clusters (dev, staging, prod) complete with the GKE Ingress in a single click.

https://learnk8s.io/terraform-gke
image-service

Dragonfly image service, providing fast, secure and easy access to container images.

https://github.com/dragonflyoss/image-service
Run Kubernetes Production Environment on EC2 Spot Instances With Zero Downtime: A Complete Guide

https://medium.com/riskified-technology/run-kubernetes-on-aws-ec2-spot-instances-with-zero-downtime-f7327a95dea
The Day of the RDS Multi-AZ Failover

On a fateful Friday evening in December 2019, when a few of us were looking forward to packing our bags and going home, we got an alert from the internal monitoring tool that the system had started throwing an unusually high number of 5xx errors.

https://razorpay.com/blog/day-of-rds-multi-az-failover
Google Cloud vs AWS in 2021 (Comparing the Giants)

Today, we will be comparing two cloud giants, Google Cloud Platform and Amazon Web Services. We’ll be taking a deep dive into the products and services of each provider. The goal is to add clarity and simplify the process of comparing these two cloud providers so that you can make an informed decision.

https://kinsta.com/blog/google-cloud-vs-aws
Distributed storage performance: pre-production tests

You have a fresh distributed storage system. The cluster is installed and ready to go into production. Now is the time to test its performance. This kind of testing is done to see how fast the storage runs in practice, to check that the installation is sound, and to establish its maximum performance at launch. In this article I share a methodology for pre-production testing.

https://alexzzz.ru/post/storage-preproduction-perf-test
The Next Gen Database Servers Powering Let's Encrypt

Dell’s PowerEdge R7525
CPU: 2x AMD EPYC 7542 - Total 64 cores / 128 threads
Memory: 2TB 3200MT/s
Storage: 24x 6.4TB Intel P4610, NVMe SSD, 3200/3200 MB/s read/write

https://letsencrypt.org/2021/01/21/next-gen-database-servers.html
chisel

Chisel is a fast TCP/UDP tunnel, transported over HTTP, secured via SSH. Single executable including both client and server. Written in Go (golang). Chisel is mainly useful for passing through firewalls, though it can also be used to provide a secure endpoint into your network.

https://github.com/jpillora/chisel
Campaigns

Sometimes it can take years to make a single-line code change.

https://kellysutton.com/2021/01/06/campaigns.html
please

Please is a cross-language build system with an emphasis on high performance, extensibility and reproducibility. It supports a number of popular languages and can automate nearly any aspect of your build process.

https://github.com/thought-machine/please
kubekey

Since v3.0.0, KubeSphere has replaced its Ansible-based installer with a new installer called KubeKey, written in Go. With KubeKey, you can install Kubernetes and KubeSphere separately or together, easily, efficiently and flexibly.

https://github.com/kubesphere/kubekey
May 30 SSL incident

Summary and key takeaways

- Two root certification authorities expired on May 30, 2020.
- Some of our customers experienced service outages for up to 1.5 hours (if they had outdated OpenSSL libraries), and others up to 3 hours (if they also had outdated certificate stores).
- The issue has been fully mitigated, and the service availability was restored for everyone. Although related to OpenSSL, HTTPS and PKI certificates, this was not a security incident.

https://www.algolia.com/blog/engineering/may-30-ssl-incident
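The failure mode in the summary above, a certificate passing its expiry date, is something that can be checked ahead of time. Below is a minimal sketch in Python, assuming the expiry timestamp is available in the `notAfter` text format that the standard `ssl` module reports for peer certificates. The helper name `days_until_expiry` and the example timestamp are hypothetical and are not taken from the Algolia post.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Return the number of days from `now` until a certificate expires.

    `not_after` uses the text format found in ssl.getpeercert()["notAfter"],
    for example "May 30 00:00:00 2020 GMT". A negative result means the
    certificate has already expired.
    """
    # ssl.cert_time_to_seconds parses the certificate time into epoch seconds.
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


# Hypothetical timestamp chosen to match the incident date,
# not the actual notAfter value of any real root certificate.
example_not_after = "May 30 00:00:00 2020 GMT"

before = days_until_expiry(example_not_after, datetime(2020, 5, 1, tzinfo=timezone.utc))
after = days_until_expiry(example_not_after, datetime(2020, 5, 31, tzinfo=timezone.utc))
```

Here `before` is positive and `after` is negative, so a monitoring job could alert once the value drops below a chosen threshold. A check like this covers only expiry dates; it would not detect outdated OpenSSL libraries or stale certificate stores on client machines, which the summary names as the factors that prolonged the outage.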
97 things every SRE should know - Part 01

A few people I follow on Twitter mentioned they’d contributed to 97 Things Every SRE Should Know. It’s a book full of short, 1-3 page chapters, focused on topics dear to an SRE’s heart. So I had no choice but to buy it. In an attempt to be more deliberate with my reading and with what I retain from the book, I’ve decided to create some reading notes for future me. This post is broken down into a section per chapter.

https://www.unixdaemon.net/sysadmin/97-things-every-sre-01
This Is the Most Underappreciated Skill for SREs

https://www.blameless.com/blog/the-most-underappreciated-skill-for-sres