DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN (Roskomnadzor) registration: https://knd.gov.ru/license?id=67704b536aa9672b963777b3&registryType=bloggersPermission
GitHub Actions: Reusability, DRY Principle, Debugging and Fast Feedback

In this article, we will explore some methods of workflow debugging and create a reusable workflow.

DRY stands for Don’t Repeat Yourself, a software development principle that aims to reduce repetition and code duplication. The principle also applies to workflows and is relatively easy to implement in GitHub Actions.

We will first recap the essentials of GitHub Actions and explore methods of workflow debugging. Then, we will create a composite action to lint Terraform code and compare it with reusable workflows. Finally, we will create a simple reusable workflow that tests Terraform modules in parallel.
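
One common way to get the parallel testing mentioned above is a job matrix with one entry per module. The following is a minimal sketch of generating such a matrix, not the article's own method: it assumes a hypothetical layout where each module lives in its own subdirectory containing `.tf` files, and the function name `build_matrix` is made up for illustration.

```python
import json
from pathlib import Path


def build_matrix(modules_root: Path) -> dict:
    """Return a matrix dict with one entry per Terraform module directory.

    Assumption: a subdirectory counts as a module if it directly
    contains at least one .tf file.
    """
    modules = sorted(
        child.name
        for child in modules_root.iterdir()
        if child.is_dir() and any(child.glob("*.tf"))
    )
    return {"module": modules}


def matrix_json(modules_root: Path) -> str:
    """Serialize the matrix as a single JSON line, suitable for a job output."""
    return json.dumps(build_matrix(modules_root))
```

A workflow step could print `matrix_json(Path("modules"))` as a job output, and a later job could read it with the `fromJSON` expression in `strategy.matrix`, so that each module is tested in its own parallel job.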

https://medium.com/@xpiotrkleban/github-actions-reusability-dry-principle-debugging-and-fast-feedback-c810ed87a43f
The Future of Terraform: ClickOps

Every now and then it’s important to step back from what we’re doing and think about the future. At Terrateam, we like to ask a question each quarter to get our gears turning. This quarter we asked:

What will Infrastructure as Code (IaC) look like in five years?

https://terrateam.io/blog/the-future-of-terraform-is-clickops
terrascope

A build orchestrator for Terraform monorepos.

This repository contains both the source code for the tool terrascope, as well as a sample monorepo managed by that tool.

https://github.com/spilliams/terrascope
petra

Petra is a lightweight tool that lets you host your own private Terraform registry using Google Cloud Storage as a storage backend.

Petra is not an official Devoteam product and is provided as-is to the community.

https://github.com/devoteamgcloud/petra
Demystifying OOM Killer in Kubernetes: Tracking Down Memory Issues

Unravel the mysteries of the OOM killer, delve into its inner workings, and learn how to track down memory issues that lead to OOM kills.

https://medium.com/cloud-native-daily/title-demystifying-oom-killer-in-kubernetes-tracking-down-memory-issues-b5a4973fbd56
Performance comparison: GKE vs. EKS

The solid performance of managed Kubernetes platforms is generally regarded as a given and is hardly ever put into question. However, maybe there is a difference in how containers perform on different popular managed Kubernetes platforms. I wanted to take a deeper look and selected the two most popular Kubernetes services we use at Blueshoe for our clients: Amazon Elastic Kubernetes Service (EKS) and the Google Kubernetes Engine (GKE).

https://www.blueshoe.io/blog/performance-comparison-gke-vs-eks
awl

Anywherelan (awl for brevity) is a mesh VPN project, similar to tinc, direct WireGuard, or Tailscale. Awl makes it easy to connect to any of your devices (at the IP protocol level) wherever they are.

https://github.com/anywherelan/awl
sh

A shell parser, formatter, and interpreter with bash support; includes shfmt

https://github.com/mvdan/sh
dkron

Dkron is a distributed cron service that is easy to set up and fault tolerant, with a focus on being:

- Easy: Easy to use with a great UI
- Reliable: Completely fault tolerant
- Highly scalable: Able to handle high volumes of scheduled jobs and thousands of nodes

Dkron is written in Go and leverages the Raft protocol and Serf to provide fault tolerance, reliability, and scalability while remaining simple and easy to install.

https://github.com/distribworks/dkron
preevy

Preevy is a powerful Command Line Interface (CLI) tool designed to simplify the process of creating ephemeral preview environments. With Preevy, you can easily provision a preview environment for any Docker-Compose application in the cloud.

https://github.com/livecycle/preevy
opencost

OpenCost models give teams visibility into current and historical Kubernetes spend and resource allocation. These models provide cost transparency in Kubernetes environments that support multiple applications, teams, departments, etc.
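
To make the idea of cost allocation concrete, here is a minimal sketch that sums a cost per namespace from pod resource requests. It is not OpenCost's specification or implementation: the `Pod` type, the function name, and the hourly prices are all hypothetical, and a real model also has to account for actual usage, idle capacity, storage, and network.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Pod:
    namespace: str
    cpu_cores: float   # requested CPU cores
    memory_gib: float  # requested memory in GiB


def cost_by_namespace(
    pods: List[Pod],
    cpu_price_per_core_hour: float,
    mem_price_per_gib_hour: float,
    hours: float,
) -> Dict[str, float]:
    """Sum request-based cost per namespace over a time window."""
    totals: Dict[str, float] = {}
    for pod in pods:
        hourly = (
            pod.cpu_cores * cpu_price_per_core_hour
            + pod.memory_gib * mem_price_per_gib_hour
        )
        totals[pod.namespace] = totals.get(pod.namespace, 0.0) + hourly * hours
    return totals
```

Grouping by namespace is only one choice; the same loop could key on a team or application label instead.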

OpenCost was originally developed and open sourced by Kubecost. This project combines a specification as well as a Golang implementation of these detailed requirements.

https://github.com/opencost/opencost
Automated deployment of terraform modules in different AWS regions

If you have created Terraform modules and want to deploy them in different AWS regions, this is the right place.

This blog covers:

- How to provision modules in multiple AWS regions using Terraform
- Other possible options

https://awstip.com/automated-deployment-of-terraform-modules-in-different-aws-regions-a3101da51a1c
Managing Terraform Modules in a Monorepo

A solution for versioning multiple Terraform modules while preserving your monorepo

https://medium.com/@hello_9187/managing-terraform-modules-in-a-monorepo-e7e89d124d4a
Automating alert 🚨 creation with Terraform config-driven import in Google Cloud ☁️

https://medium.com/google-cloud/automating-alert-creation-with-terraform-config-driven-import-in-google-cloud-%EF%B8%8F-1c9093ddd79f
terraform-graph-beautifier

A command-line tool that converts the barely usable output of the terraform graph command into something more meaningful and explanatory.

https://github.com/pcasteran/terraform-graph-beautifier
The Saga is Antipattern

The Saga pattern is often positioned as a better way to handle distributed transactions. I see no point in discussing Saga's disadvantages, because the real problem is that Saga should not be used in microservices at all:

If you need distributed transactions across a few microservices, most likely you incorrectly defined and separated domains.
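
For readers unfamiliar with the pattern under discussion, a saga is a sequence of local steps, each paired with a compensating action that undoes it if a later step fails. The following is a minimal in-process sketch of that idea; the function name `run_saga` is hypothetical, and a real saga spans several services and needs messaging, durable state, and retry handling that are omitted here.

```python
from typing import Callable, List, Tuple

# Each step is a pair: (action, compensation that undoes the action).
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> bool:
    """Run actions in order; on failure, compensate completed steps in reverse."""
    completed: List[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            # Undo what already succeeded, newest first.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True
```

The compensation of the failing step itself is never run, since its action did not complete.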

A long explanation of why follows.

https://dev.to/siy/the-saga-is-antipattern-1354
Lost in transit: debugging dropped packets from negative header lengths

https://blog.cloudflare.com/lost-in-transit-debugging-dropped-packets-from-negative-header-lengths
Analyzing Volatile Memory on a Google Kubernetes Engine Node

TL;DR: At Spotify, we run containerized workloads in production across our entire organization in five regions, with our main production workloads in Google Kubernetes Engine (GKE) on Google Cloud Platform (GCP). If we detect suspicious behavior in our workloads, we need to be able to quickly analyze it and determine if something malicious has happened. Today we leverage commercial solutions to monitor them, but we also do our own research to discover options and alternative methods.

One such research project led to the discovery of a new method for conducting memory analysis on GKE by combining three open source tools, AVML, dwarf2json, and Volatility 3, the result being a snapshot of all the processes and memory activities on a GKE node.

This new method empowers us and other organizations to use an open source alternative if we do not have a commercial solution in place or if we want to compare our current monitoring to the open source one.

In this blog post, I’ll explain in detail how memory analysis works and how this new method can be used on any GKE node in production today.

https://engineering.atspotify.com/2023/06/analyzing-volatile-memory-on-a-google-kubernetes-engine-node