DevOps&SRE Library

The Service Mesh Landscape

https://layer5.io/service-mesh-landscape

milvus

A cloud-native vector database, storage for next generation AI applications

https://github.com/milvus-io/milvus

Real-World GitOps with Flux, Flagger, and Linkerd

https://linkerd.io/2023/05/15/real-world-gitops

How to create a cron job docker container using AWS ECS, Fargate, fully automated with Terraform

https://noiselesstech.net/2023/05/04/how-to-create-a-cron-job-docker-container-using-aws-ecs-fargate-fully-automated-with-terraform

Terraform automation for teams. Purpose-built for GitHub

Start running Terraform with cost estimation, security alerts, drift detection, access controls, and OPA policy testing. All within GitHub's UI.

https://terrateam.io

Introduction to Terraform on Google Cloud: Solutions, benefits, resources, and FAQs

https://www.googlecloudcommunity.com/gc/Community-Blogs/Introduction-to-Terraform-on-Google-Cloud-Solutions-benefits/ba-p/550474

terrap-cli

Terrap - a powerful CLI tool that scans your infrastructure and identifies any required changes.

https://github.com/sirrend/terrap-cli

tfc-workflows-github

This repo includes prescriptive workflows that implement best practices when interacting with Terraform Cloud. These starter workflow templates provide an entry point for integrating your CI/CD pipelines with Terraform Cloud.

https://github.com/hashicorp/tfc-workflows-github

IaC & GitOps with EKS blueprints

TL;DR: Need a cluster up and running fast? Take a close look at eks-blueprints. I got started in minutes and have been working with it for almost 2 years now.

https://medium.com/everything-full-stack/iac-gitops-with-eks-blueprints-7a28ad1f702a

Percentiles don’t work: Analyzing the distribution of response times for web services

https://adrianco.medium.com/percentiles-dont-work-analyzing-the-distribution-of-response-times-for-web-services-ace36a6a2a19

etcd: getting 30% more write/s

My team at Zendesk looks after around 30 Kubernetes clusters. These are all self-managed, meaning we maintain the API servers and, as you may guess, etcd.

Recently, I had a task to do some performance analysis on our etcd clusters. It had been a while since we ran any sort of benchmarking. Plus I wanted to get my hands dirty as I haven’t got much experience in tuning databases.

While I ended up getting about a 30% increase in performance, I learnt a lot about how databases and, by extension, disks work together.

https://zendesk.engineering/etcd-getting-30-more-write-s-318bcdbf7774

How to Monitor CoreDNS

CoreDNS is a DNS add-on for Kubernetes environments. It is one of the components running in the control plane nodes, and having it fully operational and responsive is key for the proper functioning of Kubernetes clusters. Learning how to monitor CoreDNS, and what its most important metrics are, is a must for operations teams.

https://sysdig.com/blog/how-to-monitor-coredns

Managing Grafana Dashboards With Terraform

We’ve all done it: deleted a graph from a dashboard, realised we still need it, but forgotten the query. Use Terraform to go back in time and save yourself the headache.

https://betterprogramming.pub/managing-grafana-dashboards-with-terraform-ad49ff6bb552

The DevOps Hangover

The greatest irony is that DevOps aimed to help developers talk with operations, and vice versa, yet developers are even further away from understanding how their code operates at runtime.

https://www.linkedin.com/pulse/devops-hangover-pete-cheslock

Whose Cert Is It Anyway?

https://www.netmeister.org/blog/caa-diversity.html

Migrating Terraform state from Terraform Cloud to S3

https://blog.marcolancini.it/2023/blog-migrate-terraform-state-from-terraform-cloud-to-s3

Moving Terraform Managed Resources Between States for Scaling AWS Infrastructure in Startups

https://fivexl.io/blog/terraform-mv

Terraform check{} Block

https://unfriendlygrinch.info/posts/terraform-check-block

Why `fsync()`: Losing unsynced data on a single node leads to global data loss

Regardless of the replication mechanism, you must fsync() your data to prevent global data loss in non-Byzantine protocols.

https://redpanda.com/blog/why-fsync-is-needed-for-data-safety-in-kafka-or-non-byzantine-protocols
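As a rough illustration of what "fsync() your data" means in practice, here is a minimal Python sketch of an append that returns only after the operating system has been asked to flush the bytes to stable storage. The helper name `durable_append` and the log file name are hypothetical, not taken from the linked article, and a real log implementation would typically also fsync the parent directory after creating a new file.

```python
import os
import tempfile


def durable_append(path: str, record: bytes) -> None:
    """Append record to path, then ask the OS to flush it to stable storage.

    Hypothetical helper for illustration only.
    """
    with open(path, "ab") as f:
        f.write(record)
        f.flush()             # move bytes from Python's buffer to the OS page cache
        os.fsync(f.fileno())  # ask the OS to push the page cache to the device


if __name__ == "__main__":
    # Assumed scratch location; a real system would use its own log directory.
    log_path = os.path.join(tempfile.mkdtemp(), "replica.log")
    durable_append(log_path, b"entry-1\n")
    durable_append(log_path, b"entry-2\n")
    with open(log_path, "rb") as f:
        print(f.read())
```

Without the `os.fsync` call, an acknowledged write may still live only in the page cache and can vanish on a crash or power loss, which is the single-node failure the post's title refers to.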

Fleet Management at Spotify

Part 1: Spotify’s Shift to a Fleet-First Mindset - https://engineering.atspotify.com/2023/04/spotifys-shift-to-a-fleet-first-mindset-part-1

Part 2: The Path to Declarative Infrastructure - https://engineering.atspotify.com/2023/05/fleet-management-at-spotify-part-2-the-path-to-declarative-infrastructure

Part 3: Fleet-wide Refactoring - https://engineering.atspotify.com/2023/05/fleet-management-at-spotify-part-3-fleet-wide-refactoring
