Supporting Teams with Different Maturity Levels
https://medium.com/@hans.knechtions/supporting-teams-with-different-maturity-levels-c43f5b5080eb
sre-checklist
A checklist for anyone practicing Site Reliability Engineering
https://github.com/bregman-arie/sre-checklist
Why bother with SLI and SLO?
Is there really any value in setting service level indicators and objectives?
https://blog.alexewerlof.com/p/why-bother-with-sli-and-slo
Traffic Jams in the Cloud: Are Overloads Sabotaging Your Application's Reliability?
https://blog.fluxninja.com/blog/traffic-jams-in-the-cloud-unveiling-the-true-enemy-of-reliability
PostgreSQL: No More VACUUM, No More Bloat
PostgreSQL, a powerful open-source object-relational database system, has been lauded for its robustness, functionality, and flexibility. It is not without its challenges, though, and one of them is the notorious VACUUM process. OrioleDB, a novel engine designed for PostgreSQL, promises to eliminate the need for the resource-consuming VACUUM.
https://www.orioledata.com/blog/no-more-vacuum-in-postgresql
Identifying GCP’s Hidden Network Inter-Zone Egress Costs
Learn how to identify your Inter-Zone Egress costs in a few easy steps, using commonly available methods.
https://www.doit.com/identifying-gcps-hidden-network-inter-zone-egress-costs
Ever wondered where those Inter-Zone Egress costs are coming from? Found yourself looking at GCP’s network pricing page many times to break it down? Me too. So I thought I might as well try to help clear things up.
faasd
faasd is OpenFaaS reimagined, but without the cost and complexity of Kubernetes. It runs on a single host with very modest requirements, making it fast and easy to manage. Under the hood it uses containerd and Container Networking Interface (CNI) along with the same core OpenFaaS components from the main project.
https://github.com/openfaas/faasd
blazingmq
BlazingMQ is an open source distributed message queueing framework, which focuses on efficiency, reliability, and a rich feature set for modern-day workflows.
https://github.com/bloomberg/blazingmq
At its core, BlazingMQ provides durable, fault-tolerant, highly performant, and highly available queues, along with features like various message routing strategies (e.g., work queues, priority, fan-out, broadcast, etc.), compression, strong consistency, poison pill detection, etc.
Scaling Terraform with Terramate
At CWISE we use Terraform a lot. Our most common use cases for Terraform are cloud resource provisioning, Kubernetes configuration management, and management of SaaS services (like GitHub/GitLab).
https://www.cwise.eu/post/scaling-terraform-with-terramate
We prefer Terraform over many other competitors for multiple reasons:
- It is a tried and tested tool that has been around for a long time, and HashiCorp is doing great work developing it. It can be described as mature and even boring technology;
- A large number of community resources like providers, modules, and documentation;
- Good developer experience thanks to support in IDEs and supporting tools;
- It keeps a configuration state (database);
Optimizing AWS Infrastructure: Leveraging Terraform for Low Coupling and High Cohesion
https://medium.com/@itsnarayan/optimizing-aws-infrastructure-leveraging-terraform-for-low-coupling-and-high-cohesion-a5ae6049ab1e
Ultimate Guide to Passing the Terraform Exam
https://www.linkedin.com/pulse/ultimate-guide-passing-terraform-exam-mesut-oezdil
terraform-tui
TFTUI is a powerful textual GUI that empowers users to effortlessly view and interact with their Terraform state.
https://github.com/idoavrah/terraform-tui
With its latest version you can easily visualize the complete state tree, gaining deeper insights into your infrastructure's current configuration. Additionally, the ability to inspect individual resource states allows you to focus on specific details for better analysis and management. Lastly, it's now possible to select resources and perform actions such as tainting and untainting.
Building a Successful SRE Team
Successful techniques to ensure your SRE team delivers value
https://medium.com/@hans.knechtions/building-a-successful-sre-team-283232bc2694
Lesson learned while scaling Kubernetes cluster to 1000 pods in AWS EKS
https://devopslearning.medium.com/lesson-learned-while-scaling-kubernetes-cluster-to-1000-pods-in-aws-eks-d2d399152bc2
How to avoid global outage — Seamlessly migrating DaemonSet labels
As a Site Reliability Engineering team, we continuously strive to improve the systems we operate. One way to do so is to stay up-to-date with upstream components. One of the components that needed some special care turned out to be a CSI Driver, which is installed in the Kubernetes cluster as a DaemonSet. Originally, the driver was installed in the cluster using a YAML manifest and kubectl. As the dev team moved to support Helm, we also wanted to use the Helm Chart for the driver to make our lives easier.
https://engineering.prezi.com/intro-4727024fc2c1
Comparing Kubernetes operators for PostgreSQL. Part 2: CloudNativePG
https://blog.palark.com/cloudnativepg-and-other-kubernetes-operators-for-postgresql
Container Security Site
This is a site with some container security resources. It is (and probably always will be) a work in progress, but hopefully you’ll find some useful information.
https://www.container-security.site