How we tested scaling to 10,000 Kubernetes clusters without missing a beat
https://www.spectrocloud.com/blog/how-we-tested-scaling-to-10-000-kubernetes-clusters-without-missing-a-beat
kro
https://github.com/kro-run/kro
This project aims to simplify the creation and management of complex custom resources for Kubernetes.
Kube Resource Orchestrator (kro) helps you define complex multi-resource constructs as reusable components in your applications and systems. It does this by providing a Kubernetes-native, vendor-agnostic way to define groupings of Kubernetes resources.
kro's fundamental custom resource is the ResourceGraphDefinition. A ResourceGraphDefinition defines collections of underlying Kubernetes resources. It can define any Kubernetes resources, either native or custom, and can specify the dependencies between them. This lets you define complex custom resources, and include default configurations for their use.
The kro controller will determine the dependencies between resources, establish the correct order of operations to create and configure them, and then dynamically create and manage all of the underlying resources for you.
kro is Kubernetes native and integrates seamlessly with existing tools to preserve familiar processes and interfaces.
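The dependency ordering described above can be illustrated with a topological sort. This is only a minimal sketch of the general idea, not kro's actual implementation: the resource names and the `creation_order` helper are hypothetical, and the graph is written by hand rather than derived from a ResourceGraphDefinition.

```python
from graphlib import TopologicalSorter

# Hypothetical resource graph: each key depends on the resources in its set.
# The names are illustrative and not taken from a real ResourceGraphDefinition.
dependencies = {
    "deployment": {"configmap", "secret"},
    "service": {"deployment"},
    "ingress": {"service"},
    "configmap": set(),
    "secret": set(),
}


def creation_order(graph):
    """Return the resources in an order where every dependency comes first."""
    # static_order() raises graphlib.CycleError if the graph contains a cycle.
    return list(TopologicalSorter(graph).static_order())


order = creation_order(dependencies)
print(order)
```

Any order produced this way creates `configmap` and `secret` before `deployment`, and `deployment` before `service`; the relative order of independent resources is not guaranteed.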
The Hidden Risk of Running WordPress on Kubernetes: Debugging an Unexpected Downtime Issue
https://medium.com/1000farmacie/the-hidden-risk-of-running-wordpress-on-kubernetes-debugging-an-unexpected-downtime-issue-e810bf4fb577
Understanding the 1MB Limit of Etcd in Kubernetes: Challenges with Helm Deployments
https://logeshbalu1998.medium.com/understanding-the-1mb-limit-of-etcd-in-kubernetes-challenges-with-helm-deployments-47ef41f37e9c
kubewall
https://github.com/kubewall/kubewall
A single binary to manage multiple Kubernetes clusters.
kubewall provides a simple, rich, real-time interface to manage and investigate your clusters.
Terratags: Enforce Tags on your AWS Terraform configuration
https://dev.to/quixoticmonk/terratags-enforce-tags-on-your-aws-terraform-configuration-1ck5
Azure Verified Module - Azure Landing Zones
P1: https://mikeguy.co.uk/posts/azure-verified-module-landing-zones-part-1
P2: https://mikeguy.co.uk/posts/azure-verified-module-landing-zones-part-2
In this article, we take a look at the Azure Verified Module for Azure Landing Zones, and how we can customise deployments.
What I Really Mean When I Say “Good Communication” in Incident Response
https://uptimelabs.io/good-communication-in-incident-response
As a Seasoned K8s Expert: An In-Depth Analysis of the OpenAI’s Incident and Mitigation Strategies
https://midbai.com/en/post/how-to-avoid-openai-incident
On December 11, 2024, OpenAI experienced a major outage caused by a failure in the Kubernetes cluster control plane. For outsiders, this may simply seem like an interesting incident, but as an insider, I analyzed this failure from a technical perspective.
Taming the Wild West of Research Computing: How Policies Saved Us a Thousand Headaches
https://alessandropomponio.medium.com/taming-the-wild-west-of-research-computing-how-policies-saved-us-a-thousand-headaches-9432558f5740
Harnessing the power of policy-driven governance in shared computing environments
We’re leaving Kubernetes
https://www.gitpod.io/blog/we-are-leaving-kubernetes
Kubernetes seems like the obvious choice for building out remote, standardized and automated development environments. We thought so too and have spent six years invested in making the most popular cloud development environment platform at internet scale. That’s 1.5 million users, where we regularly see thousands of development environments per day. In that time, we’ve found that Kubernetes is not the right choice for building development environments.
Reducing Pod Startup Time for Java Application on EKS
https://medium.com/@balu8095/reducing-pod-startup-time-for-java-application-on-eks-a4fc80482039
How It Works — Validating Admission Policy
https://ihcsim.medium.com/how-it-works-validating-admission-policy-0664d23ce230
Istio-Proxy Chaos in the Middle of a Snowy Morning
https://medium.com/@zehendiaries/istio-proxy-chaos-in-the-middle-of-a-snowy-morning-6fe437cf3996
December 4th, 2024, started as a peaceful, snowy morning. Around 8 AM, I settled into my work-from-home routine with a freshly brewed coffee. My usual workflow:
1. Check the Production dashboard to ensure everything is running smoothly — and it was.
2. Check my email and Slack to see if any team member needs help.
3. Open JIRA, pick up a task and get ready to dive into work.
There were no pressing issues to address. I opened JIRA and picked up a task to migrate one of our Infrastructure as Code repositories from Terragrunt to Terraform. Why we are doing that is a topic for another post.
No such luck! The peace and serenity didn’t last long. An alert popped up: one of our production services had gone down. What started as a calm Wednesday morning quickly turned into a troubleshooting adventure.
ETCD Production setup with TLS
https://blog.mohsen.co/etcd-production-setup-with-openssl-2b9ecd7e00d5
Mastering Compute Efficiency: Dynamic GPU Partitioning Strategies for Kubernetes-Based ML Systems
https://yashmehra2411.medium.com/mastering-gpu-efficiency-dynamic-partitioning-strategies-for-kubernetes-based-ml-systems-75100c94112b
Standardizing App Delivery with Flux and Generic Helm Charts
https://medium.com/@stefanprodan/standardizing-app-delivery-with-flux-and-generic-helm-charts-f66941f399e9
In this guide we will explore how Flux can be used to standardize the lifecycle management of applications by leveraging the Generic Helm Chart pattern.
The big promise of this pattern is that it should reduce the cognitive load on developers, as they only need to focus on the service-specific configuration, while the Generic Helm Chart shields them from the complexity of the Kubernetes API.
khronoscope
https://github.com/hoyle1974/khronoscope
Khronoscope is a tool inspired by k9s that lets you inspect the state of your Kubernetes cluster and, using VCR-like controls, travel back in time to see its state at any point since you started the application.
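One common way to support this kind of "state at any point in time" lookup is an append-only log of timestamped snapshots searched with binary search. The sketch below illustrates that general technique only; it is not taken from Khronoscope, and the `StateHistory` class, its methods, and the sample pod states are all hypothetical.

```python
import bisect


class StateHistory:
    """Append-only log of (timestamp, state) snapshots, kept in time order."""

    def __init__(self):
        self._times = []
        self._states = []

    def record(self, timestamp, state):
        # Assumption: snapshots arrive in non-decreasing timestamp order.
        if self._times and timestamp < self._times[-1]:
            raise ValueError("snapshots must be recorded in time order")
        self._times.append(timestamp)
        self._states.append(state)

    def at(self, timestamp):
        """Return the latest state recorded at or before `timestamp`, or None."""
        i = bisect.bisect_right(self._times, timestamp)
        return self._states[i - 1] if i else None


history = StateHistory()
history.record(10.0, {"pod-a": "Pending"})
history.record(20.0, {"pod-a": "Running"})
history.record(30.0, {"pod-a": "Running", "pod-b": "Pending"})

# "Rewind" to a moment between the second and third snapshots.
print(history.at(25.0))
```

Storing full snapshots keeps lookups simple at the cost of memory; a real tool might store per-resource deltas instead.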
kubectl-switch
https://github.com/mirceanton/kubectl-switch
kubectl-switch is a command-line tool for managing and switching between multiple Kubernetes configuration files located in the same directory. It simplifies the process of selecting a Kubernetes context from multiple kubeconfig files and updating the active configuration or namespace.
Just dump all your kubeconfigs into a single dir and let kubectl-switch manage them for you!
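To make the "directory of kubeconfigs" idea concrete, here is a minimal sketch of selecting one file from a directory and pointing the standard `KUBECONFIG` environment variable at it. This is not how kubectl-switch itself is implemented; the helper names `list_kubeconfigs` and `kubeconfig_env` are hypothetical, and the sketch treats every regular file in the directory as a kubeconfig without validating its contents.

```python
import os
from pathlib import Path


def list_kubeconfigs(directory):
    """Return the regular files in `directory`, sorted by path."""
    return sorted(p for p in Path(directory).iterdir() if p.is_file())


def kubeconfig_env(directory, name):
    """Build a copy of the environment with KUBECONFIG set to the chosen file."""
    matches = [p for p in list_kubeconfigs(directory) if p.name == name]
    if not matches:
        raise FileNotFoundError(f"no kubeconfig named {name!r} in {directory}")
    env = dict(os.environ)
    env["KUBECONFIG"] = str(matches[0])
    return env
```

The returned mapping could then be passed as `env=` to `subprocess.run(["kubectl", "get", "pods"], env=...)` so that only that one command sees the selected configuration.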