Auto-scaling and Load-based Scaling
https://blog.felipefr.dev/auto-scaling-and-load-based-scaling
Explains reactive metric-based scaling versus scheduled scaling and where each approach fits.
rtk
https://github.com/rtk-ai/rtk
CLI proxy that reduces LLM token consumption by 60-90% on common dev commands. Ships as a single Rust binary with zero dependencies.
Integration testing with Kubernetes
https://mikamu.substack.com/p/integration-testing-with-kubernetes
Shows a Rust-based integration testing workflow on kind with Terraform and cleanup policies for parallel runs.
Vault: secure Kubernetes authentication with HashiCorp Vault OIDC
https://phuchoang.sbs/posts/gitops-kubernetes-oidc-vault
Explains how to use Vault as an OIDC provider to replace static kubeconfig credentials with short-lived tokens.
Security Inside Kubernetes: Admission & Runtime Guardrails with Kyverno and KubeArmor
https://medium.com/globant/security-inside-kubernetes-admission-runtime-guardrails-with-kyverno-and-kubearmor-6d2f97264cbc
Covers layered Kubernetes security by combining Kyverno admission policies with KubeArmor runtime enforcement.
Crust-Gather - kubectl Cluster Snapshot Plugin
https://github.com/crust-gather/crust-gather
Open-source kubectl plugin for collecting a structured cluster snapshot for debugging and analysis.
Kogaro - Kubernetes Configuration Hygiene Agent
https://github.com/topiaruss/kogaro
Agent project focused on improving Kubernetes configuration hygiene and reducing misconfiguration risk.
llm-d: SOTA inference performance
https://github.com/llm-d/llm-d
Project targeting high-performance large language model inference workloads.
Kthena: Enterprise LLM serving
https://github.com/volcano-sh/kthena
Enterprise-oriented platform for serving and operating LLM workloads on Kubernetes.
Easykube: Local Kubernetes development
https://github.com/torloejborg/easykube
Tooling aimed at simplifying local Kubernetes development environments.
Guardon: Kubernetes security extension
https://github.com/guardon-dev/guardon
Security-focused extension project for strengthening Kubernetes environments.
We Cut Our Kubernetes Pods by 60% and Doubled Traffic Capacity
https://medium.com/@feridquluzade2002/we-cut-our-kubernetes-pods-by-60-and-doubled-traffic-capacity-b1cfb6850fca
This case study explains how JVM tuning, a smaller Hikari pool, and faster HPA scale-up doubled traffic capacity while reducing baseline pods.
Hidden Kubernetes Bad Practices Learned the Hard Way During Incidents
https://hackernoon.com/hidden-kubernetes-bad-practices-learned-the-hard-way-during-incidents
This article distills incident-driven lessons on troubleshooting, configuration mistakes, and operational habits that make Kubernetes outages worse.
From Chaos to 99.9% Uptime: Rebuilding a Kubernetes Platform for GPU Workloads
https://medium.com/@mateenali66/from-chaos-to-99-9-uptime-rebuilding-a-kubernetes-platform-for-gpu-workloads-4fadb1067a0b
This article covers rebuilding a Kubernetes platform for GPU workloads to reach 99.9% uptime after operational instability.
Benchmarking Kubernetes Log Collectors: vlagent, Vector, Fluent Bit, OpenTelemetry Collector, and more
https://victoriametrics.com/blog/log-collectors-benchmark-2026/index.html
At VictoriaMetrics, we built vlagent as a high-performance log collector for VictoriaLogs. To validate its performance and correctness under a real production-like load, we developed a benchmark suite and ran it against 8 popular log collectors. This post covers the methodology, throughput results, resource usage, and delivery correctness.
Making and scaling a game server in Kubernetes using Agones
https://noe-t.dev/posts/making-and-scaling-a-game-server-in-k8s-using-agones
This tutorial walks through building a Go game server with Agones, matchmaking, Fleet allocation, and autoscaling on Kubernetes.
PostgreSQL migration with CloudNativePG Logical Replication on Kubernetes - Zero-Downtime
https://kndoni.medium.com/postgresql-migration-with-cloudnativepg-logical-replication-on-kubernetes-zero-downtime-aef1c33a3a53
This tutorial shows how to migrate PostgreSQL to CloudNativePG on Kubernetes with logical replication and no downtime.
Gateway API setup on GKE with NGINX Gateway Fabric
https://medium.com/@henrikamirbekyan/gateway-api-setup-on-gke-with-nginx-gateway-fabric-1b0d0ec3bbf3
This tutorial shows how to deploy NGINX Gateway Fabric on GKE with Terraform, split traffic paths, and automate TLS certificates.
Migrating Kubernetes Off Big Cloud
https://kube.fm/migrating-kubernetes-off-big-cloud-fernando
This interview compares the cost and operational tradeoffs of moving a Kubernetes workload from GKE Autopilot to Hetzner with Edka.
GoKubeDownscaler
https://github.com/caas-team/GoKubeDownscaler
A horizontal autoscaler for Kubernetes workloads that saves cloud costs by scaling workloads down after hours. It is a Go port and successor of the popular (py-)kube-downscaler, with improvements and quality-of-life changes.