Kube Builders
News and links on infrastructure and building Kubernetes clusters curated by the @Learnk8s team
Forwarded from LearnKube news
This case study shows how the OOM killer terminated a critical network daemon on Kubernetes nodes, causing a network outage.

It covers debugging via serial console and implementing memory reservations to prevent system-critical process termination.

More: https://ku.bz/_TSW8pWsq
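
To make the memory-reservation idea concrete, here is a minimal sketch of the arithmetic. It assumes the reservations are set through the kubelet's systemReserved, kubeReserved, and evictionHard settings. The post does not say which mechanism the case study used, and the node size and reservation values below are hypothetical.

```python
# Sketch: memory left for pods once reservations are subtracted from node capacity.
# Assumption: allocatable = capacity - kubeReserved - systemReserved - evictionHard,
# as described in the Kubernetes docs on reserving compute resources.
# All values below are hypothetical and not taken from the case study.

def parse_mi(quantity: str) -> int:
    """Convert a quantity such as '512Mi' or '2Gi' to MiB (only Mi and Gi are handled)."""
    units = {"Mi": 1, "Gi": 1024}
    for suffix, factor in units.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    raise ValueError(f"unsupported quantity: {quantity}")


def allocatable_mi(capacity: str, system_reserved: str,
                   kube_reserved: str, eviction_hard: str) -> int:
    """Return the memory (in MiB) the scheduler may hand out to pods."""
    return (parse_mi(capacity)
            - parse_mi(system_reserved)
            - parse_mi(kube_reserved)
            - parse_mi(eviction_hard))


if __name__ == "__main__":
    # Hypothetical 8Gi node with 1Gi held back for system daemons.
    print(allocatable_mi("8Gi", "1Gi", "512Mi", "100Mi"))  # 6556
```

The larger the reserved slice, the less memory pods can claim, which leaves headroom for node-level daemons when workloads spike.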
Cluster API is a Kubernetes subproject that provides declarative APIs and tooling to provision, upgrade, and operate Kubernetes clusters across infrastructure providers using Kubernetes-style automation patterns.

More: https://ku.bz/hWQpSM-pn
Forwarded from LearnKube news
This week on Learn Kubernetes Weekly 185:

🔥 A One-Line Kubernetes Fix That Saved 600 Hours a Year
🔐 Why Kubernetes Has No Login — And How We Solved It for AuditRadar
⚙️ Durable Workflows Beyond Vercel: Version-Safe Orchestration for Kubernetes
🧩 The Missing Layers in Your Kubernetes Operator
🚨 Why Your KServe InferenceService Won't Become Ready: Four Production Failures and Fixes

Read it now: https://kube.today/issues/185

⭐️ This issue is brought to you by Qodo, the AI code integrity platform helping teams review, test, and ship reliable infrastructure code faster https://ku.bz/NvLHsnl-6
This article explains how a team deployed Ansible AWX on K3s and extended it for OpenStack inventory, dynamic SSH users, execution nodes, custom execution environments, and air-gapped installs.

More: https://ku.bz/6Ms2R5RTk
Forwarded from KubeFM
GPU requests often run 2-3x higher than actual consumption in inference workloads. Why?

Andrew Hillier explains the core problem: inference is transactional, not batch. GPUs sit idle between requests, but you still have to size for peak load. Unlike CPUs with Linux schedulers filling utilization gaps, GPUs run monolithic models — what you allocate is what you get.

The fix? MIG (Multi-Instance GPU) to partition GPUs, or time slicing for less critical workloads. Both help squeeze more value out of expensive hardware.

Watch the full interview: https://ku.bz/wL-0d1X0y
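
As a rough illustration of the sizing argument above, the sketch below compares peak and average GPU demand for a bursty inference workload. The utilization samples are hypothetical and only show how sizing for peak can produce a request in the 2-3x range.

```python
# Sketch: why a GPU request sized for peak load exceeds average consumption.
# The samples are hypothetical per-interval GPU utilization percentages.

def overprovision_ratio(samples: list[int]) -> float:
    """Return peak demand divided by mean demand."""
    if not samples or sum(samples) == 0:
        raise ValueError("need at least one non-zero sample")
    peak = max(samples)
    mean = sum(samples) / len(samples)
    return peak / mean


if __name__ == "__main__":
    # Bursty, request-driven load: busy while serving, idle in between.
    utilization = [10, 90, 20, 0, 80, 40]
    print(overprovision_ratio(utilization))  # 2.25
```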
KubeSolo is a single-node Kubernetes distribution optimized for edge, IoT and embedded devices.

It eliminates clustering and etcd, uses SQLite via Kine, and runs in under 200MB RAM while remaining OCI-compliant and Helm-ready.

More: https://ku.bz/SPpVGdZ5Y
Forwarded from Kube Architect
Percona vs MongoDB Community vs KubeDB vs Atlas — which operator should you run for MongoDB on Kubernetes?

Full breakdown + architecture + PITR guide →
https://ku.bz/2n-smMsxC
Forwarded from KubeFM
What does Stacey Potter have in store for KCD New York?

A practical conversation about why open-source security still creates too much cognitive load, why secure-by-design can't succeed without broad adoption, and how projects like SLSA and Sigstore help make security resources more useful and accessible—not just academically correct.

If you're interested in open-source security, software supply chain security, cloud-native infrastructure, platform engineering, and community-driven security practices, this session is a strong reason to get your ticket.

We also have 10 free tickets available—email hello@kube.events to grab one before they're gone.

🌎 https://ku.bz/JkjmffBzw
This tutorial shows how to migrate Amazon EKS VPC CNI from a self-managed DaemonSet to an AWS managed add-on by preserving custom env settings, moving permissions to IRSA, and avoiding downtime during adoption.

More: https://ku.bz/HLl9fhxc7
Forwarded from KubeFM
What is Udi Hofesh bringing to KCD New York?

Kubernetes was never easy, and AI workloads just turned the difficulty up to eleven. Udi will break down why operations are getting harder, where the cost pressure is coming from, and how AI SRE is a practical answer—not a buzzword.

We also have 10 free tickets available—email hello@kube.events to grab one.

KCD website: https://ku.bz/JkjmffBzw
Forwarded from Kubesploit
This tutorial shows how to run Cloudflare Tunnels as a DaemonSet to expose services with zero open inbound ports, using liveness probes, Kubernetes Secrets, and GitOps with ArgoCD.

More: https://ku.bz/RYlKnctWf
Forwarded from KubeFM
Mike Stefaniak, Head of Product, Kubernetes and Registries at Amazon Web Services (AWS), discusses the challenges of operating across multiple Kubernetes clusters and environments without requiring custom scripting or multiple kubeconfig files.

Mike outlines AWS's strategy to host the MCP server centrally, providing AWS with context for all clusters across accounts and regions. This architectural shift transforms troubleshooting from a single-cluster operation to fleet-wide visibility, eliminating the need for users to configure access to individual clusters manually.

Watch the full interview: https://ku.bz/PzjrglcZJ
Forwarded from LearnKube news
📣 New on LearnKube: "The mechanics of Kubernetes RBAC and how it connects users to permissions."

Kubernetes RBAC can feel confusing because the object names sound broader than the scope they actually grant.

A ClusterRole does not always mean cluster-wide access.

If you bind a ClusterRole with a RoleBinding, the permissions apply only in the namespace where the RoleBinding lives.

The article walks through:

- Why direct user-to-permission mappings do not scale
- How Roles and ClusterRoles group permissions into reusable sets
- How RoleBindings and ClusterRoleBindings connect identities to permissions
- How to test access with kubectl auth can-i

Read the full guide:
https://learnkube.com/rbac-kubernetes
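
The scoping rule described in this post can be sketched with a toy model. This is not the Kubernetes authorizer. The Rule and Binding classes, the can_i helper, and the user and namespace names are all hypothetical, chosen only to show that a binding's scope, not the role's kind, decides where permissions apply.

```python
# Toy model of RBAC scoping (hypothetical types, not a Kubernetes API).
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    verb: str
    resource: str


@dataclass(frozen=True)
class Binding:
    subject: str
    rules: tuple               # rules of the referenced Role or ClusterRole
    namespace: Optional[str]   # set for a RoleBinding, None for a ClusterRoleBinding


def can_i(bindings, subject, verb, resource, namespace) -> bool:
    """Return True if any binding grants the verb on the resource in the namespace."""
    for b in bindings:
        if b.subject != subject:
            continue
        # A RoleBinding only applies inside its own namespace.
        if b.namespace is not None and b.namespace != namespace:
            continue
        if Rule(verb, resource) in b.rules:
            return True
    return False


# Rules of a ClusterRole that can read pods.
pod_reader = (Rule("get", "pods"), Rule("list", "pods"))

# The ClusterRole is bound with a RoleBinding in "dev", so access stays in "dev".
bindings = [Binding("jane", pod_reader, namespace="dev")]

if __name__ == "__main__":
    print(can_i(bindings, "jane", "list", "pods", "dev"))   # True
    print(can_i(bindings, "jane", "list", "pods", "prod"))  # False
```

Against a real cluster, the equivalent check would be something like kubectl auth can-i list pods --namespace dev --as jane.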
Cyphernetes lets you query the Kubernetes API as if it were a graph database and discover relationships between resources.

More: https://ku.bz/5vrBXrCHN
CloudNativePG is the Kubernetes operator that covers the entire lifecycle of a highly available PostgreSQL database cluster with a primary/standby architecture, using native streaming replication.

More: https://ku.bz/n6gpgcYtf
Forwarded from LearnKube news
This week on Learn Kubernetes Weekly 186:

🔥 1 Million Tokens Per Second: Qwen 3.5 27B on GKE with B200 GPUs
🤖 How I Built Kernel: An AI-Powered IT Helpdesk That Deflects 80% of Support Tickets
⚙️ Ansible AWX: Infrastructure Automation on Top of Kubernetes
🛡️ I Setup Kubermatic SecureGuard Before It Even Existed
🔐 SRE: Secrets Management in Kubernetes

Read it now: https://kube.today/issues/186

⭐️ This newsletter is brought to you by StormForge by CloudBolt. Stop setting Kubernetes requests. Let ML handle rightsizing https://ku.bz/2wYKp0Q2Y
This article shows why Grafana becomes slow on Kubernetes when multiple replicas share SQLite over EFS, and explains why a single replica on block storage or a real external database is the correct fix.

More: https://ku.bz/JGj7gl5wt
Forwarded from KubeFM
Billy Thompson, DevOps Platform Engineering - Office of the CTO @ Akamai, discusses the strategic choice between building custom Kubernetes tools and adopting existing CNCF projects.

The discussion provides a practical framework for evaluating time investment, maintenance capacity, and the broader impact of tooling decisions in Kubernetes environments.

Watch the full interview: https://ku.bz/Jk2xSwXHp

This interview is a reaction to Alessandro Pomponio's episode https://ku.bz/5sK7BFZ-8
Forwarded from LearnKube news
Pumba lets you kill, pause, and stress containers while injecting network delays, packet loss, and corruption.

You can deploy it as a DaemonSet for cluster-wide chaos engineering.

More: https://ku.bz/K7_RB9tSq
The Kubectl OpenAI plugin generates and applies Kubernetes manifests using OpenAI GPT.

More: https://ku.bz/fxBdsk7Kf