DevOps&SRE Library

kubeskoop

KubeSkoop is a Kubernetes networking diagnostic tool for different CNI plug-ins and IaaS providers. KubeSkoop automatically constructs the network traffic graph of a Pod in the Kubernetes cluster and uses eBPF to monitor and analyze the kernel's critical path, helping resolve most Kubernetes cluster network problems.

https://github.com/alibaba/kubeskoop

kubezoo

KubeZoo is a lightweight gateway service that leverages the existing namespace model and adds multi-tenancy capability to an existing Kubernetes cluster. KubeZoo provides view-level isolation among tenants by capturing and transforming the requests and responses.

https://github.com/kubewharf/kubezoo

kargo

Kargo is a next-generation continuous delivery and application lifecycle orchestration platform for Kubernetes. It builds upon GitOps principles and integrates with existing technologies, like Argo CD, to streamline and automate the progressive rollout of changes across the many stages of an application's lifecycle.

https://github.com/akuity/kargo

wireguard-ui

A web user interface to manage your WireGuard setup.

https://github.com/ngoduykhanh/wireguard-ui

Setting Java Heap Size Inside a Docker Container

https://medium.com/nordnet-tech/setting-java-heap-size-inside-a-docker-container-b5a4d06d2f46

Troubleshooting Missing Kubernetes Logs in Elasticsearch

https://povilasv.me/troubleshooting-missing-kubernetes-logs-in-elasticsearch

Optimizing Kubernetes scalability and cost-efficiency with Karpenter

In this post, you’ll learn the rationale and approach taken by Miro’s Compute team to enhance Kubernetes cluster scaling and efficiency. The team adopted groupless node pools using Karpenter, which helped reduce compute costs in non-production clusters by up to 60% while raising production resource usage efficiency to as much as 95%.

https://medium.com/miro-engineering/optimizing-kubernetes-scalability-and-cost-efficiency-with-karpenter-356153fcf546

Handling Pods When Nodes Fail

In addition to the basic Pod types, Kubernetes offers a variety of higher-level workload types, such as Deployment, DaemonSet, and StatefulSet. These higher-level controllers allow you to provide services with multiple replicas of your Pods, making it easier to achieve a high availability architecture.

However, when Kubernetes nodes experience failures such as crashes, network disruptions, or system failures, what happens to the Pods running on those nodes?

From a high availability perspective, some might think that having multiple replicas of an application ensures that the service remains unaffected by node failures. However, in certain cases where the application belongs to a StatefulSet, horizontal scaling isn’t an option. In such scenarios, it becomes necessary to quickly reschedule the related Pods to maintain service availability in the event of node failures.

https://hwchiu.medium.com/handling-pods-when-nodes-fail-4daae20213b

Kubernetes V1.27: Safeguarding Pod with MemoryThrottlingFactor

https://faun.pub/kubernetes-v1-27-safeguarding-pod-with-memorythrottlingfactor-cfbccde10de

dcgm-exporter

This repository contains the DCGM-Exporter project, a GPU metrics exporter for Prometheus that leverages NVIDIA DCGM.

https://github.com/NVIDIA/dcgm-exporter
