DevOps&SRE Library

PgBouncer on Kubernetes and how to achieve minimal latency

Experiments with connection poolers on Kubernetes for Postgres Operator

https://engineering.zalando.com/posts/2020/06/postgresql-connection-poolers.html

Unthrottled: Fixing CPU Limits in the Cloud

This year, my teammates and I solved a CPU throttling issue that affects nearly every container orchestrator with hard limits, including Kubernetes, Docker, and Mesos. In doing so, we lowered worst-case response latency in one of Indeed’s applications from over two seconds to 30 milliseconds. In this two-part series, I’ll explain our journey to find the root cause and how we ultimately arrived at the solution.

Part 1: https://medium.com/indeed-engineering/unthrottled-fixing-cpu-limits-in-the-cloud-a0995ede8e89

Part 2: https://medium.com/indeed-engineering/unthrottled-how-a-valid-fix-becomes-a-regression-f61eabb2fbd9

KubeDB by AppsCode

KubeDB by AppsCode is a production-grade cloud-native database management solution for Kubernetes. KubeDB simplifies and automates routine database tasks such as provisioning, patching, backup, recovery, failure detection, and repair for various popular databases on private and public clouds. It frees you to focus on your applications so you can give them the fast performance, high availability, security and compatibility they need.

https://github.com/kubedb/operator

How SLIs Help You Understand Users' Needs

https://www.blameless.com/blog/slis-understand-users-needs

Naming Applications and Microservices

https://srcco.de/posts/naming-applications-components-microservices.html

Code review checklist for distributed systems

- Define a path for error handling
- Have a plan for recovery
- Always set timeouts on remote system calls
- Retry on timeout
- Use circuit breaker
- Don't handle timeouts like a failure
- Don't invoke remote systems inside transactions
- Use smart batching
- All APIs MUST be idempotent
- Define response time and throughput SLAs explicitly and code to adhere to them
- Define and limit batch APIs
- Think about Observability up-front
- Cache aggressively
- Consider unit of failure
- Isolate external domain objects at the edge of the system
- Sanitize input at every edge
- Never commit credentials

https://www.kislayverma.com/post/code-review-checklist-for-distributed-systems
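
A few of the checklist items (timeouts on remote calls, retry on timeout, circuit breaker, idempotent APIs) can be sketched together in Python. This is only an illustration, not code from the linked article: `CircuitBreaker`, `CircuitOpenError`, `call_with_retry`, and the `idempotency_key` and `timeout` parameters are all hypothetical names, and `remote_call` stands in for whatever client function your system actually uses.

```python
import time
import uuid


class CircuitOpenError(Exception):
    """Raised when the breaker refuses to call the remote system."""


class CircuitBreaker:
    """Hypothetical minimal breaker: opens after N consecutive failures."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; skipping remote call")
            # Cool-down elapsed: let one trial call through (half-open).
            self.opened_at = None
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit again
        return result


def call_with_retry(remote_call, payload, *, timeout=2.0, attempts=3,
                    backoff=0.1, breaker=None, sleep=time.sleep):
    """Retry only on timeouts, reusing one idempotency key for every attempt.

    A timeout means the outcome is unknown, not that the call failed, so the
    same key is sent each time and the server (assumed to support it) can
    deduplicate a request that actually succeeded the first time.
    """
    key = str(uuid.uuid4())
    last_error = None
    for attempt in range(attempts):
        try:
            if breaker is not None:
                return breaker.call(remote_call, payload,
                                    idempotency_key=key, timeout=timeout)
            return remote_call(payload, idempotency_key=key, timeout=timeout)
        except TimeoutError as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(backoff * 2 ** attempt)  # exponential backoff
    raise last_error
```

In this sketch the breaker treats a timeout like any other failure when counting toward its threshold, which is a simplification; a `CircuitOpenError` is deliberately not retried, so callers fail fast while the remote system recovers.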

Vector

A lightweight and ultra-fast tool for building observability pipelines

https://github.com/timberio/vector

ConfigMaps in Kubernetes: how they work and what you should remember

https://medium.com/flant-com/configmaps-in-kubernetes-f9f6d0081dcb

I Found A Painless Way To Manage Secrets In Google Kubernetes Engine

https://hackernoon.com/i-found-a-painless-way-to-manage-secrets-in-google-kubernetes-engine-cs3d3uuz

How to measure Linux Performance Avoiding Most Typical Mistakes

CPU: https://ma.ttias.be/how-to-measure-linux-performance-avoiding-most-typical-mistakes-cpu

Disk: https://ma.ttias.be/how-to-measure-linux-performance-avoiding-most-typical-mistakes-disk-storage

Memory: https://ma.ttias.be/how-to-measure-linux-performance-avoiding-most-typical-mistakes-memory

Network: https://ma.ttias.be/how-to-measure-linux-performance-avoiding-most-typical-mistakes-network

Design review checklist for Distributed Systems

https://www.kislayverma.com/post/design-review-checklist-for-distributed-systems

Docker and Kubernetes — root vs. privileged

https://itnext.io/docker-and-kubernetes-root-vs-privileged-9d2a37453dec

Presslabs is the First Managed WordPress Hosting Platform running on Kubernetes

https://www.presslabs.com/blog/presslabs-is-the-first-managed-wordpress-hosting-platform-running-on-kubernetes

Verify your Kubernetes Cluster Network Policies: From Faith to Proof

https://blog.nody.cc/posts/2020-06-kubernetes-network-policy-verification

Install a Kubernetes load balancer on your Raspberry Pi homelab with MetalLB

https://opensource.com/article/20/7/homelab-metallb

Introducing Frigate

A documentation generation tool for Kubernetes Helm Charts

https://medium.com/rapids-ai/introducing-frigate-a-documentation-generation-tool-for-kubernetes-1791854031a1

Towards More Effective Incident Postmortems

https://www.squadcast.com/blog/towards-more-effective-incident-postmortems

The Building Blocks of DX: K8s Evolution from CLI to GitOps

https://medium.com/@kgamanji/the-building-blocks-of-dx-k8s-evolution-from-cli-to-gitops-a7a574ac10eb

Improving Incident Retrospectives at Indeed

https://www.learningfromincidents.io/blog/improving-incident-retrospectives-at-indeed

Minimum Viable Kubernetes

So just for fun, let's see what the absolute bare minimum "Kubernetes cluster" actually looks like.

https://eevans.co/blog/minimum-viable-kubernetes
