trippy
https://github.com/fujiapple852/trippy
Trippy combines the functionality of traceroute and ping and is designed to assist with the analysis of networking issues.
Prometheus metrics at 37signals
https://dev.37signals.com/prometheus-metrics-at-37signals
How we use Prometheus to ingest, store, and alert based on metrics.
37signals datacenter overview
https://dev.37signals.com/37signals-datacenter-overview
During our journey off the cloud, we’ve received a lot of questions about our datacenters. No, we do not run them on our own. I’m here to discuss at a high level what 37signals’ datacenter presence looks like.
qryn
https://github.com/metrico/qryn
Polyglot, lightweight, multi-standard drop-in observability framework for logs, metrics, and traces.
PostgreSQL High-Availability Cluster
https://github.com/vitabaks/postgresql_cluster
Deploy a Production Ready PostgreSQL High-Availability Cluster (based on "Patroni" and DCS "etcd" or "consul"). Automating with Ansible.
walk
https://github.com/antonmedv/walk
Walk — a terminal navigator.
Why another terminal navigator? I wanted something simple and minimalistic: something to help me navigate the filesystem faster, a cd and ls replacement. So I built walk. It allows for quick navigation with fuzzy searching, and cd integration is quite simple. You can also open vim right from walk. That's it.
rot
https://github.com/candiddev/rot
Rot is an open source command line (CLI) tool for managing secrets.
pr-agent
https://github.com/Codium-ai/pr-agent
CodiumAI PR-Agent is an open-source tool for efficient pull request reviewing and handling.
redb
https://github.com/cberner/redb
A simple, portable, high-performance, ACID, embedded key-value store.
kube-job
https://github.com/h3poteto/kube-job
Run a one-off job on Kubernetes from the command line.
A Glimpse into the Redesigned Goku-Ingestor vNext at Pinterest
https://medium.com/pinterest-engineering/a-glimpse-into-the-redesigned-goku-ingestor-vnext-at-pinterest-d68159473464
Better performance, lower cost and less code complexity
Simplicity
https://commandcenter.blogspot.com/2023/12/simplicity.html
In May 2009, Google hosted an internal "Design Wizardry" panel, with talks by Jeff Dean, Mike Burrows, Paul Haahr, Alfred Spector, Bill Coughran, and myself. Here is a lightly edited transcript of my talk. Some of the details have aged out, but the themes live on, now perhaps more than ever.
A deep dive into CPU requests and limits in Kubernetes
https://www.datadoghq.com/blog/kubernetes-cpu-requests-limits
In this post, we are going to dive a bit deeper into CPU and share some general recommendations for specifying CPU requests and limits. We will also explore the differences between using the default policy (CFS quota) and the CPU Manager’s static policy. We are not going to consider memory resources in this post.
A Spooky Performance Regression in AWS EBS Volumes
https://www.dolthub.com/blog/2023-11-22-spooky-performance-regression-aws-ebs
Christmas Come Early: An AWS EBS Performance Regression Update
https://www.dolthub.com/blog/2023-12-08-christmas-come-early-ebs-performance-regression-update
Scaling SRE Teams
https://dzone.com/articles/scaling-sre-teams
Scaling teams of site reliability engineers comes with many challenges. Here, explore the challenges of scaling and review a successful scaling framework.
Mastering AWS Lambda with Terraform: A Comprehensive Guide
https://blog.awsfundamentals.com/aws-lambda-with-terraform
VictoriaMetrics: A Comprehensive Guide, Comparing It to Prometheus, and Implementing Kubernetes Monitoring
https://medium.com/@seifeddinerajhi/victoriametrics-a-comprehensive-guide-comparing-it-to-prometheus-and-implementing-kubernetes-03eb8feb0cc2
Kubernetes And Kernel Panics
https://netflixtechblog.com/kubernetes-and-kernel-panics-ed620b9c6225
How Netflix’s Container Platform Connects Linux Kernel Panics to Kubernetes Pods
Kubewatch: A Kubernetes Watcher for Observability and Monitoring
https://medium.com/@seifeddinerajhi/kubewatch-a-kubernetes-watcher-for-observability-and-monitoring-d6dea1dbeb06
https://github.com/robusta-dev/kubewatch
Kubewatch is a Kubernetes watcher that publishes notifications to available collaboration hubs/notification channels. It watches the cluster for resource changes and notifies you through webhooks.