An article by Charity Majors on why thinking of Observability in pillars is limiting.
I recall a similar article from the past about how Facebook does their observability. It’s somewhere here on the channel.
The core idea is to treat all signals as universal wide events, which preserves the full context and spares you from hopping between different tools.
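To make the wide-event idea a bit more concrete, here is a minimal sketch. It is not taken from the article: the field names, services, and values are all made up for illustration. The point is only that metric-like, log-like, and trace-like views can be derived from the same records.

```python
from collections import Counter

# Hypothetical wide events: one record per unit of work, carrying all context.
# Every field name and value here is an assumption made for illustration.
events = [
    {"trace_id": "t1", "span_id": "s1", "parent_id": None, "service": "api",
     "route": "/checkout", "status": 500, "duration_ms": 840,
     "message": "payment provider timeout"},
    {"trace_id": "t1", "span_id": "s2", "parent_id": "s1", "service": "payments",
     "route": "/charge", "status": 504, "duration_ms": 800,
     "message": "upstream timeout"},
    {"trace_id": "t2", "span_id": "s3", "parent_id": None, "service": "api",
     "route": "/cart", "status": 200, "duration_ms": 35, "message": "ok"},
]


def error_count_by_service(events):
    """Metric-like view: number of events with status >= 500 per service."""
    return Counter(e["service"] for e in events if e["status"] >= 500)


def log_lines(events):
    """Log-like view: one human-readable line per event."""
    return [f'{e["service"]} {e["route"]} {e["status"]}: {e["message"]}'
            for e in events]


def trace(events, trace_id):
    """Trace-like view: span ids of one trace, in the order they were recorded."""
    return [e["span_id"] for e in events if e["trace_id"] == trace_id]
```

Since every view is computed from the same events, a question like "which route and which downstream call were behind this 500?" can be answered without switching tools.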
#observability
charity.wtf
How many pillars of observability can you fit on the head of a pin?
My day started off with an innocent question, from an innocent soul. “Hey Charity, is profiling a pillar?” I hadn’t even had my coffee yet. “Someone was just telling me that profiling is the fourth…
👍8🤯1
For today's Donations Monday, I'd like to share with you a fundraiser for the Optic Dragons unit - a specialized FPV drone assault unit of the 92nd Separate Assault Brigade.
They're raising funds for optical fiber drones, spare parts for converting drones to fiber optics, and supporting combat vehicles of pilot crews. The unit has been redeployed to the Pokrovsk direction where the situation is intense and they need more drone reels for optical drones.
Direct donation link:
https://send.monobank.ua/jar/7D7whfQHfF
Card number: 4441 1111 2291 2961
#donations #Ukraine
❤4👍1
An interesting lab from AWS: an overengineered solution for right-sizing Kubernetes workloads.
Should you implement it this way? I don't know. But maybe you want to play with GitOps, Amazon Bedrock, and all that stuff.
Also, it's funny how they say at the beginning that running VPA and Goldilocks inside a cluster is overhead and an additional management burden, and then propose creating a cluster in the GHA runtime and using generative AI to address that.
#aws #kubernetes
Amazon
Kubernetes right-sizing with metrics-driven GitOps automation | Amazon Web Services
In this post, we introduce an automated, GitOps-driven approach to resource optimization in Amazon EKS using AWS services such as Amazon Managed Service for Prometheus and Amazon Bedrock. The solution helps optimize Kubernetes resource allocation through…
❤2😁1🤔1
For people nostalgic for on-premises setups: Dropbox revealed their new-generation hardware setup and the challenges they face storing exabytes of data.
#on_prem
dropbox.tech
Seventh-generation server hardware at Dropbox: our most efficient and capable architecture yet
This generation represents our most efficient, capable, and scalable architecture yet—and it’ll help us as we continue to build AI products like Dropbox Dash.
🤔5😁1
Press F to pay respects.
>>> Ingress NGINX Retirement: Kubernetes SIG Network and the Security Response Committee are announcing the upcoming retirement of Ingress NGINX. Best-effort maintenance will continue until March 2026. Afterward, there will be no further releases, no bugfixes, and no updates to resolve any security vulnerabilities that may be discovered. Existing deployments of Ingress NGINX will continue to function and installation artifacts will remain available.
Announcement page.
#kubernetes #nginx
GitHub
GitHub - kubernetes/ingress-nginx: Ingress NGINX Controller for Kubernetes
Ingress NGINX Controller for Kubernetes. Contribute to kubernetes/ingress-nginx development by creating an account on GitHub.
🫡33😱6❤1
A new issue of the CatOps Digest is here:
https://newsletter.catops.dev/p/catops-digest-2025-11-14
#digest #newsletter
newsletter.catops.dev
CatOps Digest 2025-11-14
What was on CatOps in the last couple of weeks...
🔥2🤔1🤨1
For today's Donations Monday, I'd like to ask you to donate to the administrative needs of the "Come Back Alive" foundation.
It takes tremendous effort to run a foundation like this, and although they could, they do not take money for operational needs from regular donations. So it's important to help them cover those needs as well!
https://savelife.in.ua/en/donate-en/#donate-fund-card-once
#donations #Ukraine
👍10
We don't know why Cloudflare is down: their status page is not as detailed as the AWS one.
However, you can still check out some books on Humble Bundle:
- Data engineering & data science by O'Reilly.
- Software architecture by Pearson.
#books #bundle
Humble Bundle
Humble Tech Book Bundle: Data Engineering & Science by O'Reilly
Become an expert on data science and software engineering for this library of ebooks from O’Reilly! All purchases support Code for America!
❤3
A postmortem from Cloudflare for yesterday’s outage is now available.
tl;dr:
>>>
The issue was not caused, directly or indirectly, by a cyber attack or malicious activity of any kind. Instead, it was triggered by a change to one of our database systems' permissions which caused the database to output multiple entries into a “feature file” used by our Bot Management system. That feature file, in turn, doubled in size. The larger-than-expected feature file was then propagated to all the machines that make up our network.
<<<
Another interesting thing:
>>>
Unrelated to this incident, we were and are currently migrating our customer traffic to a new version of our proxy service, internally known as FL2. Both versions were affected by the issue, although the impact observed was different.
Customers deployed on the new FL2 proxy engine, observed HTTP 5xx errors. Customers on our old proxy engine, known as FL, did not see errors, but bot scores were not generated correctly, resulting in all traffic receiving a bot score of zero. Customers that had rules deployed to block bots would have seen large numbers of false positives. Customers who were not using our bot score in their rules did not see any impact.
<<<
So, if you were not affected yesterday, you know why now.
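Why a bot score of zero turns into false positives can be shown with a tiny sketch. This is not Cloudflare's actual rule engine. It assumes a customer rule of the form "block when the score is below a threshold", with a hypothetical threshold of 30 and made-up scores where lower means "more likely a bot".

```python
# Hypothetical customer rule: block requests whose bot score is below a threshold.
# The threshold and the sample scores are assumptions for illustration only.
BLOCK_BELOW = 30


def should_block(bot_score: int, threshold: int = BLOCK_BELOW) -> bool:
    """Return True when the rule would block a request with this score."""
    return bot_score < threshold


# Normal operation: scores vary, so only the low-scoring request is blocked.
healthy_scores = [95, 88, 12, 70]
# During the incident on the old proxy: every request got a score of zero.
degraded_scores = [0, 0, 0, 0]

blocked_healthy = sum(should_block(s) for s in healthy_scores)
blocked_degraded = sum(should_block(s) for s in degraded_scores)
```

With every score forced to zero, the same rule blocks all traffic, humans included, while customers without such a rule notice nothing.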
#postmortem #cloudflare
The Cloudflare Blog
Cloudflare outage on November 18, 2025
Cloudflare suffered a service outage on November 18, 2025. The outage was triggered by a bug in generation logic for a Bot Management feature file causing many Cloudflare services to be affected.
🤔12👌1
It's been a while since we had simple how-to articles here. So, here you are:
How to enable the JMX port on Jenkins.
It's short and actionable, and you would be surprised to learn how many people still use Jenkins to this day.
#ci #java #debug
Medium
Jenkins JVM monitoring with JMX remote
If you ever encountered Jenkins misbehaving, running out of memory or just curious too see whats happening inside, Enabling remote JMX…
❤6😁4
Always Be Ready to Leave (Even If You Never Do) is not about keeping your CV up to date or socializing with recruiters, as the title might suggest. It’s a short article on work habits that keep you more efficient and, probably, happier at work, even if those same habits would eventually make it easier for you to quit, should you choose to.
#culture
Andrea Canton
Always Be Ready to Leave (Even If You Never Do) ~ Andrea Canton
After 7 years at the same company, I'm moving on. But the practices that made my exit smooth aren't exit strategies—they're professional habits everyone should build, whether they're staying or leaving.
❤15🔥2
For today’s Donations Monday, I would like to remind you about the foundation that we’ve been partnering with for DevOps Days Ukraine for years now.
UA Responders. Their specialization is medical equipment and such.
#donations #Ukraine
uaresponders.org
UA Responders
Your rescue buddy
❤4
Do you have the "What went well" section in your postmortems?
Here's an argument for having one, with an explanation of why it is important.
tl;dr: Because while each incident is different, there is a set of skills and behaviors that allow one to improvise under pressure to mitigate an incident. These skills and behaviors can be taught as well, and your "What went well" section is also for that.
#sre #incidents
Surfing Complexity
“What went well” is more than just a pat on the back
When writing up my impressions of the GCP incident report, Cindy Sridharan’s tweet reminded me that I failed to comment on an important part of it, how the responders brought the overloaded s…
🔥5👍2
For today’s Donations Monday, let’s help the “Тихо” foundation raise money for FPV and Vampire drones.
https://send.monobank.ua/jar/WaFbzLzNK
This fundraiser was shared by a close friend of mine, so I trust it.
#donations #Ukraine
❤3
The bot I used for years to make posts into this channel has finally died. So, it seems like I won't be able to make neat buttons anymore :\
Yet, I have a couple of time-sensitive things for y'all:
- Cybersecurity books bundle by Packt
- Hacking book bundle by No Starch Press
Another time-sensitive topic: our friends at DOU are running their winter salary survey. More participants mean more accurate results, so jump in!
https://dou.ua/goto/rJks
#security #dou
Humble Bundle
Humble Tech Book Bundle: Ultimate Cybersecurity Career by Packt Encore
Jump-start your exciting new cybersecurity career with this outstanding library of tech courses. Pay what you want & support World Central Kitchen!
❤3🎉2🤔1
Ok, the bot is online again!
Yesterday, I watched a video from KubeCon NA by Denys Vasyliev (in Ukrainian), and at some point they discussed the twilight of open source, as the major players have shifted their focus towards monetization and proprietary solutions.
And just today, I learned that MinIO (S3-compatible storage) has been moved into "maintenance" mode.
Here's a discussion on Reddit about the alternatives.
#open_source #minio
🤬3❤1
I don't know at what point we can all collectively agree that front-end frameworks have gone too far in their complexity.
Yet here we are with the Cloudflare preliminary postmortem:
>>>
A change made to how Cloudflare's Web Application Firewall parses requests caused Cloudflare's network to be unavailable for several minutes this morning. This was not an attack; the change was deployed by our team to help mitigate the industry-wide vulnerability disclosed this week in React Server Components. We will share more information as we have it today.
<<<
https://www.cloudflarestatus.com/incidents/lfrm31y6sw9q
#cloudflare #postmortem
❤7🔥1
At least Cloudflare is fast at sharing their postmortems.
https://blog.cloudflare.com/5-december-2025-outage/
A curious thing is this:
>>>
Customers that have their web assets served by our older FL1 proxy AND had the Cloudflare Managed Ruleset deployed were impacted. All requests for websites in this state returned an HTTP 500 error, with the small exception of some test endpoints such as /cdn-cgi/trace.
<<<
IIRC, in the previous incident on Nov 18, only the customers on the newer proxy version saw HTTP 5xx errors. So, one could say that Cloudflare had a single time-distributed total outage.
Another important thing:
>>>
Before the end of next week we will publish a detailed breakdown of all the resiliency projects underway, including the ones listed above. While that work is underway, we are locking down all changes to our network in order to ensure we have better mitigation and rollback systems before we begin again.
<<<
Honestly, looking forward to seeing the write-up. I can only imagine how stressed their team is after taking down a big chunk of the Internet twice in less than 30 days.
#cloudflare #postmortem
The Cloudflare Blog
Cloudflare outage on December 5, 2025
Cloudflare experienced a significant traffic outage on December 5, 2025, starting approximately at 8:47 UTC. The incident lasted approximately 25 minutes before resolution. We are sorry for the impact that it caused to our customers and the Internet. The…
👍5🔥2
This isn't a technical article, but I would say it's still an important one. It's about the importance of making your work visible.
Shadow work in engineering teams.
For better or worse, in many companies the promotion cycle is a popularity contest, so you need to act accordingly.
This article is aimed at managers, but you may find it useful as an individual contributor as well.
#culture
newsletter.manager.dev
Shadow work in engineering teams
And the price your team pays for it
❤13👍1