Мониторим ИТ

InfluxDB Monitoring With ZABBIX Made Simple

A fresh video from Dmitry Lambert's channel.

A step-by-step tutorial on how to configure InfluxDB monitoring and data collection with the open-source monitoring tool Zabbix. Simple installation and configuration to start pulling the first metrics from your InfluxDB instance. Display it in your Zabbix…

What's new in PostgreSQL monitoring (Alexey Lesovsky)

A transcript of Alexey Lesovsky's talk on what's new in PostgreSQL in terms of monitoring. Read on Habr.

How to reduce your Prometheus cost

The authors describe how they reduced the number of metrics ingested into Prometheus. Read on Medium.

How Prometheus Operator facilitates Prometheus configuration updates

The goal: Update Prometheus configuration nicely! Read more on Medium.

Using Environment Variables for Configuration, Provisioning, and Dashboards in Grafana

The number of use cases operating Grafana as a platform to build modern applications is increasing. Compared to a single central Grafana instance, we are looking at multiple distributed installations with new kinds of data sources. Read more on Medium.

Authentication bypass and ways to execute arbitrary code in ZABBIX

This article discusses several attacks on the Zabbix monitoring system and walks through remote code execution (RCE) scenarios. Continue on Habr.

Grafana and autotests: learning to measure how tests perform

Grafana lets you gather a variety of information on a single screen:
⚡️real-time test results,
⚡️breakdowns by environment, browser, or anything else,
⚡️test execution speed,
⚡️test coverage of pages and the actions on them,
⚡️release results.

Using test examples, you will see how Grafana helps analyze automated test results so that you understand more precisely what is happening. Read more on Habr.

Installing Grafana plugins from a Private repository

The Grafana Marketplace application is one of our favorite features introduced in Grafana 8. It allows installing registered plugins from the official Grafana repository when connected to the Internet, but how do you upgrade and manage Grafana plugins without access to an external network? Read more.

How we scaled our new Prometheus TSDB Grafana Mimir to 1 billion active series

A week and a half ago Grafana announced its own TSDB, Mimir, and now it describes how Mimir was tested with a billion data series.

Grafana blog

How relabeling in Prometheus works

Relabeling is a powerful tool that allows you to classify and filter Prometheus targets and metrics by rewriting their label set. Grafana blog.

How summary metrics work in Prometheus

A summary is a metric type in Prometheus that can be used to monitor latencies (or other distributions, such as request sizes). For example, when you monitor a REST endpoint, you can use a summary and configure it to provide the 95th percentile of the latency. If that percentile is 120 ms, it means that 95% of the calls were faster than 120 ms and 5% were slower. Read more.
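To make the percentile statement concrete, here is a minimal Python sketch that computes a 95th percentile over a list of latencies with the nearest-rank method. The sample data and the function name `nearest_rank_percentile` are hypothetical and chosen only for illustration. This is not how a Prometheus summary is implemented: client libraries typically estimate quantiles over a sliding time window instead of sorting every observation.

```python
import math


def nearest_rank_percentile(samples, percent):
    """Return the smallest sample such that at least `percent` % of
    all samples are less than or equal to it (nearest-rank method)."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    ordered = sorted(samples)
    # 1-based rank of the percentile within the sorted samples.
    rank = math.ceil(percent * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical data: 100 requests with latencies of 1..100 ms.
latencies_ms = list(range(1, 101))
p95 = nearest_rank_percentile(latencies_ms, 95)

# Share of requests that were at least as fast as the 95th percentile.
share_at_or_below = sum(1 for x in latencies_ms if x <= p95) / len(latencies_ms)
print(p95, share_at_or_below)
```

With this toy data, 95 of the 100 requests finish at or below the reported value, which is the reading of "95th percentile" described above.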

How To Troubleshoot Slow Linux Servers

atop, free, ncdu, iotop and nethogs

5 Network Performance and Analysis Tools For Linux

iperf, tcpdump, hping, netstat and scapy

SRE Revisited: SLO in the Age of Microservices

Once more on SLI, SLA, SLO, error budgets and all that, plus a video

Medium

Site Reliability Engineering practice was established by Google nearly 20 years ago. How to apply to microservices and cloud native…

Simplifying Docker container monitoring and management with CLI tools

Dockly, Dive, Ctop, Dry, Lazy Docker, Poco, Sen and Skopeo.

Intro to metrics with Grafana: Prometheus, Grafana Mimir, Graphite, and beyond

The webinar is tomorrow at 19:30 MSK. Registration.

Grafana Labs

In this webinar, we’ll go over challenges when scaling metrics systems, with a particular focus on Prometheus and Grafana Mimir.

How to drop and delete metrics in Prometheus

Keeping your Prometheus optimized can be a tedious task over time, but it is essential for maintaining its stability and keeping cardinality under control. Identifying unnecessary metrics at the source and regularly deleting unneeded metrics from your TSDB will keep your Prometheus storage and performance intact.

In this article we'll look at identifying such metrics, dropping them at the source, and deleting already stored metrics from Prometheus.

Read more on Medium.

Postmortem culture, or how we learn from o̶u̶r̶ o̶w̶n̶ screwups

About three years ago I spoke at a small meetup on the topic that gives this article its title. In that talk I described how, over several years, we built up incident handling in the customer acquisition unit at Tinkoff. And to make the talk less boring, I shared several postmortems from incidents that happened in the teams of "a friend of mine". Read more.

Calculating composite SLA

How serial and parallel dependencies affect the total SLA. Read more.
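As a rough illustration of the idea, the Python sketch below applies the standard probability rules for combining availabilities. It assumes that component failures are independent, and the function names and sample figures are hypothetical; the linked article may use a different formulation. For serial dependencies (every component is required) the availabilities multiply; for parallel, redundant dependencies (any one component is enough) the unavailabilities multiply.

```python
from math import prod


def serial_sla(availabilities):
    """Composite availability when ALL components must be up.

    Assumes independent failures: the probabilities multiply.
    """
    return prod(availabilities)


def parallel_sla(availabilities):
    """Composite availability when ANY ONE component is enough.

    Assumes independent failures: the system is down only when
    every redundant component is down at the same time.
    """
    return 1 - prod(1 - a for a in availabilities)


# Hypothetical figures, as fractions (0.999 means 99.9%).
api_behind_database = serial_sla([0.999, 0.999])
two_redundant_replicas = parallel_sla([0.99, 0.99])

print(f"serial:   {api_behind_database:.6f}")
print(f"parallel: {two_redundant_replicas:.6f}")
```

Under these assumptions, chaining dependencies lowers the total below the weakest component, while redundancy raises it above the strongest one.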

15 months of 24x7 Primary On-Call — Here’s How I Survived

I just finished 15 months of primary 24x7 on call. Although it is always stressful to be paged in the middle of the night or on a weekend or holiday, I was able to lean on my SRE background to ensure that every alert that woke me up faithfully indicated a critical issue with our system and required human intervention. Here's how I did it. Read more.
