Data Engineering / Инженерия данных / Data Engineer / DWH
Data Engineering: ETL / DWH / Data Pipelines based on open-source software.

DWH / SQL
Python / ETL / ELT / dbt / Spark
Apache Airflow

I do not run ads.
Questions: @iv_shamaev | datatalks.ru
Data Engineering Wiki

It contains a constantly evolving collection of topics related to data engineering. Since we're at a very early stage, there's a lot of space to grow!

https://dataengineering.wiki/
Another open-source project, aimed primarily at teams that work with dbt

██████╗░██████╗░████████╗
██╔══██╗██╔══██╗╚══██╔══╝
██║░░██║██████╦╝░░░██║░░░
██║░░██║██╔══██╗░░░██║░░░
██████╔╝██████╦╝░░░██║░░░
╚═════╝░╚═════╝░░░░╚═╝░░░

Open-source data observability for analytics engineers

💬 Data anomalies monitoring as dbt tests - Collect metrics and metadata over time, detect anomalies, as native dbt tests in your project!
💬 Data observability report - Generate a report for all dbt tests and share with your team.
💬 dbt artifacts uploader
💬 Slack alerts
💬 Data lineage made simple, reliable, and automated

👉 @devops_dataops

https://github.com/elementary-data/elementary
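
The "collect metrics over time, detect anomalies" idea from the feature list above can be illustrated with a small sketch. This is not Elementary's implementation or API; it is a generic z-score check in plain Python, and the function name, the threshold of 3.0, and the sample row counts are all assumptions made for illustration.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """Return True if `latest` is more than `threshold` standard
    deviations away from the mean of `history` (a simple z-score test)."""
    if len(history) < 2:
        # Not enough points to estimate the spread, so do not flag anything.
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all counts as a deviation.
        return latest != mu
    return abs(latest - mu) / sigma > threshold


# Hypothetical daily row counts of a table, collected over a week.
daily_row_counts = [1000, 1020, 990, 1010, 1005, 995, 1015]

print(is_anomalous(daily_row_counts, 1008))  # a value close to the usual range
print(is_anomalous(daily_row_counts, 120))   # a sudden drop in volume
```

A real monitoring tool would typically also account for seasonality and trends; the sketch only shows the basic shape of comparing a new metric value against its own history.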
Deep Dive on ClickHouse Sharding and Replication Webinar

Join the Altinity experts as we dig into ClickHouse sharding and replication, showing how they enable clusters that deliver fast queries over petabytes of data. We’ll start with basic definitions of each, then move to practical issues. This includes the setup of shards and replicas, defining schema, choosing sharding keys, loading data, and writing distributed queries. We’ll finish up with tips on performance optimization.

#ClickHouse

https://www.youtube.com/watch?v=Vuh6NOluIxo
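
As a rough illustration of what "choosing sharding keys" means in the webinar description above, the sketch below routes rows to shards by hashing a key and lists the replicas that would hold a copy. It is a simplified model in plain Python, not ClickHouse's actual routing logic; the function names, the CRC32 hash, and the `shardN-replicaM` naming are assumptions made for illustration.

```python
import zlib
from collections import Counter


def shard_for(key: str, num_shards: int) -> int:
    """Pick a shard by hashing the sharding key and taking the remainder."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def replicas_for(shard: int, replicas_per_shard: int) -> list:
    """Every replica of a shard stores a full copy of that shard's rows."""
    return [f"shard{shard}-replica{r}" for r in range(replicas_per_shard)]


# Hypothetical keys: 1000 user ids spread over 3 shards with 2 replicas each.
keys = [f"user_{i}" for i in range(1000)]
distribution = Counter(shard_for(k, 3) for k in keys)

print(dict(distribution))                    # how many keys landed on each shard
print(replicas_for(shard_for("user_42", 3), 2))
```

The same key always maps to the same shard, so a query that filters on the sharding key only needs to touch one shard, while a low-cardinality or skewed key would pile rows onto a few shards.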
Data Engineering

A collection of one-off topics or videos that do not fall neatly into any other existing playlist.

1. A Brief History of Data Engineering | What is Data Engineering?
2. How to Become a Data Engineer (with no experience)
3. ETL vs ELT | Modern Data Architectures
4. YAML Tutorial | Learn YAML in 10 Minutes
5. What is Data Streaming?
6. 3 Must-Know Trends for Data Engineers | DataOps
7. What skills do you need as a Data Engineer?
8. What is Reverse ETL?
9. What tools should you know as a Data Engineer?
10. Intro to BASH // Command Line for Beginners
11. Getting Started w/ Airbyte! | Open Source Data Integration
12. Data Warehouse vs Data Lake | Explained (non-technical)
13. Data Modeling in the Modern Data Stack
14. Getting Started w/ Metabase | Open Source Data Visualization Tool
15. What do you actually do as a data engineer?

https://www.youtube.com/playlist?list=PLy4OcwImJzBKg3rmROyI_CBBAYlQISkOO
Forwarded from karpov.courses
We have good news: we've made a free course on Docker.

Docker is used in data science, software development, data engineering, and even testing! We're confident the program will be useful to anyone who writes code and works with applications.

You will learn to:

● package your own applications into containers;
● deploy ready-made services locally: Airflow, Postgres, ClickHouse, Nginx;
● bring up and configure full-fledged web applications.

The program gives you the basic knowledge you need to take a step toward even more interesting tools, such as Kubernetes.

The course author is Anton Sidorin, a backend developer at karpov.courses.
You can start learning at any time that suits you.

[Get to know Docker]
(DataCamp) Introduction to Airflow in Python

This is a memo to share what I have learnt in Apache Airflow, capturing the learning objectives as well as my personal notes. The course is taught by Mike Metzger from DataCamp, and it includes 4 chapters:
▫️ Intro to Airflow
▫️ Implementing Airflow DAGs
▫️ Maintaining and monitoring Airflow workflows
▫️ Building production pipelines in Airflow

https://github.com/JNYH/DataCamp_Introduction_to_Airflow

Personal Notes:
https://medium.com/swlh/introduction-to-airflow-in-python-67b554f06f0b
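
For readers new to the DAG concept behind the course chapters listed above, here is a minimal sketch of how task dependencies determine execution order. It deliberately uses only Python's standard `graphlib` module (Python 3.9+) rather than the Airflow API, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it can start.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "quality_check": {"transform"},
    "load": {"transform"},
    "report": {"load", "quality_check"},
}


def execution_order(deps):
    """Return one valid order in which the tasks could run."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

A scheduler such as Airflow does considerably more (scheduling, retries, parallel execution), but the core constraint is the same: a task runs only after everything upstream of it has completed.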
The move data conference starts in 30 minutes (at 21:00 Moscow time)

https://movedata.airbyte.com/