Livestream tomorrow at 12:00
https://youtu.be/jF3YemOVofQ
YouTube
Data processing with Apache Airflow in Yandex Cloud
Analyzing data in the cloud takes more than a DBMS and visualization tools: you also need a clear tool that automates data collection, preparation, and processing. In this webinar we covered one such service, Apache Airflow.
Yandex Cloud experts discussed:…
How to build a data processing platform "with your own hands"?
@devops_dataops
https://habr.com/ru/company/itsumma/blog/679516/
Habr
How to build a data processing platform "with your own hands"?
A large number of Russian companies have run into software restrictions. They can no longer use many important tools for working with data. But, as the saying goes, one...
Nico_Loubser_Software_Engineering_for_Absolute_Beginners_Your_Guide.epub
1.5 MB
Software Engineering for Absolute Beginners - 2021
What You Will Learn
🔹 Explore the concepts that you will encounter in the majority of companies doing software development
🔹 Create readable code that is neat as well as well-designed
🔹 Build code that is source controlled, containerized, and deployable
🔹 Secure your codebase
🔹 Optimize your workspace
🔥 Awesome Docker Compose samples
These samples provide a starting point for how to integrate different services using a Compose file and to manage their deployment with Docker Compose.
👉 @devops_dataops
https://github.com/docker/awesome-compose
GitHub
GitHub - docker/awesome-compose: Awesome Docker Compose samples
Awesome Docker Compose samples. Contribute to docker/awesome-compose development by creating an account on GitHub.
ETL Pipeline with Airflow, Spark, s3, MongoDB and Amazon Redshift
Educational project on how to build an ETL (Extract, Transform, Load) data pipeline, orchestrated with Airflow.
https://github.com/renatootescu/ETL-pipeline
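For readers new to the term, the three ETL stages can be sketched in plain Python. This is only an illustration of extract, transform, and load as separate steps; it is not code from the linked repository and does not use Airflow, Spark, S3, MongoDB, or Redshift. The sample data, the `payments` table, and the function names are hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; a real pipeline would read from files or an API.
RAW_CSV = "name,amount\nalice,10\nbob,not_a_number\ncarol,32\n"


def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names, cast amounts, skip rows that fail to parse."""
    clean = []
    for row in rows:
        try:
            clean.append((row["name"].title(), int(row["amount"])))
        except ValueError:
            continue  # drop rows whose amount is not an integer
    return clean


def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (name TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
load(transform(extract(RAW_CSV)), conn)
total = conn.execute("SELECT SUM(amount) FROM payments").fetchone()[0]
print(total)
```

In an orchestrated setup each of these functions would typically become its own task, so that a failed stage can be retried without rerunning the others.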
GitHub
GitHub - renatootescu/ETL-pipeline: Educational project on how to build an ETL (Extract, Transform, Load) data pipeline, orchestrated…
Educational project on how to build an ETL (Extract, Transform, Load) data pipeline, orchestrated with Airflow. - renatootescu/ETL-pipeline
GitHub - martandsingh/ApacheSpark: This repository will help you to learn about databricks concept with the help of examples. It will include all the important topics which we need in our real life experience as a data engineer. We will be using pyspark & sparksql for the development. At the end of the course we also cover few case studies.
https://github.com/martandsingh/ApacheSpark
GitHub
GitHub - martandsingh/ApacheSpark: This repository will help you to learn about databricks concept with the help of examples. It…
This repository will help you to learn about databricks concept with the help of examples. It will include all the important topics which we need in our real life experience as a data engineer. We ...
Designing an ETL pipeline in Apache Airflow / Habr
https://habr.com/ru/company/otus/blog/679402/
Habr
Designing an ETL pipeline in Apache Airflow
Hi, Habr! This is Rustem, IBM Senior DevOps Engineer, and today I would like to continue our introduction to a DataOps engineering tool, Apache Airflow. Today we will design an ETL pipeline. Not...
A deep dive into Data Quality / Habr
https://habr.com/ru/company/vk/blog/674876/
Mara Pipelines
This package contains a lightweight data transformation framework with a focus on transparency and complexity reduction. It has a number of baked-in assumptions/ principles:
- Data integration pipelines as code: pipelines, tasks and commands are created using declarative Python code.
- PostgreSQL as a data processing engine.
- Extensive web ui. The web browser as the main tool for inspecting, running and debugging pipelines.
- GNU make semantics. Nodes depend on the completion of upstream nodes. No data dependencies or data flows.
- No in-app data processing: command line tools as the main tool for interacting with databases and data.
- Single machine pipeline execution based on Python's multiprocessing. No need for distributed task queues. Easy debugging and output logging.
- Cost based priority queues: nodes with higher cost (based on recorded run times) are run first.
https://github.com/mara/mara-pipelines
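Two of the principles above, make-style dependencies and cost-based priority, can be combined in a short scheduling sketch. This is a generic illustration in plain Python, not the mara-pipelines API: the `run_order` function, the node names, and the cost figures are hypothetical, and the sketch only computes a serial order rather than running nodes in parallel.

```python
import heapq


def run_order(upstream, cost):
    """Return an execution order for the nodes.

    A node becomes ready once all of its upstream nodes have completed;
    among the ready nodes, the one with the highest cost is picked first.
    """
    remaining = {node: set(deps) for node, deps in upstream.items()}
    # heapq is a min-heap, so costs are negated to pop the largest first.
    ready = [(-cost[n], n) for n, deps in remaining.items() if not deps]
    heapq.heapify(ready)
    order = []
    while ready:
        _, node = heapq.heappop(ready)
        order.append(node)
        # Mark this node as done for everything that was waiting on it.
        for other, deps in remaining.items():
            if node in deps:
                deps.remove(node)
                if not deps:
                    heapq.heappush(ready, (-cost[other], other))
    if len(order) != len(upstream):
        raise ValueError("dependency cycle detected")
    return order


# Hypothetical pipeline: node -> list of upstream nodes it depends on.
upstream = {
    "load_orders": [],
    "load_customers": [],
    "build_report": ["load_orders", "load_customers"],
}
# Hypothetical costs, e.g. average recorded run time in seconds.
cost = {"load_orders": 120.0, "load_customers": 15.0, "build_report": 30.0}

order = run_order(upstream, cost)
print(order)
```

Starting the most expensive ready node first tends to shorten the total run when several workers are available, because the long-running nodes are not left until the end.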
GitHub
GitHub - mara/mara-pipelines: A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow - mara/mara-pipelines
Open Source Guides
Open source software is made by people just like you. Learn how to launch and grow your project.
https://opensource.guide/
Open Source Guides
Learn how to launch and grow your project.
Automate without limits n8n
The workflow automation platform that doesn't box you in, that you never outgrow
GitHub 27k+
Usage
🔹 Learn how to install and use it from the command line
🔹 Learn how to run n8n in Docker
Self-Hosted -> Free
🔹 Data stays on your infrastructure
🔹 Open & extendable
🔹 One-line npm command or Docker deployment
Habr: n8n. Smoothie-flavored infosec automation
GitHub
GitHub - n8n-io/n8n: Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code…
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations. - n8n-io/n8n
GitHub - ClickHouse/clickhouse-presentations: Presentations, meetups and talks about ClickHouse
https://github.com/ClickHouse/clickhouse-presentations
GitHub
GitHub - ClickHouse/clickhouse-presentations: Presentations, meetups and talks about ClickHouse
Presentations, meetups and talks about ClickHouse. Contribute to ClickHouse/clickhouse-presentations development by creating an account on GitHub.