Practical Python Data Wrangling and Data Quality (en).epub
5.4 MB
Practical Python: Data Wrangling and Data Quality
1. Introduction to Data Wrangling and Data Quality
2. Introduction to Python
3. Understanding Data Quality
4. Working with File-Based and Feed-Based Data in Python
5. Accessing Web-Based Data
6. Assessing Data Quality
7. Cleaning, Transforming, and Augmenting Data
8. Structuring and Refactoring Your Code
9. Introduction to Data Analysis
10. Presenting Your Data
11. Beyond Python
https://github.com/PracticalPythonDataWranglingAndQuality/data_wrangling_exercises - This repo contains draft coding exercises for the early-release version of the book Practical Python: Data Wrangling and Data Quality to be published by O'Reilly Media in 2021.
Forwarded from Data Coffee
In the new episode of the podcast we discussed the new head of all Twitter (we couldn't skip that one) and the fact that you can now legally (so far only in the US, though) scrape data and train your own neural chatbots (so you can later build a startup and make it into the top hundred most interesting Big Data companies).
Besides business matters, we discussed modern surgery, a not-so-modern shooter, and the hosts' unexpected shared love of airships.
All of this is in the fresh new news episode of the Data Coffee podcast 🎙
#datacoffee #data #podcast #данные #подкаст
https://anchor.fm/data-coffee/episodes/46-S2E4----Mute--Twitter----etc-e1hrnkf
Spotify for Podcasters
46 (S2E4). Airships, the Mute button, Elon Musk's Twitter, etc. by Data Coffee
The hosts of the "Data Coffee" podcast discuss the news and share their thoughts!
Shownotes:
00:31 Does Mute work, or doesn't it after all
09:10 Legalization of web scraping
13:16 Elon Musk, ruler of all Twitter
18:55 Gloves for training surgeons
23:15 Top 100 big data companies
25:45…
How to Add Value as a Data Analyst | by Cassie Kozyrkov | May, 2022 | Towards Data Science
https://towardsdatascience.com/how-to-add-value-as-a-data-analyst-8a6ae900b82a
Medium
How to Add Value as a Data Analyst
The journey to becoming a “real” data analyst
Forwarded from Инжиниринг Данных (Dmitry)
On today's schedule: Python Environments and Best Practices
- Using the command line and command line applications
- How to set up projects using virtual environments
- Sharing code via git and GitHub
- Using IDE features for debugging, refactoring, and navigating Python code
Attached are the presentation, a link to the git repo, and a reference document.
Forwarded from Инжиниринг Данных (Dmitry)
Data_Quality_Fundamentals_Barr_Moses_Lior_Gavish_Molly_Vorwerck.epub
1.4 MB
Data Quality Fundamentals
2022 O'Reilly Media, Inc.
- Build more trustworthy and reliable data pipelines
- Write scripts to make data checks and identify broken pipelines with data observability
- Program your own data quality monitors from scratch
- Develop and lead data quality initiatives at your company
- Generate a dashboard to highlight your company's key data assets
- Automate data lineage graphs across your data ecosystem
- Build anomaly detectors for your critical data assets
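As a rough illustration of what a from-scratch data quality monitor might check (this is not code from the book), here is a minimal sketch. It assumes rows arrive as plain dicts; the function names `null_rate` and `is_stale`, the column names, and the sample data are all hypothetical.

```python
from datetime import datetime, timedelta


def null_rate(rows, column):
    """Return the fraction of rows where `column` is missing or None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


def is_stale(rows, ts_column, now, max_age):
    """Return True if the newest timestamp in `ts_column` is older than `max_age`."""
    timestamps = [row[ts_column] for row in rows if row.get(ts_column) is not None]
    if not timestamps:
        # No timestamps at all is treated as stale in this sketch.
        return True
    return now - max(timestamps) > max_age


# Hypothetical sample table with two missing emails and one missing timestamp.
rows = [
    {"id": 1, "email": "a@example.com", "updated_at": datetime(2022, 5, 1, 12, 0)},
    {"id": 2, "email": None, "updated_at": datetime(2022, 5, 2, 9, 30)},
    {"id": 3, "email": "c@example.com", "updated_at": datetime(2022, 5, 2, 18, 45)},
    {"id": 4, "email": None, "updated_at": None},
]
now = datetime(2022, 5, 4, 0, 0)

print("email null rate:", null_rate(rows, "email"))
print("stale after 1 day:", is_stale(rows, "updated_at", now, timedelta(days=1)))
```

A real monitor would typically run checks like these on a schedule and compare the results against thresholds chosen for each table.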
Dataviz-inspiration.com aims at being the biggest list of chart examples available on the web. It showcases 104 of the most beautiful and impactful dataviz projects I know. The collection is a good place to visit when you're designing a new graph, together with data-to-viz.com that shares dataviz best practices.
https://www.dataviz-inspiration.com/
Dataviz-Inspiration
Dataviz Inspiration
The Pinterest of data visualization. Explore hundreds of stunning dataviz projects in a clean, organized layout. Easily searchable, filterable, and categorized by chart type for your convenience.
Curated list of project-based tutorials
https://github.com/practical-tutorials/project-based-learning
GitHub
GitHub - practical-tutorials/project-based-learning: Curated list of project-based tutorials
Curated list of project-based tutorials. Contribute to practical-tutorials/project-based-learning development by creating an account on GitHub.
Forwarded from Airbyte - ETL ELT Data Pipelines
Understanding Change Data Capture (CDC): Definition, Methods and Benefits | Airbyte
https://airbyte.com/blog/change-data-capture-definition-methods-and-benefits
Airbyte
Understanding Change Data Capture (CDC): Definition, Methods, Benefits | Airbyte
Change Data Capture Explained - Understand CDC definition, methods, and its benefits.
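One simple way to see the idea behind change data capture is to compare two snapshots of a table and emit the differences as change events. The sketch below assumes each snapshot is a dict keyed by primary key; the function name `diff_snapshots` and the sample data are hypothetical. It is not necessarily one of the methods the linked article describes, and production CDC tools often read the database's transaction log instead.

```python
def diff_snapshots(old, new):
    """Compare two {primary_key: row} snapshots and return a list of change events.

    Each event is a tuple: (operation, primary_key, row).
    """
    changes = []
    for key, row in new.items():
        if key not in old:
            changes.append(("insert", key, row))
        elif old[key] != row:
            changes.append(("update", key, row))
    for key, row in old.items():
        if key not in new:
            changes.append(("delete", key, row))
    return changes


# Hypothetical snapshots of a small "users" table taken at two points in time.
yesterday = {
    1: {"name": "Ann", "plan": "free"},
    2: {"name": "Bob", "plan": "pro"},
    3: {"name": "Cid", "plan": "free"},
}
today = {
    1: {"name": "Ann", "plan": "pro"},   # changed plan
    2: {"name": "Bob", "plan": "pro"},   # unchanged
    4: {"name": "Dee", "plan": "free"},  # new row
}

changes = diff_snapshots(yesterday, today)
for operation, key, row in changes:
    print(operation, key, row)
```

Snapshot comparison is easy to reason about but has to scan both snapshots in full and misses intermediate changes between snapshots, which is one reason other approaches exist.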
Forwarded from Data Engineering / Инженерия данных / Data Engineer / DWH
The Data Engineering Cookbook (MOUSAIF, YASSINE) (z-lib.org).fb2
2.4 MB
The Data Engineering Cookbook
Mastering The Plumbing Of Data Science
September 12, 2021
https://github.com/andkret/Cookbook - code examples
https://cookbook.learndataengineering.com/docs/01-Introduction - free online version
Forwarded from Data Engineering / Инженерия данных / Data Engineer / DWH
Introduction to Data Engineering (Daniel Beach) (z-lib.org).pdf
1.9 MB
Introduction to Data Engineering (Daniel Beach).pdf
With the rise of Data Science and Machine Learning, Data Engineering is quickly becoming an in-demand skill. Data Engineering requires a unique skill set that is hard to learn without experience. I will teach you how to write scalable data pipelines and more!
Introduction
Chapter 1 - The Theory
Chapter 2 - Data Pipeline Basics
Chapter 3 - Pipeline Architecture
Chapter 4 - Storage
Chapter 5 - Compute and Resources
Chapter 6 - Mastering SQL
Chapter 7 - Data Warehousing / Data Lakes
Chapter 8 - Data Modeling
Chapter 9 - Data Quality
Chapter 10 - DevOps for Data Engineers
Forwarded from Airbyte - ETL ELT Data Pipelines
A lesson from the marathon with a walkthrough of Airbyte
https://www.youtube.com/watch?v=bzjXuilsUuU
Marathon materials: https://github.com/Arkronus/cooking-data
YouTube
6 ETL tools in Python
In the final lesson you will learn about modern open-source tools and libraries for ETL. We will look at the data migration tool Airbyte, the ELT data transformation library DBT, and see how to build ETL processes with the Pandas and Petl libraries.…
GitHub - ozlerhakan/datacamp: 🍧 DataCamp data-science and machine learning courses
https://github.com/ozlerhakan/datacamp
GitHub
GitHub - ozlerhakan/datacamp: 🍧 DataCamp data-science and machine learning courses
🍧 DataCamp data-science and machine learning courses - ozlerhakan/datacamp
Forwarded from cube.dev & cube.js
How to build an analytics system on open source: a cube.js tutorial
https://habr.com/ru/post/658581/
Habr
How to build an analytics system on open source: a cube.js tutorial
Cube (until recently cube.js) is a relatively young project (first released in March 2019) that implements the OLAP cube concept. Despite the excellent documentation, there is still little information online about...
Forwarded from Apache Superset BI
Apache Superset 1.5: Release Notes | Preset
https://preset.io/blog/apache-superset-1-5-release-notes/
preset.io
Apache Superset 1.5: Release Notes
Apache Superset 1.5 is now released! Superset 1.5 focuses on polishing the dashboard native filters experience, while improving performance and stability.
Embed Power BI analytics - Learn | Microsoft Docs
https://docs.microsoft.com/en-us/learn/paths/power-bi-embedded/
Docs
Embed Power BI analytics - Training
This learning path teaches you how to embed Power BI content in apps, develop programmatic solutions using the Power BI REST API and the Power BI Client APIs, enforce row-level security (RLS) for embedded content, automate common Power BI setup tasks, configure…
Forwarded from Инжиниринг Данных (Dmitry)
dbt is known for its relative simplicity, since it lets you write all data transformations in SQL. According to their 2022 roadmap, they are adding Python support: Python-language dbt models.
GitHub
dbt-core/docs/roadmap/2022-05-dbt-a-core-story.md at main · dbt-labs/dbt-core
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications. - dbt-labs/dbt-core