Python 🐍 Work With Data
1.6K subscribers
A collection of books and articles on Python and various data manipulation tools. Topics include the architecture of business intelligence systems, the design and development of BI reports, and data processing with pandas in Python.
KNIME Analytics Platform is the open source software for creating data science. Intuitive, open, and continuously integrating new developments, KNIME makes understanding data and designing data science workflows and reusable components accessible to everyone.

https://www.knime.com/knime-analytics-platform
Data Engineering with Python 2020.epub
29.6 MB
Data Engineering with Python, Paul Crickard, 2020

What you will learn

- Understand how data engineering supports data science workflows
- Discover how to extract data from files and databases and then clean, transform, and enrich it
- Configure processors for handling different file formats as well as both relational and NoSQL databases
- Find out how to implement a data pipeline and dashboard to visualize results
- Use staging and validation to check data before landing in the warehouse
- Build real-time pipelines with staging areas that perform validation and handle failures
- Get to grips with deploying pipelines in the production environment
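
As a rough illustration of the "staging and validation" idea from the list above, here is a minimal sketch that uses only the Python standard library. It is not taken from the book and does not use the book's tooling; the sample data, the table names (`staging`, `warehouse`), and the helper functions (`extract`, `validate`, `run_pipeline`) are all hypothetical and chosen just for this example.

```python
import csv
import io
import sqlite3

# Hypothetical sample input; in practice this would come from a file or database.
RAW_CSV = """id,city,temp_c
1,Albuquerque,21.5
2,,19.0
3,Santa Fe,not_a_number
4,Taos,15.2
"""


def extract(text):
    """Read CSV text into a list of dict rows (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def validate(row):
    """Return a cleaned (id, city, temp_c) tuple, or None if the row is invalid."""
    if not row["city"].strip():
        return None  # missing city
    try:
        temp = float(row["temp_c"])
    except ValueError:
        return None  # temperature is not numeric
    return (int(row["id"]), row["city"].strip(), temp)


def run_pipeline(text, conn):
    """Stage raw rows as-is, then land only the valid ones in the warehouse table."""
    conn.execute("CREATE TABLE staging (id TEXT, city TEXT, temp_c TEXT)")
    conn.execute("CREATE TABLE warehouse (id INTEGER, city TEXT, temp_c REAL)")

    # 1. Stage: load everything untouched so failures can be inspected later.
    conn.executemany("INSERT INTO staging VALUES (:id, :city, :temp_c)", extract(text))

    # 2. Validate: split staged rows into valid and rejected.
    staged = conn.execute("SELECT id, city, temp_c FROM staging").fetchall()
    valid, rejected = [], []
    for r in staged:
        cleaned = validate({"id": r[0], "city": r[1], "temp_c": r[2]})
        (valid if cleaned else rejected).append(cleaned or r)

    # 3. Land: only validated rows reach the warehouse.
    conn.executemany("INSERT INTO warehouse VALUES (?, ?, ?)", valid)
    conn.commit()
    return len(valid), len(rejected)


conn = sqlite3.connect(":memory:")
loaded, rejected = run_pipeline(RAW_CSV, conn)
print(f"loaded={loaded} rejected={rejected}")
```

Keeping the raw rows in a separate staging table means that rejected records stay available for inspection instead of being silently dropped before the load.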
Gartner Peer Insights ‘Voice of the Customer’: Data Preparation Tools

https://www.gartner.com/doc/reprints?id=1-24Q79VXU&ct=201202&st=sb
Which columnar database would you recommend?
Anonymous Poll
23% ClickHouse
5% Apache Druid
1% Apache Kylin
71% Don't know / see the answers
Real-Time Analytics and Monitoring Dashboards with Apache Kafka and Rockset

In the early days, many companies simply used Apache Kafka® for data ingestion into Hadoop or another data lake. However, Apache Kafka is more than just messaging. The significant difference today is that companies use Apache Kafka as an event streaming platform for building mission-critical infrastructures and core operations platforms. Examples include microservice architectures, mainframe integration, instant payment, fraud detection, sensor analytics, real-time monitoring, and many more—driven by business value.

https://www.confluent.io/blog/analytics-with-apache-kafka-and-rockset/