The Data Engineering Interview Study Guide | by SeattleDataGuy | Apr, 2021 | Better Programming
https://betterprogramming.pub/the-data-engineering-interview-study-guide-6f09420dd972
The Data Engineering Interview Study Guide
For your FAANG and other technical interviews
PowerBI-Advanced-Analytics-with-PowerBI-white-paper.pdf
1.4 MB
KNIME Analytics Platform is the open source software for creating data science. Intuitive, open, and continuously integrating new developments, KNIME makes understanding data and designing data science workflows and reusable components accessible to everyone.
https://www.knime.com/knime-analytics-platform
KNIME Analytics Platform | KNIME
KNIME Analytics Platform is free and open source, keeping users on the bleeding edge of data science with 300+ connectors to data sources and integrations with all popular machine learning libraries.
KNIME Analytics Platform is the “killer app” for machine learning and statistics | by SJ Porter | Towards Data Science
https://towardsdatascience.com/knime-desktop-the-killer-app-for-machine-learning-cb07dbef1375
A free, easy, and open-source tool for all things data? Yes, please!
Data Engineering with Python 2020.epub
29.6 MB
Data Engineering with Python, Paul Crickard, 2020
What you will learn
- Understand how data engineering supports data science workflows
- Discover how to extract data from files and databases and then clean, transform, and enrich it
- Configure processors for handling different file formats as well as both relational and NoSQL databases
- Find out how to implement a data pipeline and dashboard to visualize results
- Use staging and validation to check data before landing in the warehouse
- Build real-time pipelines with staging areas that perform validation and handle failures
- Get to grips with deploying pipelines in the production environment
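The "staging and validation" bullets above describe the core pattern the book walks through: check records in a staging area and reject bad ones before they land in the warehouse. A minimal sketch of that idea in plain Python (the record schema and validation rules here are illustrative assumptions, not taken from the book):

```python
# Minimal sketch of a staging/validation step before records land in a warehouse.
# The required fields and checks below are hypothetical, for illustration only.

REQUIRED_FIELDS = {"id", "name", "amount"}

def validate(record: dict) -> list:
    """Return a list of validation errors for one staged record."""
    errors = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        errors.append(f"missing fields: {sorted(missing)}")
    if "amount" in record and not isinstance(record["amount"], (int, float)):
        errors.append("amount is not numeric")
    return errors

def stage(records):
    """Split incoming records into warehouse-ready rows and a rejects pile."""
    good, rejected = [], []
    for rec in records:
        errs = validate(rec)
        if errs:
            rejected.append((rec, errs))  # keep the errors for later inspection
        else:
            good.append(rec)
    return good, rejected

clean, bad = stage([
    {"id": 1, "name": "a", "amount": 9.5},
    {"id": 2, "name": "b", "amount": "oops"},  # fails the numeric check
])
```

In a real pipeline the rejects pile would feed an error-handling path (retry, quarantine, alert) rather than silently dropping rows, which is the failure-handling point the list above makes.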
Gartner Peer Insights ‘Voice of the Customer’: Data Preparation Tools
https://www.gartner.com/doc/reprints?id=1-24Q79VXU&ct=201202&st=sb
End-to-End BI Project: Strategy, Steps, Processes, and Tools Part-01 | by Yemunn Soe | Geek Culture | May, 2021 | Medium
https://medium.com/geekculture/end-to-end-bi-project-strategy-steps-processes-and-tools-part-1-1f8c3f8cb00c
I have come up with an idea to create a series of articles covering a small-scale end-to-end Business Intelligence (BI) Project. I will…
Which columnar database would you recommend? (anonymous poll)
Clickhouse: 23%
Apache Druid: 5%
Apache Kylin: 1%
Don't know / just show the answers: 71%
Real-Time Analytics and Monitoring Dashboards with Apache Kafka and Rockset
In the early days, many companies simply used Apache Kafka® for data ingestion into Hadoop or another data lake. However, Apache Kafka is more than just messaging. The significant difference today is that companies use Apache Kafka as an event streaming platform for building mission-critical infrastructures and core operations platforms. Examples include microservice architectures, mainframe integration, instant payment, fraud detection, sensor analytics, real-time monitoring, and many more—driven by business value.
https://www.confluent.io/blog/analytics-with-apache-kafka-and-rockset/
Use SQL to connect Rockset and Apache Kafka for ingesting data streams, joining datasets, and creating a real-time dashboard for streaming analytics.
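The dashboard use case above boils down to continuously aggregating an event stream. As a toy sketch of the kind of rolling aggregation such a dashboard surfaces: no Kafka broker or Rockset client here, just an in-memory stand-in for the event stream (an assumption for illustration), with hypothetical event fields:

```python
# Toy sketch: the per-key counts a real-time monitoring dashboard might show
# for events arriving on a Kafka topic. The event shape is hypothetical.
from collections import Counter

def count_by_key(events, key="service"):
    """Aggregate a stream of event dicts into per-key counts."""
    counts = Counter()
    for event in events:
        counts[event.get(key, "unknown")] += 1
    return dict(counts)

# Stand-in for records consumed from a topic.
stream = [
    {"service": "payments", "status": 200},
    {"service": "payments", "status": 500},
    {"service": "fraud-check", "status": 200},
]
```

In the article's setup, this aggregation would instead be expressed as SQL in Rockset over the ingested Kafka stream; the sketch only illustrates the shape of the result.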
A neat collection:
https://github.com/awesomedata/awesome-public-datasets
GitHub - awesomedata/awesome-public-datasets: A topic-centric list of HQ open datasets.