NEW COURSE!!! Modern Data Warehouses no longer need you to load the data into them. Many warehouses like AWS Redshift, BigQuery or Snowflake allow you to query data directly from files in your Data Lake. This Data Lake integration is the key to flexibility in how you interact with your data. It is what makes a modern Data Warehouse so nice to use for all kinds of analytics workloads.
In this course you will learn how easy it is to use Data Lakes, Warehouses and BI tools. Load your files into the lake and visualize them in a report. 🚀
Course Contents
✅ Where Data Warehouses fit into a platform
✅ Data Warehouses ETL vs ELT data integration
✅ Direct Access of Data Lake?
✅ Data Warehouses and Data Lakes on AWS & GCP
✅ GCP hands-on example with Cloud Storage, BigQuery & Data Studio
✅ AWS hands-on example with S3, Glue, Athena and QuickSight
✅ Example with AWS Redshift
https://learndataengineering.com/p/modern-data-warehouses
Learndataengineering
Building a Lakehouse on AWS and GCP
Moving everything to one cloud is often not possible. So should we just keep the data where it is and connect the clouds where we can? That could be a practical approach. Always use the right tool for the job, or is that too complicated?
Link to the video on YouTube: https://youtu.be/bsSUa1CrWqo
YouTube
Is a multi-cloud strategy worth it?
Moving everything to one cloud is most of the time not possible. Or should we just keep the data where it is and try to connect where possible? Could be a practical approach.
► Learn Data Engineering at my Ultimate Data Engineering Academy!
Everything…
Cloud billing can be very difficult to forecast. There are so many variables in play. Are you looking for a solution? In this post @Zach Quinn shows you how to use BigQuery's metadata to calculate and forecast costs. Really cool idea! Makes me wonder what other services offer this kind of metadata, too.
Here is a quick look at the article:
Data engineers can leverage SQL statements to fetch database metadata in order to calculate costs incurred with PaaS products like BigQuery.
In the article you’ll learn:
-How to access table metadata in BigQuery
-How to use standard SQL to convert bytes to GB and TB
-How to calculate per gigabyte rates
Read "How Data Engineers Can Use SQL to Estimate BigQuery Storage Costs" in our publication "Plumbers of Data Science" on Medium.
https://medium.com/plumbersofdatascience/how-data-engineers-can-use-sql-to-estimate-bigquery-storage-costs-cbcdfca18899
Medium
How Data Engineers Can Use SQL to Estimate BigQuery Storage Costs
For data engineers, SQL’s applications go beyond analysis; it can be a powerful tool for determining resource allocations.
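To make the bytes-to-GB idea from the article preview a bit more concrete, here is a minimal sketch in Python rather than SQL. It is not Zach's query. The table names, the sizes and the price per GB are made-up illustration values, and I am assuming 1 GB = 1024^3 bytes. Check the current pricing page before you trust any number that comes out of it.

```python
# Sketch: convert table sizes in bytes to GB/TB and estimate a monthly storage cost.
# Assumption: 1 GB = 1024**3 bytes and 1 TB = 1024**4 bytes.

BYTES_PER_GB = 1024 ** 3
BYTES_PER_TB = 1024 ** 4


def bytes_to_gb(size_bytes: int) -> float:
    """Convert a size in bytes to gigabytes."""
    return size_bytes / BYTES_PER_GB


def bytes_to_tb(size_bytes: int) -> float:
    """Convert a size in bytes to terabytes."""
    return size_bytes / BYTES_PER_TB


def monthly_storage_cost(size_bytes: int, price_per_gb: float) -> float:
    """Estimated monthly cost: size in GB multiplied by a per-GB rate."""
    return bytes_to_gb(size_bytes) * price_per_gb


# Hypothetical metadata rows: (table name, size in bytes).
tables = [
    ("events", 5 * BYTES_PER_GB),
    ("orders", 2 * BYTES_PER_TB),
]

# Placeholder rate, NOT an official price.
ASSUMED_PRICE_PER_GB = 0.02

for name, size in tables:
    cost = monthly_storage_cost(size, ASSUMED_PRICE_PER_GB)
    print(f"{name}: {bytes_to_gb(size):.2f} GB, {bytes_to_tb(size):.4f} TB, ~${cost:.2f}/month")
```

In a real setup the `tables` list would come from the warehouse's table metadata instead of being hard coded.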
Post about data engineering. I also answered comments here
https://www.linkedin.com/posts/christinastathopoulos_truth-dataengineering-datascience-activity-6878371323064684544-lN3K
Linkedin
#truth? | Christina Stathopoulos, MSc
#truth? I know you can relate to this Andreas Kretz! 🔧
#dataengineering #datascience | 23 comments on LinkedIn
I love the smell of some nice SQL in the morning! I keep telling you: for structured data use cases, SQL was and still is the gold standard. Here's another great article from Zach on creating tables, writing nested queries and more.
Preview of the content:
- How to effectively use subqueries to structure complex SQL queries
- How to use dynamic date filters to avoid hard coding date ranges
- How to use conditional logic to return a decision
Read "Data Engineering IRL: How to Use SQL to Track Your Spending" in our publication "Plumbers of Data Science" on Medium.
https://medium.com/plumbersofdatascience/date-engineering-irl-how-to-use-sql-to-track-your-spending-79f47512af2b
Medium
Data Engineering IRL: How to Use SQL to Track Your Spending
How to write optimized SQL to track monthly spending in BigQuery.
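If you want a feel for two of the ideas from the preview, dynamic date filters and conditional logic, here is a rough sketch. It is written in Python, not SQL, and it is not the query from the article. The transactions, the budget and the function names are all hypothetical.

```python
from datetime import date


def month_start(today: date) -> date:
    """First day of the month that `today` falls in (the dynamic lower bound)."""
    return today.replace(day=1)


def spending_this_month(transactions, today: date) -> float:
    """Sum all amounts dated between the start of the month and `today`."""
    start = month_start(today)
    return sum(amount for day, amount in transactions if start <= day <= today)


def budget_decision(total: float, budget: float) -> str:
    """Conditional logic that turns a number into a decision."""
    return "over budget" if total > budget else "within budget"


# Hypothetical sample data: (transaction date, amount).
transactions = [
    (date(2022, 1, 28), 40.0),
    (date(2022, 2, 3), 25.5),
    (date(2022, 2, 10), 60.0),
]

# Fixed here so the example is reproducible; use date.today() in practice.
today = date(2022, 2, 15)

total = spending_this_month(transactions, today)
print(total, budget_decision(total, budget=80.0))
```

The point of the date filter is that the range is derived from the current date, so nothing needs to be edited when the month changes.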
Check out Benjamin's post about 2022 predictions for Data Engineering. I also sent him my 2 cents.
Make sure that you follow him on LinkedIn and on YouTube. He makes really great videos (I'm a bit jealous).
Ben also just quit his job at Facebook to start his own company.
BIG CONGRATS!!!!
https://seattledataguy.substack.com/p/5-big-data-experts-predictions-for
Substack
5 Big Data Experts Predictions For 2022 - From The Modern Data Stack To Data Science
The hype around data is not dying.
Andreas Kretz - Learn Data Engineering pinned «Check out Benjamin's post about 2022 predictions for Data Engineering. I also sent him my 2 cents. Make sure that you follow him on Linkedin and on YouTube. He does really great videos (I'm a bit jealous). Ben also just quit his job at Facebook to start…»
I’m trying out Rockset real-time analytics database today. Join me live on YouTube or LinkedIn:
https://youtu.be/0xdQNheAGe8
https://www.linkedin.com/posts/andreas-kretz_quick-test-with-example-project-activity-6884398644326490113-epN6
YouTube
Trying out Rockset Real-Time Analytics Database
Quick test with example project
I just released a NEW course: Python for Data Engineers!
You come from a different field and haven't coded before? No problem! This course picks up where our Python 1 course from @Amit Jain ended.
You learn all the important basics a Data Engineer needs: from advanced Python features and transforming data with pandas to working with APIs and Postgres databases.
Kristijan Bakaric and I created hands-on examples for every lesson. In 2.5 hours of videos we go through each of them together. We also prepared the source code in our GitHub.
🚀
Course Content:
✅ Exception handling
✅ Understand what classes and objects are
✅ How to use modules
✅ Log messages to files
✅ How to work with dates and JSON
✅ Understand unit tests and data validation
✅ Pandas to transform your data
✅ Numpy to apply mathematical functions
✅ Working with Postgres
https://learndataengineering.com/p/python-for-data-engineers
Learndataengineering
Python for Data Engineers
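Here is a small taste of how a few of these topics (exception handling, logging to a file, dates and JSON) can fit together. This is my own minimal sketch using only the standard library, not an excerpt from the course. The record layout, the `parse_event` function and the log file name are assumptions made for illustration.

```python
import json
import logging
import os
import tempfile
from datetime import datetime

# Write log messages to a file in the system temp directory (hypothetical file name).
log_path = os.path.join(tempfile.gettempdir(), "ingest_demo.log")
logging.basicConfig(filename=log_path, level=logging.INFO)
logger = logging.getLogger("ingest_demo")


def parse_event(raw: str):
    """Return (name, timestamp) for a valid JSON record, or None if it is malformed."""
    try:
        record = json.loads(raw)
        timestamp = datetime.strptime(record["created_at"], "%Y-%m-%d %H:%M:%S")
        return record["name"], timestamp
    except (ValueError, KeyError, TypeError) as exc:
        # ValueError covers invalid JSON and bad date formats,
        # KeyError a missing field, TypeError a record that is not a JSON object.
        logger.warning("Skipping bad record %r: %s", raw, exc)
        return None


good = parse_event('{"name": "signup", "created_at": "2022-01-15 08:30:00"}')
missing_field = parse_event('{"name": "signup"}')
not_json = parse_event("not json")

print(good, missing_field, not_json)
```

Bad records are logged and skipped instead of crashing the whole run, which is usually what you want when you ingest messy data.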
The new Machine Learning & Containerization on AWS project is online!! 🚀
As always, active members of the Data Engineering Academy already have access to the course.
What’s the course about:
In this example project you learn how to create a data pipeline where you pull data from the Twitter API, analyze, store and visualize it.
You will host your Machine Learning algorithm on AWS using Lambda and set up your own Postgres database with RDS. You create a Streamlit dashboard and gain experience hosting it using Elastic Container Registry (ECR) and Elastic Container Service (ECS).
This project also gives you insights on how to handle dependency management with Poetry.
Have fun!
Course Content:
✅ Set up and configure Twitter API
✅ Launching RDS Postgres DB
✅ Create S3 bucket for raw storage
✅ Create ML Lambda that extracts & analyses Tweets
✅ Schedule Lambda with EventBridge
✅ Create Streamlit visualization app
✅ Dependency management with Poetry & create Docker image
✅ Install & configure AWS CLI
✅ Set up Elastic Container Registry ECR
✅ Create Elastic Container Service ECS Fargate cluster
✅ Run our Streamlit app as ECS task
Learn how to build an NLP pipeline and host containers in the cloud
Link to the course in the Data Engineering Academy - Trusted by over 500 students: https://learndataengineering.com/p/ml-on-aws
Learndataengineering
ML on AWS
Join the Data Engineering Discord server! https://discord.gg/Wxy2mQA7Fy
Andreas Kretz - Learn Data Engineering pinned «Join the Data Engineering Discord server! https://discord.gg/Wxy2mQA7Fy»
Chat with Andreas about how the Academy and the Full-Stack Coaching can help you:
https://t.me/+V87Cha4h1_VGR9ac
Telegram
Chat with Andreas - Data Engineering Academy & Coaching
You’ve been invited to join this group on Telegram.
Andreas Kretz - Learn Data Engineering pinned «Chat with Andreas how the Academy and the Full-Stack Coaching can help you: https://t.me/+V87Cha4h1_VGR9ac»
Channel name was changed to «Andreas Kretz - Learn Data Engineering»
New Podcast episode!
In today's episode, I’m talking with Tom Schamberger from msg. He leads their cloud data platform team and brings a ton of experience from consulting, startups, and platform design.
We talked about:
✅ The skills data engineers actually need in consulting
✅ Why soft skills are underrated
✅ Tool debates (Databricks, Snowflake, SAP…)
✅ What consulting projects really look like
and more!
Listen to the episode here: https://creators.spotify.com/pod/show/andreaskayy/episodes/125-These-Skills-Get-You-a-Data-Consulting-Job--with-Tom-Schamberger-e35at82
Or watch on YouTube: https://youtu.be/jWaVtLNYNIw
Spotify for Creators
#125 These Skills Get You a Data Consulting Job – with Tom Schamberger by Plumbers of Data Science
In this episode, I’m talking with Tom Schamberger from the German consultancy msg. He leads their cloud data platform team and has a super interesting background: started coding Java at 12, co-founded startups, and now helps big companies design scalable…