Andreas Kretz - Learn Data Engineering
Learn Data Engineering with Andreas Kretz
The new Machine Learning & Containerization on AWS project is online!! 🚀
As always, active members of the Data Engineering Academy already have access to the course.

What’s the course about:
In this example project you learn how to create a data pipeline where you pull data from the Twitter API, analyze, store and visualize it.
You will host your Machine Learning algorithm on AWS using Lambda and set up your own Postgres database with RDS. You create a Streamlit dashboard and gain experience hosting it using Elastic Container Registry (ECR) and Elastic Container Service (ECS).

This project also gives you insights on how to handle dependency management with Poetry.
Have fun!

Course Content:
Set up and configure the Twitter API
Launching RDS Postgres DB
Create S3 bucket for raw storage
Create ML Lambda that extracts & analyses Tweets
Schedule Lambda with EventBridge
Create Streamlit visualization app
Dependency management with Poetry & create Docker image
Install & configure AWS CLI
Set up Elastic Container Registry (ECR)
Create Elastic Container Service (ECS) Fargate cluster
Run our Streamlit app as ECS task

Learn how to build an NLP pipeline and host containers in the cloud

Link to the course in the Data Engineering Academy - Trusted by over 500 students: https://learndataengineering.com/p/ml-on-aws
Join the Data Engineering Discord server! https://discord.gg/Wxy2mQA7Fy
Andreas Kretz - Learn Data Engineering pinned «Chat with Andreas about how the Academy and the Full-Stack Coaching can help you: https://t.me/+V87Cha4h1_VGR9ac»
Channel name was changed to «Andreas Kretz - Learn Data Engineering»
New Podcast episode!

In today's episode, I’m talking with Tom Schamberger from msg. He leads their cloud data platform team and brings a ton of experience from consulting, startups, and platform design.
We talked about:
The skills data engineers actually need in consulting
Why soft skills are underrated
Tool debates (Databricks, Snowflake, SAP…)
What consulting projects really look like
and more!

Listen to the episode here: https://creators.spotify.com/pod/show/andreaskayy/episodes/125-These-Skills-Get-You-a-Data-Consulting-Job--with-Tom-Schamberger-e35at82
Or watch on YouTube: https://youtu.be/jWaVtLNYNIw
Some people say I don't do free content. 🤡
Here's YouTube video #477 and a FREE playground app to experience Kafka and Spark streaming.

I created this playground to show you what happens when:
- You increase the produced messages
- Kafka runs out of resources
- Spark runs out of resources
- The database you use gets into trouble and is overwhelmed

It will help you understand the right variables to look at if you have problems.
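
To get a feeling for the first scenario before you open the playground, here is a tiny sketch in plain Python. It is not the playground's code and it doesn't use Kafka or Spark at all. The function name simulate_lag and the rates are made up for illustration. It only assumes that a consumer can process a fixed number of messages per second, and shows how the backlog (consumer lag) behaves once the producer is faster than that.

```python
def simulate_lag(produce_rate, consume_rate, seconds):
    """Return the backlog of unprocessed messages after each second.

    Hypothetical toy model: the producer adds `produce_rate` messages per
    second and the consumer can process at most `consume_rate` per second.
    """
    backlog = 0
    history = []
    for _ in range(seconds):
        backlog += produce_rate                  # new messages arrive
        processed = min(backlog, consume_rate)   # consumer is capped by its capacity
        backlog -= processed                     # whatever is left is the lag
        history.append(backlog)
    return history


# Consumer has spare capacity: the backlog stays at zero.
print(simulate_lag(produce_rate=100, consume_rate=150, seconds=5))

# Producer is faster than the consumer: the backlog grows every second.
print(simulate_lag(produce_rate=200, consume_rate=150, seconds=5))
```

The variable to watch here is the backlog over time, not the raw message rate: a flat backlog means the system keeps up, a steadily growing one means something downstream is out of resources.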

🖥️ Watch the video here: https://youtu.be/ScG4spWqMj4?si=2Q-v8pE1nF40bJqj
🛝 Check out the playground app on GitHub Pages: https://bit.ly/4mbsLqY

🎥 The whole topic also fits very well with our FREE live stream tomorrow:
Add it to your calendar: https://www.addevent.com/event/SR26261795

Have fun and let me know what you think about the playground in the comments.

PS: I completely vibe coded this app, so there are still a lot of things missing or not 100% right. Let me know your opinions. If enough people like the playground I'll continue extending it 🙂