We also recently had a workshop with dlt on AI-assisted data ingestion.
Watch the recording and check out the code if you missed it.
Practice what you learned in the homework assignment, and submit it here.
Today at 16:00 CET (in 90 minutes) we have office hours
You can use this link if you want to ask any question in advance:
https://app.sli.do/event/r22cR71X67cQ7gHacVu7LG
See you soon!
We're starting in 4 minutes:
Stream: https://www.youtube.com/watch?v=o28p3Y2CS9A
Questions: https://app.sli.do/event/r22cR71X67cQ7gHacVu7LG
Watch now or later in recording
YouTube
Data Engineering Zoomcamp 2026 - Office Hours (Bruin and Data Platforms)
In this session, Alexey Grigorev is joined by Arsalan from the Bruin team to discuss the evolution of data platforms. We explore how Bruin integrates ingestion, transformation, orchestration, and data quality into a single interface, offering an alternative…
This is the link I promised to share:
https://getbruin.com/zoomcamp-project/
You get a chance to win Claude Code subscription if you use Bruin in your project
Bruin
Data Engineering Project Competition | Bruin
Build data pipelines with Bruin and compete for prizes.
The course management platform is currently down
I'm working on restoring it
Update: I've spent one hour on a call with AWS and they said they needed more time to restore the database.
Hopefully tomorrow it'll get sorted out
The course management platform is back!
If you submitted any homework yesterday, the data may be lost. Please check your homework submissions.
Apologies for the inconvenience!
I'm working on the incident report and will publish it soon. TLDR: don't let your AI agents do
terraform apply -auto-approve
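One way to act on that TLDR is to check commands before an agent is allowed to run them. The sketch below is only an illustration: the function name `needs_human_approval` and the choice to guard `apply` and `destroy` are assumptions, not something taken from the incident report.

```python
import shlex

# Hypothetical policy: these subcommands change real infrastructure.
GUARDED_SUBCOMMANDS = {"apply", "destroy"}


def needs_human_approval(command: str) -> bool:
    """Return True for a Terraform apply/destroy that skips the confirmation prompt."""
    args = shlex.split(command)
    if not args or args[0] != "terraform":
        return False

    has_guarded_subcommand = any(arg in GUARDED_SUBCOMMANDS for arg in args[1:])

    # Match the flag written with one or two leading dashes, with or without
    # "=value". Any value (even "=false") is treated as risky, to stay conservative.
    skips_prompt = any(
        arg.startswith("-") and arg.lstrip("-").split("=", 1)[0] == "auto-approve"
        for arg in args[1:]
    )
    return has_guarded_subcommand and skips_prompt
```

A wrapper around the agent's shell tool could call this function and refuse to run the command, or ask a person first, whenever it returns True.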
This week, we're starting Module 6: Batch Processing.
Reminder: the previous homework deadline is in less than 24 hours.
In this module, you'll learn how batch processing works with Spark and PySpark.
You'll cover:
• Batch processing fundamentals and Spark basics
• Installing and running Spark locally or in Colab
• Working with Spark SQL and DataFrames
• Handling schemas and processing NYC taxi data
• How Spark clusters, joins, and groupBy work internally
• Running Spark in the cloud with Dataproc and BigQuery
Homework deadline: 10 March, 12 AM CET
We also recently had a workshop with dlt on AI-assisted data ingestion. Watch the recording and check out the code if you missed it. Practice what you learned in the homework assignment, and submit it here.
GitHub
data-engineering-zoomcamp/06-batch at main · DataTalksClub/data-engineering-zoomcamp
Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here - DataTalksClub/data-engineering-zoomcamp
We uploaded the video about creating projects to YouTube, as some of you reported having problems watching it on Loom
Here it is: https://www.youtube.com/watch?v=BL0E8xO8OnE
We'll also update it in the repo
And by the way, don't forget about our docs: https://datatalks.club/docs/courses/data-engineering-zoomcamp/
There's a lot of useful information there
YouTube
DTC Zoomcamp projects
Connect with DataTalks.Club:
- Join the community - https://datatalks.club/slack.html
- Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2Fs…
We're starting our stream about streaming!
This is going to be part of module 7 on streaming: a reworked version of last year's workshop, updated for the latest versions of PyFlink
Stream: https://www.youtube.com/watch?v=YDUgFeHQzJU
Workshop: https://github.com/DataTalksClub/data-engineering-zoomcamp/tree/main/07-streaming/workshop
Watch now or later in recording!
YouTube
PyFlink Stream Processing Tutorial: Build a Real-Time Pipeline with Kafka, Redpanda and Python
In this workshop, Alexey Grigorev breaks down the complexities of real-time data engineering, moving from basic Python-based Kafka consumers to enterprise-grade Apache Flink pipelines. This workshop, part of the Data Engineering Zoomcamp, provides a hands…
https://alexeyondata.substack.com/p/how-i-dropped-our-production-database
The post about the database incident last week
Substack
How I Dropped Our Production Database and Now Pay 10% More for AWS
I'm working on expanding the AI Shipping Labs website and wanted to migrate its current version from static GitHub Pages to AWS.
We're starting module 7 on stream processing.
Reminder: the previous homework deadline is in less than 24 hours.
The materials were created by Zach Wilson, who ran a Flink stream for the course last year. Alexey recorded an updated Apache Flink workshop to reflect support for Flink 2.x and modern Python versions (3.12, 3.9, 3.8).
It covers:
• Streaming fundamentals in Data Engineering Zoomcamp
• Kafka/Redpanda, Python producers and consumers
• Writing streaming events to PostgreSQL
• Apache Flink setup with Docker
• Flink jobs for stream processing
• Windowing, watermarks, and late events
• Real-time aggregations with Flink
Homework deadline: 17 March, 12 AM CET
Learn here: https://github.com/DataTalksClub/data-engineering-zoomcamp/tree/main/07-streaming
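To give a feel for the "windowing, watermarks, and late events" topic before you open the materials, here is a minimal sketch in plain Python. It is not PyFlink code and does not use any Flink API. The 60-second window, the 10-second lateness bound, and the function name are all assumptions made for illustration.

```python
from collections import defaultdict

WINDOW_SIZE = 60       # seconds per tumbling window (assumed value)
ALLOWED_LATENESS = 10  # how far the watermark trails the newest event (assumed value)


def tumbling_window_counts(events):
    """Count events per tumbling window.

    `events` is an iterable of (event_time_seconds, payload) in arrival order.
    Returns (closed_windows, still_open_windows, late_events).
    """
    open_windows = defaultdict(int)  # window start -> number of events so far
    closed_windows = {}
    late_events = []
    watermark = float("-inf")

    for event_time, payload in events:
        window_start = event_time - event_time % WINDOW_SIZE

        # The watermark already passed the end of this window: the event is late.
        if window_start + WINDOW_SIZE <= watermark:
            late_events.append((event_time, payload))
            continue

        open_windows[window_start] += 1

        # The watermark only moves forward, trailing the largest event time seen.
        watermark = max(watermark, event_time - ALLOWED_LATENESS)

        # Finalize every window whose end is at or before the watermark.
        for start in sorted(open_windows):
            if start + WINDOW_SIZE <= watermark:
                closed_windows[start] = open_windows.pop(start)

    return closed_windows, dict(open_windows), late_events
```

An event that arrives out of order is still counted if its window has not been closed yet. Once the watermark passes the end of the window, the event is set aside as late instead of changing a result that was already emitted.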
GitHub
data-engineering-zoomcamp/07-streaming at main · DataTalksClub/data-engineering-zoomcamp
I just realized we haven't shared the homework for streaming
https://github.com/DataTalksClub/data-engineering-zoomcamp/blob/main/cohorts/2026/07-streaming/homework.md
The deadline is moved to 20 March
GitHub
data-engineering-zoomcamp/cohorts/2026/07-streaming/homework.md at main · DataTalksClub/data-engineering-zoomcamp
It's time to apply everything from the course.
Build your final project with a complete end-to-end data pipeline.
It takes you through the full workflow:
• Choose a dataset you're interested in
• Build a pipeline to ingest data into a data lake
• Move the data to a data warehouse
• Transform the data to prepare it for analysis
• Build a dashboard to visualize the results
Watch the Projects how-to video for the full walkthrough and start building.
Tip: Use Bruin in your project to participate in the competition and win prizes. Details below.
Build your project using Bruin for ingestion, transformation, orchestration, and analysis, share it with the community, and compete for prizes.
Prizes:
• Mac Mini for an outstanding project
• 1 year Claude Pro for the top 3 projects
• 1 month Claude Pro for participants
To participate:
• Build your Zoomcamp project using Bruin
• Publish it on GitHub with a README
• Share it in #projects on Slack
Winners will be determined by community votes on Slack.
Learn more here: https://getbruin.com/zoomcamp-project/
A quick update: the Bruin project competition is now open to everyone!
This means your submission does not have to be the same as your Zoomcamp final project.
Even if you are not participating in the course, or if you are using other tech for your Zoomcamp final project, you can still build a separate project with Bruin, submit it to the competition, and compete for prizes.
Prizes:
• Mac Mini for an outstanding project
• 1 year Claude Pro for the top 3 projects
• 1 month Claude Pro for participants
Deadline: Monday, June 1st, 12:00 UTC
More details here: getbruin.com/competition
Great job working on your projects!
Now it's time to learn from your peers
If you submitted your project for attempt 1, you will find your review assignments here:
https://courses.datatalks.club/de-zoomcamp-2026/project/project1/eval
If not - you still have time for attempt 2. Have fun!
We have just scored project attempt 1
Congratulations to the 269 course participants who passed it! We will release the certificates for you later along with the second batch
If you haven't passed it, you can improve it and submit one more time
Note that if you did pass the project, submitting it again (even with improvements) is considered self-plagiarism. Please don't do it. But you can submit another project if you want
Also if you haven't made a submission for attempt 1, attempt 2 is still open:
https://courses.datatalks.club/de-zoomcamp-2026/project/project2
Have fun building!
The form for submitting the second attempt is now closed.
Now it's time to learn from your peers!
If you submitted a project for attempt 2, you'll find the project review assignments here:
https://courses.datatalks.club/de-zoomcamp-2026/project/project2/eval
Have fun!