Data Engineering Zoomcamp
We also recently had a workshop with dlt on AI-assisted data ingestion.

Watch the recording and check out the code if you missed it.

Practice what you learned in the homework assignment, and submit it here.
👍5 ❤3 🔥3
Today at 16:00 CET (in 90 minutes) we have office hours

You can use this link if you want to ask any question in advance:

https://app.sli.do/event/r22cR71X67cQ7gHacVu7LG

See you soon!
❀4
This is the link I promised to share:

https://getbruin.com/zoomcamp-project/

You get a chance to win a Claude Code subscription if you use Bruin in your project.
👍12 ❤3
The course management platform is currently down

I'm working on restoring it
❀19πŸ™3πŸ”₯1
Update: I've spent one hour on a call with AWS and they said they needed more time to restore the database.

Hopefully tomorrow it'll get sorted out
❀21πŸ‘10😨9
πŸ”₯8
The course management platform is back!

If you submitted any homework yesterday, the data may be lost. Please check your homework submissions.

Apologies for the inconvenience!

I'm working on the incident report and will publish it soon. TLDR: don't let your AI agents do terraform apply -auto-approve
❀24🀣19πŸ”₯3
This week, we're starting Module 6: Batch Processing.

Reminder: the previous homework deadline is in less than 24 hours.

In this module, you'll learn how batch processing works with Spark and PySpark.

You'll cover:

• Batch processing fundamentals and Spark basics
• Installing and running Spark locally or in Colab
• Working with Spark SQL and DataFrames
• Handling schemas and processing NYC taxi data
• How Spark clusters, joins, and groupBy work internally
• Running Spark in the cloud with Dataproc and BigQuery
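
To give a feel for the "groupBy internals" topic in the list above, here is a minimal pure-Python sketch of a two-stage aggregation: each partition is summed locally, the partial sums are routed to a reducer by key, and then they are merged. This is a conceptual model only, not Spark's actual implementation or API. The function names, the `(zone_id, fare)` records, and the modulo routing are all assumptions made for illustration.

```python
from collections import defaultdict


def partial_aggregate(partition):
    """Stage 1: sum values per key inside one partition (no data movement)."""
    totals = defaultdict(float)
    for key, amount in partition:
        totals[key] += amount
    return dict(totals)


def shuffle(partials, num_reducers):
    """Route every partial sum for a key to the same reducer bucket."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for partial in partials:
        for key, subtotal in partial.items():
            # Integer keys are assumed, so a simple modulo picks the bucket.
            buckets[key % num_reducers][key].append(subtotal)
    return buckets


def final_aggregate(bucket):
    """Stage 2: merge the partial sums that arrived at one reducer."""
    return {key: sum(subtotals) for key, subtotals in bucket.items()}


def group_by_sum(partitions, num_reducers=2):
    partials = [partial_aggregate(p) for p in partitions]
    merged = {}
    for bucket in shuffle(partials, num_reducers):
        merged.update(final_aggregate(bucket))
    return merged


# Hypothetical (zone_id, fare) records spread over three partitions.
partitions = [
    [(1, 10.0), (2, 5.0), (1, 2.5)],
    [(2, 7.5), (3, 4.0)],
    [(1, 1.0), (3, 6.0)],
]
result = group_by_sum(partitions)
```

Summing inside each partition first means only one partial value per key per partition has to be moved during the shuffle, rather than every raw record.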

Homework deadline: 10 March, 12 AM CET

We also recently had a workshop with dlt on AI-assisted data ingestion. Watch the recording and check out the code if you missed it. Practice what you learned in the homework assignment, and submit it here.
❀14πŸ‘2
We're starting module 7 on stream processing.

Reminder: the previous homework deadline is in less than 24 hours.

The materials were created by Zach Wilson, who ran a Flink stream for the course last year. Alexey recorded an updated Apache Flink workshop to reflect support for Flink 2.x and modern Python versions (3.12, 3.9, 3.8).

It covers:

• Streaming fundamentals in Data Engineering Zoomcamp
• Kafka/Redpanda, Python producers and consumers
• Writing streaming events to PostgreSQL
• Apache Flink setup with Docker
• Flink jobs for stream processing
• Windowing, watermarks, and late events
• Real-time aggregations with Flink
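
As a rough illustration of the "windowing, watermarks, and late events" item above, the following pure-Python sketch sums values in tumbling windows and drops events that arrive after the watermark has passed their window. It is not Flink code and does not use the Flink API. The window size, the allowed lateness, the `process` function, and the sample events are all assumptions chosen for the example.

```python
from collections import defaultdict

WINDOW_SIZE = 10      # seconds per tumbling window (assumed value)
ALLOWED_LATENESS = 5  # watermark trails the max seen event time by this much (assumed)


def process(events):
    """Sum values per tumbling window.

    events: iterable of (event_time_seconds, value) in arrival order.
    Returns (closed, late): closed maps window start -> sum,
    late lists the events that were dropped.
    """
    open_windows = defaultdict(int)
    closed = {}
    late = []
    max_event_time = float("-inf")

    for event_time, value in events:
        max_event_time = max(max_event_time, event_time)
        watermark = max_event_time - ALLOWED_LATENESS
        window_start = event_time - event_time % WINDOW_SIZE

        if window_start + WINDOW_SIZE <= watermark:
            # The watermark already passed this window's end: the event is late.
            late.append((event_time, value))
        else:
            open_windows[window_start] += value

        # Emit every open window whose end the watermark has reached.
        for start in sorted(open_windows):
            if start + WINDOW_SIZE <= watermark:
                closed[start] = open_windows.pop(start)

    # End of input: flush the windows that are still open.
    for start in sorted(open_windows):
        closed[start] = open_windows.pop(start)
    return closed, late


# Hypothetical stream: (3, 4) is out of order but within lateness; (8, 6) is too late.
events = [(1, 1), (4, 2), (12, 3), (3, 4), (16, 5), (8, 6), (21, 7)]
closed, late = process(events)
```

In this run the event at time 3 arrives after the event at time 12 but is still counted, because the watermark (12 - 5 = 7) has not yet reached the end of the window covering 0 to 10. The event at time 8 arrives once the watermark is at 11, so it is dropped.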

Homework deadline: 17 March, 12 AM CET

Learn here: https://github.com/DataTalksClub/data-engineering-zoomcamp/tree/main/07-streaming
❀16πŸ”₯1
It’s time to apply everything from the course.

Build your final project with a complete end-to-end data pipeline.

It takes you through the full workflow:

🔸 Choose a dataset you're interested in
🔸 Build a pipeline to ingest data into a data lake
🔸 Move the data to a data warehouse
🔸 Transform the data to prepare it for analysis
🔸 Build a dashboard to visualize the results

🎥 Watch the Projects how-to video for the full walkthrough and start building.

🏆 Tip: Use Bruin in your project to participate in the competition and win prizes. Details below.
❤12 👍6 🔥1
Build your project using Bruin for ingestion, transformation, orchestration, and analysis, share it with the community, and compete for prizes.

Prizes:

🔸 Mac Mini for an outstanding project
🔸 1 year Claude Pro for the top 3 projects
🔸 1 month Claude Pro for participants

To participate:

🔸 Build your Zoomcamp project using Bruin
🔸 Publish it on GitHub with a README
🔸 Share it in #projects on Slack

Winners will be determined by community votes on Slack.

Learn more here: https://getbruin.com/zoomcamp-project/
❀29πŸ”₯8πŸ‘7
A quick update: the Bruin project competition is now open to everyone!

This means your submission does not have to be the same as your Zoomcamp final project.

Even if you are not participating in the course, or if you are using other tech for your Zoomcamp final project, you can still build a separate project with Bruin, submit it to the competition, and compete for prizes.

Prizes:

🔸 Mac Mini for an outstanding project
🔸 1 year Claude Pro for the top 3 projects
🔸 1 month Claude Pro for participants

Deadline: Monday, June 1st, 12:00 UTC

More details here: getbruin.com/competition
❀26πŸ‘7πŸ”₯5
Great job working on your projects!

Now it's time to learn from your peers

If you submitted your project for attempt 1, you will find your review assignments here:

https://courses.datatalks.club/de-zoomcamp-2026/project/project1/eval

If not, you still have time for attempt 2. Have fun!
❤26 👍5
We have just scored project attempt 1

Congratulations to the 269 course participants who passed it! We will release the certificates for you later along with the second batch

If you haven't passed it, you can improve it and submit one more time

Note that if you did pass the project, submitting it again (even with improvements) is considered self-plagiarism. Please don't do it. But you can submit another project if you want

Also if you haven't made a submission for attempt 1, attempt 2 is still open:

https://courses.datatalks.club/de-zoomcamp-2026/project/project2

Have fun building!
❀33πŸ”₯1
The form for submitting the second attempt is now closed.

Now it's time to learn from your peers!

If you submitted a project for attempt 2, you'll find the project review assignments here:

https://courses.datatalks.club/de-zoomcamp-2026/project/project2/eval

Have fun!
❀22