Data Engineering Zoomcamp
Hi everyone!

Our stream will start today at 17:00 CET. That's in 4 hours 45 minutes

We will have a Q&A session today, and you can already ask your questions here

https://app.sli.do/event/r22cR71X67cQ7gHacVu7LG

We will share the link to the YouTube stream 5-10 minutes before the start
Our Slack invite link has expired

If you're trying to get into Slack, you can use this one:

http://join.datatalks.club
Module 2 is starting: Workflow Orchestration with Kestra

This week in Data Engineering Zoomcamp, we move into workflow orchestration using Kestra, an open-source, event-driven orchestrator.

Materials:

🔸 Videos
🔸 Quickstart and resources
🔸 Homework and submission form

If you run into setup issues, check the troubleshooting section in the Module 2 README and ask questions in Slack.
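If you want a feel for what an orchestrator does before diving into Kestra, here's a toy sketch of the core idea: running tasks in dependency order. This is not Kestra's API (Kestra flows are declared in YAML); the task names are made up, and only the Python standard library is used.

```python
# Toy illustration of workflow orchestration: execute tasks in an order
# that respects their dependencies. NOT Kestra -- just the underlying idea.
from graphlib import TopologicalSorter

# Hypothetical pipeline: extract -> transform -> load -> quality_check.
dependencies = {
    "transform": {"extract"},
    "load": {"transform"},
    "quality_check": {"load"},
}

# An orchestrator resolves this graph into a valid execution order.
execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)  # ['extract', 'transform', 'load', 'quality_check']
```

Kestra adds scheduling, retries, event triggers, and a UI on top of this basic scheduling logic.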
HW1 deadline is less than one day away

🔸 680 homework submissions
🔸 1,042 registered on the platform
🔸 25,000+ signed up for the Zoomcamp overall

If you’re registered and haven’t submitted yet, this is your nudge. Start small, submit something, and iterate.
It looks like there was an outage yesterday, sometime around 18:00-19:00 CET, that also affected our course management platform

If you had problems submitting your homework and still haven't submitted it, you can do it today
We've scored Homework 1

You can see the leaderboard here:

https://courses.datatalks.club/de-zoomcamp-2026/leaderboard

Great job everyone!
On Tuesday (Feb 3) we will have office hours

Time: 17:00 CET
Place: our YouTube channel

See you soon!
We start Module 3 on Data Warehousing and BigQuery

Reminder: the previous homework deadline is in less than 24 hours.

This module covers:

🔸 Data warehouse fundamentals and BigQuery basics
🔸 Query performance and internals (partitioning, clustering, best practices)
🔸 Machine learning in BigQuery, including SQL for ML
🔸 Deploying models from BigQuery using Docker

Learn here: https://github.com/DataTalksClub/data-engineering-zoomcamp/tree/main/03-data-warehouse

We're finalizing Homework 3 and will notify you when it's ready.
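Before the partitioning and clustering lessons, here's a toy model of why partitioning helps: a filter on the partition column lets the engine skip whole partitions instead of scanning every row. This is not BigQuery internals, just the idea, with invented columns and values.

```python
# Toy model of partition pruning. A partitioned table is stored as separate
# chunks per partition key; a filter on that key prunes chunks before scanning.
from collections import defaultdict

rows = [
    {"pickup_date": "2026-01-01", "fare": 10.0},
    {"pickup_date": "2026-01-01", "fare": 12.5},
    {"pickup_date": "2026-01-02", "fare": 8.0},
    {"pickup_date": "2026-01-03", "fare": 20.0},
]

# "Partition" the table by date.
partitions = defaultdict(list)
for row in rows:
    partitions[row["pickup_date"]].append(row)

target = "2026-01-01"
scanned = partitions[target]  # pruned scan: touches 2 rows, not 4
full_scan = [r for r in rows if r["pickup_date"] == target]  # touches all 4

print(len(scanned), len(rows))  # 2 4
```

In BigQuery, less data scanned also means a lower query cost, which is why partitioning on a commonly filtered column is a standard best practice.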
Reminder: today at 17:00 CET (in 10 hours) we have a live stream

We will share the link 5-10 minutes before the start

See you soon!
We start Module 4 on Analytics Engineering (dbt)

Reminder: the previous homework deadline is in less than 24 hours.

This module focuses on transforming warehouse data into analytical models using dbt.

You’ll build a dbt project on NYC yellow and green taxi data (2019-2020) and cover analytics engineering fundamentals, data modeling, and dbt in practice.

Setup options:

• Local: DuckDB + dbt Core (free, no prerequisites)
• Cloud: BigQuery + dbt Cloud (requires Module 3)
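To preview the kind of layering you'll build in dbt, here's a minimal illustration of staging and mart models using stdlib sqlite3. This is not dbt itself (dbt manages these models as separate SQL files with refs, tests, and materializations), and the table and column names are invented.

```python
# Layered modeling: raw source -> staging view -> reporting mart.
# A stdlib sqlite3 sketch of the transformations dbt compiles and runs.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_trips (service TEXT, fare REAL)")
conn.executemany(
    "INSERT INTO raw_trips VALUES (?, ?)",
    [("yellow", 10.0), ("green", 7.5), ("yellow", 12.0)],
)

# Staging model: clean and rename the raw source.
conn.execute("""
    CREATE VIEW stg_trips AS
    SELECT service AS service_type, fare AS fare_amount
    FROM raw_trips
""")

# Mart model: aggregate the staging layer for reporting.
mart = conn.execute("""
    SELECT service_type, SUM(fare_amount) AS total_fare
    FROM stg_trips
    GROUP BY service_type
    ORDER BY service_type
""").fetchall()
print(mart)  # [('green', 7.5), ('yellow', 22.0)]
```

In the module you'll express each layer as a dbt model over the taxi data instead of inline SQL like this.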

👉🏼 Materials

Homework 4:
Homework assignment
Submit here

Deadline for this homework: 17 February, 12:00 AM CET

Big thanks to Juan Manuel Perafan for updating this module, originally recorded by Victoria Perez Mola!
We received a few reports of people misusing the "Learning in Public" scoring. Instead of submitting links to social media posts sharing their progress, these participants were submitting random links they found on the internet while researching topics.

This is against the rules, so "Learning in Public" rewards have been disabled for these participants and they can no longer earn points in that category. I still encourage them to keep learning in public, because the practice is for their own benefit, but misusing the system just to collect points is not acceptable.

If you come across anyone on the leaderboard who is misusing this system by posting links that are not their personal posts about what they are learning in the course, please report it in the course channel.
This week, we're starting Module 5: Data Platforms.

Reminder: the previous homework deadline is in less than 24 hours.

In this module, you'll learn how modern data platforms manage the full data lifecycle, from ingestion to transformation, orchestration, data quality, and metadata.

We'll use Bruin as a practical example of a unified data platform. You'll install it, create a project, and build an end-to-end pipeline using NYC taxi data with a three-layer architecture: ingestion, staging, and reporting.

You'll also set up connections, configure pipelines in YAML, run Python and SQL assets, and see how orchestration and data quality fit into a single workflow.
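The three-layer shape described above can be sketched in a few lines of plain Python. This is not Bruin's configuration format (Bruin defines these as YAML/SQL/Python assets with quality checks attached); the function names and data are invented for illustration.

```python
# Toy three-layer pipeline: ingestion -> staging -> reporting.

def ingest():
    # Ingestion layer: land raw records as-is (note the string-typed counts).
    return [
        {"borough": "Manhattan", "trips": "120"},
        {"borough": "Brooklyn", "trips": "80"},
    ]

def stage(raw):
    # Staging layer: enforce types and run a simple data-quality check.
    staged = [{"borough": r["borough"], "trips": int(r["trips"])} for r in raw]
    assert all(r["trips"] >= 0 for r in staged), "quality check failed"
    return staged

def report(staged):
    # Reporting layer: aggregate for downstream consumers.
    return {"total_trips": sum(r["trips"] for r in staged)}

result = report(stage(ingest()))
print(result)  # {'total_trips': 200}
```

A platform like Bruin wires layers like these together declaratively and runs the quality checks for you as part of the pipeline.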

Homework:

Homework 5
• Deadline: 1st March, 12:00 AM CET

🎁 Bonus: Bruin is offering a Claude Pro subscription to participants who complete their final project using Bruin.
Reminder: tomorrow we’re running a hands-on workshop on AI-Assisted Data Ingestion with dlt as part of the Data Engineering Zoomcamp.

Date: Tuesday, February 17
Time: 4:30 PM CET
Place: Live on YouTube

Join Aashish Nair, who’ll lead the session, to build a reliable ingestion pipeline into a data warehouse (for example, Snowflake) using dlt from dltHub.

We’ll go through:

• Extracting data from APIs, files, and databases
• Normalizing it into consistent schemas
• Writing it to a warehouse
• Using LLMs to accelerate pipeline development
• Validating data and schema changes with the dlt dashboard and dlt MCP

By the end, you’ll understand how to design maintainable ingestion pipelines and use AI and validation tools to build them faster.
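The extract, normalize, and load steps above look roughly like this stdlib-only sketch. dlt automates the hard parts (schema inference, normalization, incremental loading, destinations like Snowflake); the record shapes and names here are invented for illustration and this is not dlt's API.

```python
# Toy extract -> normalize -> load pipeline.

def extract():
    # Pretend API payloads: inconsistent key types across records.
    return [{"id": "1", "amount": "9.99"}, {"id": 2, "amount": 5}]

def normalize(records):
    # Coerce every record into one consistent schema.
    return [{"id": int(r["id"]), "amount": float(r["amount"])} for r in records]

def load(rows, warehouse):
    # Append into a destination table (a dict standing in for a warehouse).
    warehouse.setdefault("payments", []).extend(rows)

warehouse = {}
load(normalize(extract()), warehouse)
print(warehouse["payments"])
# [{'id': 1, 'amount': 9.99}, {'id': 2, 'amount': 5.0}]
```

In the workshop you'll see how dlt replaces this hand-written plumbing and how LLMs can speed up writing the extraction code.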

Register to get the join link:
https://luma.com/hzis1yzp