Hi everyone!
Our stream will start today at 17:00 CET. That's in 4 hours 45 minutes
We will have a Q&A session today, and you can already ask your questions here:
https://app.sli.do/event/r22cR71X67cQ7gHacVu7LG
We will share the link to the YouTube stream 5-10 minutes before the start
We're starting in 5 minutes!
Stream: https://www.youtube.com/watch?v=JgspdlKXS-w
Questions: https://app.sli.do/event/r22cR71X67cQ7gHacVu7LG
Join now or watch the recording later!
YouTube
Data Engineering Zoomcamp 2026 Launch Stream
In this talk, Alexey Grigorev, Founder of Data Talks Club, kicks off the fifth anniversary edition of the Data Engineering Zoomcamp 2026. We explore the fundamental roadmap for becoming a professional data engineer from mastering containerization with Docker…
The documentation website:
https://datatalks.club/docs/
Everything is on GitHub, so if you want to help us improve the docs, we welcome contributions!
https://github.com/DataTalksClub/docs
DataTalks.Club Zoomcamps Notes and Resources
A collection of notes and resources for the DataTalks.Club Zoomcamps, our free courses.
Our Slack invite link has expired.
If you're trying to get into Slack, you can use this one:
http://join.datatalks.club
Module 2 is starting: Workflow Orchestration with Kestra
This week in the Data Engineering Zoomcamp, we move into workflow orchestration using Kestra, an open-source, event-driven orchestrator.
Materials:
🔸 Videos
🔸 Quickstart and resources
🔸 Homework and submission form
If you run into setup issues, check the troubleshooting section in the Module 2 README and ask questions in Slack.
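For a first taste of Kestra, a flow is just a YAML file with an id, a namespace, and a list of tasks. A minimal sketch (the id, namespace, and message are made up for illustration, and the task type may differ between Kestra versions; see the Module 2 materials for the actual flows used in the course):

```yaml
# Hypothetical "hello world" flow; names are illustrative only.
id: hello_zoomcamp
namespace: de_zoomcamp
tasks:
  - id: say_hello
    type: io.kestra.plugin.core.log.Log
    message: Hello from Kestra!
```

Save it in the Kestra UI, hit Execute, and you can watch each task's logs and state in the execution view.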
⏰ HW1 deadline is less than one day away
🔸 680 homework submissions
🔸 1,042 registered on the platform
🔸 25,000+ signed up for the Zoomcamp overall
If you’re registered and haven’t submitted yet, this is your nudge. Start small, submit something, and iterate.
It looks like there was an outage yesterday, around 18:00-19:00 CET, that also affected our course management platform.
If you had problems submitting your homework and still haven't submitted it, you can do it today.
We scored homework 1
You can see the leaderboard here:
https://courses.datatalks.club/de-zoomcamp-2026/leaderboard
Great job everyone!
On Tuesday (Feb 3) we will have office hours
Time: 17:00 CET
Place: our YouTube channel
See you soon!
Also, as part of the course, we will have a workshop with dlt (data load tool).
If you want to make sure you don't miss it, you can sign up here:
https://luma.com/hzis1yzp
Luma
From APIs to Warehouses: AI-Assisted Data Ingestion with dlt · Luma
This hands-on workshop focuses on building reliable data ingestion pipelines to data warehouses (for example, Snowflake) using dlt (data load tool), enhanced…
We start Module 3 on Data Warehousing and BigQuery
Reminder: the previous homework deadline is in less than 24 hours.
This module covers:
🔸 Data warehouse fundamentals and BigQuery basics
🔸 Query performance and internals (partitioning, clustering, best practices)
🔸 Machine learning in BigQuery, including SQL for ML
🔸 Deploying models from BigQuery using Docker
Learn here: https://github.com/DataTalksClub/data-engineering-zoomcamp/tree/main/03-data-warehouse
We're finalizing Homework 3 and will notify you when it's ready.
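As a quick illustration of the partitioning and clustering ideas above, here is a sketch of BigQuery DDL (the dataset, table, and column names are placeholders, not the ones from the course materials):

```sql
-- Hypothetical names; replace with your own dataset and source table.
CREATE OR REPLACE TABLE my_dataset.yellow_trips_partitioned
PARTITION BY DATE(tpep_pickup_datetime)
CLUSTER BY PULocationID
AS
SELECT * FROM my_dataset.yellow_trips_raw;
```

Partitioning by pickup date lets BigQuery prune entire partitions when a query filters on that date, and clustering sorts data within each partition so queries filtering on the cluster column scan fewer bytes.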
GitHub
data-engineering-zoomcamp/03-data-warehouse at main · DataTalksClub/data-engineering-zoomcamp
Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here 👇🏼 - DataTalksClub/data-engineering-zoomcamp
Reminder: today at 17:00 CET (in 10 hours from now) we have a live stream
We will share the link 5-10 minutes before the start
See you soon!
Homework 3:
https://github.com/DataTalksClub/data-engineering-zoomcamp/blob/main/cohorts/2026/03-data-warehouse/homework.md
Have fun!
Yesterday I (Alexey) answered some of the questions that were left unanswered during the launch stream:
https://www.youtube.com/watch?v=C-akHAp3XM0
YouTube
Data Engineering Zoomcamp 2026 - Q&A Bonus Session
In this Data Engineering Zoomcamp Q&A Bonus Session, Alexey Grigorev (Founder of DataTalks.Club) dives into a Slido backlog of over 300 community questions. He shares expert insights on the logistics of the 2026 bootcamp, the financial sustainability of free…
We start Module 4 on Analytics Engineering (dbt)
Reminder: the previous homework deadline is in less than 24 hours.
This module focuses on transforming warehouse data into analytical models using dbt.
You’ll build a dbt project on NYC yellow and green taxi data (2019-2020) and cover analytics engineering fundamentals, data modeling, and dbt in practice.
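If you haven't seen dbt before: a model is just a SELECT statement in a .sql file, and dbt materializes it as a table or view, wiring dependencies between models via ref(). A tiny sketch (the model and column names here are illustrative, not the ones used in the course project):

```sql
-- models/fct_trips.sql (hypothetical model name)
-- Combines staged yellow and green taxi data into one fact model.
select pickup_datetime, dropoff_datetime, total_amount, 'yellow' as service_type
from {{ ref('stg_yellow_tripdata') }}
union all
select pickup_datetime, dropoff_datetime, total_amount, 'green' as service_type
from {{ ref('stg_green_tripdata') }}
```

Because the dependencies are declared with ref(), dbt builds the staging models first and runs everything in the right order with a single `dbt run`.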
Setup options:
• Local: DuckDB + dbt Core (free, no prerequisites)
• Cloud: BigQuery + dbt Cloud (requires Module 3)
👉🏼 Materials
Homework 4:
• Homework assignment
• Submit here
Deadline for this homework: 17 February, 12 AM CET
Big thanks to Juan Manuel Perafan for updating this module, originally recorded by Victoria Perez Mola!
We received reports that some participants were misusing the "Learning in Public" scoring: they were submitting links that were not social media posts sharing their progress, but rather random links they found on the internet while researching topics.
This is against the rules. As a consequence, these participants can no longer earn points for "Learning in Public." I still want to encourage them to continue the practice, because it is for their own benefit, but misusing the system just for the sake of getting points is not acceptable. That is why the "Learning in Public" rewards have been disabled for them.
If you come across anyone on the leaderboard who is misusing this system by posting links that are not their personal posts about what they are learning in the course, please report it in the course channel.
This week, we're starting Module 5: Data Platforms.
Reminder: the previous homework deadline is in less than 24 hours.
In this module, you'll learn how modern data platforms manage the full data lifecycle, from ingestion to transformation, orchestration, data quality, and metadata.
We'll use Bruin as a practical example of a unified data platform. You'll install it, create a project, and build an end-to-end pipeline using NYC taxi data with a three-layer architecture: ingestion, staging, and reporting.
You'll also set up connections, configure pipelines in YAML, run Python and SQL assets, and see how orchestration and data quality fit into a single workflow.
Homework:
• Homework 5
• Deadline: 1st March, 12:00 AM CET
🎁 Bonus: Bruin is offering a Claude Pro subscription to participants who complete their final project using Bruin.
Reminder: tomorrow we’re running a hands-on workshop on AI-Assisted Data Ingestion with dlt as part of the Data Engineering Zoomcamp.
Date: Tuesday, February 17
Time: 4:30 PM CET
Place: Live on YouTube
Join Aashish Nair, who’ll lead the session, to build a reliable ingestion pipeline into a data warehouse (for example, Snowflake) using dlt from dltHub.
We’ll go through:
• Extracting data from APIs, files, and databases
• Normalizing it into consistent schemas
• Writing it to a warehouse
• Using LLMs to accelerate pipeline development
• Validating data and schema changes with the dlt dashboard and dlt MCP
By the end, you’ll understand how to design maintainable ingestion pipelines and use AI and validation tools to build them faster.
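To give a feel for the "normalizing into consistent schemas" step before the workshop, here is a stdlib-only sketch of the idea. Note: this is not dlt's actual API; dlt performs this kind of normalization automatically when you run a pipeline.

```python
# Toy version of what a tool like dlt does during normalization:
# flatten nested API records into flat rows with consistent column names.
def flatten(record: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into a single-level row, joining keys with '__'."""
    row = {}
    for key, value in record.items():
        name = f"{prefix}__{key}" if prefix else key
        if isinstance(value, dict):
            row.update(flatten(value, name))
        else:
            row[name] = value
    return row

api_records = [
    {"id": 1, "rider": {"name": "Ann", "city": "Berlin"}},
    {"id": 2, "rider": {"name": "Bob", "city": "Paris"}, "tip": 2.5},
]
rows = [flatten(r) for r in api_records]
# Every row now has flat, predictable column names like "rider__name",
# ready to be loaded into warehouse columns.
```

Real ingestion also has to handle schema evolution (the optional "tip" field above), which is exactly the kind of thing the workshop covers with dlt's schema inference and validation tooling.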
Register to get the join link:
https://luma.com/hzis1yzp
We're starting the workshop!
Stream: https://www.youtube.com/watch?v=5eMytPBgmVs
Code: https://github.com/DataTalksClub/data-engineering-zoomcamp/blob/main/cohorts/2026/workshops/dlt.md
Watch now or catch the recording later!
YouTube
From APIs to Warehouses: AI-Assisted Data Ingestion with dlt - Aashish Nair
In this workshop, Aashish from dltHub demonstrates a new standard for data engineering. We move away from manual scripting and toward an AI-assisted workflow that uses the open-source dlt (data load tool) library to build robust, self-healing pipelines in…