kestra-io/kestra
Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
Language:Java
Total stars: 5704
Stars trend:
26 Mar 2024
8am ▏ +1
9am +0
10am +0
11am +0
12pm +0
1pm +0
2pm ▏ +1
3pm ███▏ +25
4pm ██▏ +17
5pm ██▎ +18
6pm █▋ +13
#java
#data, #dataengineering, #dataintegration, #dataorchestration, #dataorchestrator, #datapipeline, #dataquality, #elt, #etl, #lowcode, #orchestration, #pipeline, #reverseetl, #scheduler, #workflow, #workflowengine
Nike-Inc/koheesio
Python framework for building efficient data pipelines. It promotes modularity and collaboration, enabling the creation of complex pipelines from simple, reusable components.
Language:Python
Total stars: 162
Stars trend:
3 Jun 2024
10pm ▏ +1
11pm +0
4 Jun 2024
12am +0
1am +0
2am ▏ +1
3am +0
4am +0
5am ▋ +5
6am ████▋ +37
7am ██▌ +20
8am ██▉ +23
#python
#dataengineering, #deltalake, #pydantic, #pyspark, #python
eugeneyan/applied-ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Language:
Total stars: 26245
Stars trend:
18 Jun 2024
6pm ▍ +3
7pm █▌ +12
8pm █▍ +11
9pm █▎ +10
10pm ▌ +4
11pm █▏ +9
19 Jun 2024
12am ▋ +5
1am ▊ +6
2am ▋ +5
3am ▊ +6
4am █ +8
#applieddatascience, #appliedmachinelearning, #computervision, #datadiscovery, #dataengineering, #dataquality, #datascience, #deeplearning, #machinelearning, #naturallanguageprocessing, #production, #recsys, #reinforcementlearning, #search
GokuMohandas/Made-With-ML
Learn how to design, develop, deploy and iterate on production-grade ML applications.
Language:Jupyter Notebook
Total stars: 36368
Stars trend:
19 Jun 2024
1am ▌ +4
2am ▍ +3
3am ▎ +2
4am ▉ +7
5am ▉ +7
6am ▉ +7
7am █▍ +11
8am █▎ +10
9am ▋ +5
10am ▋ +5
11am █▎ +10
12pm ▊ +6
#jupyternotebook
#dataengineering, #dataquality, #datascience, #deeplearning, #distributedml, #distributedtraining, #llms, #machinelearning, #mlops, #naturallanguageprocessing, #python, #pytorch, #ray
open-metadata/OpenMetadata
OpenMetadata is a unified platform for discovery, observability, and governance powered by a central metadata repository, in-depth lineage, and seamless team collaboration.
Language:TypeScript
Total stars: 4556
Stars trend:
19 Jun 2024
11am ▌ +4
12pm ▋ +5
1pm █▍ +11
2pm █ +8
3pm █ +8
4pm ▉ +7
5pm ▉ +7
6pm ▎ +2
7pm ▋ +5
8pm ▉ +7
9pm ▋ +5
10pm ▊ +6
#typescript
#datacatalog, #datacollaboration, #datacontracts, #datadiscovery, #dataengineering, #datagovernance, #datalineage, #dataobservability, #dataprofiling, #dataquality, #dataqualitychecks, #datascience, #datavalidation, #dbt, #metadata, #metadatamanagement, #snowflake
PrefectHQ/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
Language:Python
Total stars: 16541
Stars trend:
9 Nov 2024
4am ▎ +2
5am ▏ +1
6am ▍ +3
7am ▏ +1
8am ▎ +2
9am ▏ +1
10am ▍ +3
11am ██▉ +23
12pm ███ +24
1pm ████▍ +35
#python
#automation, #data, #dataengineering, #dataops, #datascience, #infrastructure, #mlops, #observability, #orchestration, #pipeline, #prefect, #python, #workflow, #workflowengine
apache/airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Language:Python
Total stars: 37799
Stars trend:
19 Dec 2024
8pm ▉ +7
9pm ▉ +7
10pm ▋ +5
11pm ▉ +7
20 Dec 2024
12am ▏ +1
1am █▏ +9
2am ▉ +7
3am ▍ +3
4am █▎ +10
5am ▉ +7
6am ▊ +6
7am █▎ +10
#python
#airflow, #apache, #apacheairflow, #automation, #dag, #dataengineering, #dataintegration, #dataorchestrator, #datapipelines, #datascience, #elt, #etl, #machinelearning, #mlops, #orchestration, #python, #scheduler, #workflow, #workflowengine, #workfloworchestration
DataTalksClub/data-engineering-zoomcamp
Free Data Engineering course!
Language:Jupyter Notebook
Total stars: 27135
Stars trend:
13 Jan 2025
5am ▏ +3
6am +2
7am ▎ +5
8am +1
9am ▊ +15
10am ▏ +3
11am ▏ +3
12pm ▍ +7
1pm ▍ +9
2pm ▎ +5
3pm ▍ +8
4pm ██████████████████▊ +342
#jupyternotebook
#dataengineering, #dbt, #docker, #kafka, #kestra, #spark
pyper-dev/pyper
Concurrent Python made simple
Language:Python
Total stars: 136
Stars trend:
15 Jan 2025
2am █▌ +12
3am ██▊ +22
4am ██▊ +22
5am ██▋ +21
6am ██▏ +17
#python
#asyncio, #concurrency, #data, #datacollection, #dataengineering, #datapipelines, #dataprocessing, #multiprocessing, #parallelcomputing, #python, #threading
DataExpert-io/data-engineer-handbook
This is a repo with links to everything you'd ever want to learn about data engineering
Language:Jupyter Notebook
Total stars: 24911
Stars trend:
22 Jan 2025
9am ▏ +1
10am ▊ +6
11am ▉ +7
12pm ▊ +6
1pm █▏ +9
2pm █▍ +11
3pm █ +8
4pm █▏ +9
5pm █ +8
6pm ▍ +3
7pm █ +8
#jupyternotebook
#apachespark, #awesome, #bigdata, #data, #dataengineering, #sql
cocoindex-io/cocoindex
ETL framework to make your data AI-ready, with real-time incremental updates and support for custom logic that snaps together like Lego.
Language:Rust
Total stars: 672
Stars trend:
20 Apr 2025
3pm █▉ +15
4pm ▍ +3
5pm ▋ +5
6pm ▉ +7
7pm ▋ +5
8pm ▋ +5
9pm ▊ +6
10pm ▊ +6
11pm █ +8
21 Apr 2025
12am ▍ +3
1am ▉ +7
2am ▉ +7
#rust
#ai, #changedatacapture, #data, #dataengineering, #dataindexing, #datainfrastructure, #dataprocessing, #dataflow, #etl, #helpwanted, #indexing, #knowledgegraph, #llm, #pipeline, #python, #rag, #realtime, #rust, #semanticsearch, #streaming
cocoindex-io/cocoindex
Real-time data transformation framework for AI. Ultra performant, with incremental processing.
Language:Rust
Total stars: 1459
Stars trend:
20 May 2025
1am ▊ +6
2am ▍ +3
3am ▊ +6
4am █▎ +10
5am █▎ +10
6am ▋ +5
7am █▎ +10
8am ▎ +2
9am ▌ +4
10am █▎ +10
11am ▉ +7
12pm █ +8
#rust
#ai, #changedatacapture, #data, #dataengineering, #dataindexing, #datainfrastructure, #dataprocessing, #dataflow, #etl, #helpwanted, #indexing, #knowledgegraph, #llm, #pipeline, #python, #rag, #realtime, #rust, #semanticsearch, #streaming
DataExpert-io/data-engineer-handbook
This is a repo with links to everything you'd ever want to learn about data engineering
Language:Jupyter Notebook
Total stars: 28322
Stars trend:
1 Jun 2025
6pm ▏ +1
7pm ▏ +1
8pm ▏ +1
9pm +0
10pm +0
11pm ▊ +6
2 Jun 2025
12am ████ +32
1am █▋ +13
2am █▉ +15
3am █▋ +13
#jupyternotebook
#apachespark, #awesome, #bigdata, #data, #dataengineering, #sql