#python #airflow #airflow_operators #aws #aws_ec2 #aws_s3 #aws_sdk #cassandra #cassandra_database #cloudformation #cluster #data #data_engineering #data_engineering_pipeline #data_lake #data_modeling #data_warehouse #etl_pipeline #infrastructure #postgres #postgresql_database
https://github.com/san089/Udacity-Data-Engineering-Projects
Few projects related to Data Engineering including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development. - san089/Udacity-Data-Engineering-Projects
#other #applied_data_science #applied_machine_learning #data_engineering #data_science #machine_learning #nlp #papers #production #recommendation #recsys #reinforcement_learning #search
https://github.com/eugeneyan/applied-ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production. - eugeneyan/applied-ml
#python #cleandata #data_engineering #data_profilers #data_profiling #data_quality #data_science #data_unit_tests #datacleaner #datacleaning #dataquality #dataunittest #eda #exploratory_analysis #exploratory_data_analysis #exploratorydataanalysis #mlops #pipeline #pipeline_debt #pipeline_testing #pipeline_tests
https://github.com/great-expectations/great_expectations
Always know what to expect from your data. Contribute to great-expectations/great_expectations development by creating an account on GitHub.
#typescript #ab_testing #abtest #abtesting #analytics #bigquery #clickhouse #continuous_delivery #data_analysis #data_engineering #data_science #experimentation #feature_flagging #feature_flags #mixpanel #redshift #remote_config #snowflake #split_testing #statistics
https://github.com/growthbook/growthbook
Open Source Feature Flagging and A/B Testing Platform - growthbook/growthbook
#python #data_engineering #data_science #jupyter #jupyter_notebooks #machine_learning #mlops #papermill #pipelines #pycharm #vscode #workflow
https://github.com/ploomber/ploomber
The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️ - ploomber/ploomber
#scala #data_engineering #data_science #deep_learning #feature_engineering #feature_extraction #kubernetes #machine_learning #personalization #ranking #search
https://github.com/metarank/metarank
A low code Machine Learning personalized ranking service for articles, listings, search results, recommendations that boosts user engagement. A friendly Learn-to-Rank engine - metarank/metarank
#java #data #data_engineering #data_orchestration #data_orchestrator #data_pipeline #dataflow #elt #etl #kestra #orchestration #pipeline #scheduler #workflow #workflow_automation #workflow_engine
https://github.com/kestra-io/kestra
Orchestrate everything - from scripts to data, infra, AI, and business - as code, with UI and AI Copilot. Simple. Fast. Scalable. - kestra-io/kestra
#python #data_engineering #data_quality #data_quality_monitoring #data_science #database #databricks_sql #dataengineering #dataquality #mysql #oracle_database #postgres #postgresql #rdbms #snowflake #sql #trino
https://github.com/datafold/data-diff
Compare tables within or across databases. Contribute to datafold/data-diff development by creating an account on GitHub.
#other #data_analysis #data_engineering #data_science #data_visualization #deep_learning #machine_learning #mathematics #probability #python #sql #statistics
https://github.com/Moataz-Elmesmary/Data-Science-Roadmap
Data Science Roadmap from A to Z. Contribute to Moataz-Elmesmary/Data-Science-Roadmap development by creating an account on GitHub.
#javascript #csv #csv_export #csv_import #csv_parser #csv_parsing #csv_reader #data_analysis #data_engineering #flatfile #mongo #mongodb #nextjs #open_source #saas #streaming
https://github.com/yobulkdev/yobulkdev
🔥 🔥 🔥 Open Source & AI-driven Data Onboarding Platform: Free flatfile.com alternative - yobulkdev/yobulkdev
#python #big_data #data_engineering #data_quality #data_science #feature_store #features #machine_learning #ml #mlops
https://github.com/feast-dev/feast
The Open Source Feature Store for AI/ML. Contribute to feast-dev/feast development by creating an account on GitHub.
#rust #ckan #cli #csv #data_engineering #data_wrangling #datapackage #excel #geocode #luau #opendata #parquet #polars #postgresql #snappy #sql #sqlite #tsv
https://github.com/jqnatividad/qsv
Blazing-fast Data-Wrangling toolkit. Contribute to dathere/qsv development by creating an account on GitHub.
#python #data #data_engineering #data_lake #data_loading #data_warehouse #elt #extract #load #transform
https://github.com/dlt-hub/dlt
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️ - dlt-hub/dlt
#typescript #analytics #apache #apache_superset #asf #bi #business_analytics #business_intelligence #data_analysis #data_analytics #data_engineering #data_science #data_visualization #data_viz #flask #python #react #sql_editor #superset
Superset is a powerful business intelligence tool that helps you explore and visualize data easily. It offers a no-code interface for building charts, a robust SQL Editor for advanced queries, and support for nearly any SQL database or data engine. You can create beautiful visualizations, define custom dimensions and metrics quickly, and use a lightweight caching layer to reduce database load. Superset also provides extensible security roles and authentication options, an API for customization, and a cloud-native architecture designed for scale. This makes it easier to analyze and present your data in a user-friendly way, replacing or augmenting proprietary BI tools effectively.
https://github.com/apache/superset
Apache Superset is a Data Visualization and Data Exploration Platform - apache/superset
#python #analytics #dagster #data_engineering #data_integration #data_orchestrator #data_pipelines #data_science #etl #metadata #mlops #orchestration #scheduler #workflow #workflow_automation
Dagster is a tool that helps you manage and automate your data workflows. You can define your data assets, like tables or machine learning models, using Python functions. Dagster then runs these functions at the right time and keeps your data up-to-date. It offers features like integrated lineage and observability, making it easier to track and manage your data. This tool is useful for every stage of data development, from local testing to production, and it integrates well with other popular data tools. Using Dagster, you can build reusable components, spot data quality issues early, and scale your data pipelines efficiently. This makes your work more productive and helps maintain control over complex data systems.
https://github.com/dagster-io/dagster
An orchestration platform for the development, production, and observation of data assets. - dagster-io/dagster
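The idea that data assets are plain Python functions whose dependencies form a graph can be sketched with the standard library alone. This is not Dagster's API: the `asset` decorator, the `ASSETS` registry, `materialize_all`, and the sample asset names are all hypothetical, and the sketch assumes upstream assets are identified by parameter name.

```python
import inspect
from graphlib import TopologicalSorter

# Hypothetical registry mapping asset name -> function (not part of Dagster).
ASSETS = {}


def asset(fn):
    """Register a function as a data asset (illustrative decorator only)."""
    ASSETS[fn.__name__] = fn
    return fn


@asset
def raw_orders():
    # Stand-in for loading a table from a source system.
    return [{"id": 1, "amount": 20.0}, {"id": 2, "amount": 35.5}]


@asset
def order_total(raw_orders):
    # Depends on raw_orders because the parameter carries that name.
    return sum(row["amount"] for row in raw_orders)


def materialize_all():
    """Compute every registered asset after its upstream assets."""
    # Map each asset to the set of assets it depends on.
    deps = {
        name: set(inspect.signature(fn).parameters)
        for name, fn in ASSETS.items()
    }
    results = {}
    # static_order() yields each asset only after all of its dependencies.
    for name in TopologicalSorter(deps).static_order():
        kwargs = {dep: results[dep] for dep in deps[name]}
        results[name] = ASSETS[name](**kwargs)
    return results


results = materialize_all()
```

A real orchestrator adds scheduling, lineage tracking, and observability on top of this ordering step; the sketch only shows how a dependency graph can be derived from function definitions.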
#python #airflow #apache #apache_airflow #automation #dag #data_engineering #data_integration #data_orchestrator #data_pipelines #data_science #elt #etl #machine_learning #mlops #orchestration #scheduler #workflow #workflow_engine #workflow_orchestration
Apache Airflow is a tool that helps you manage and automate workflows. You write your workflows as code, which makes them easier to maintain, version, test, and collaborate on. Airflow lets you schedule tasks and monitor their progress through a user-friendly interface. It supports dynamic pipeline generation and is extensible and scalable: you can define your own operators and executors.
Using Airflow benefits you by making your workflows more organized, efficient, and reliable. It simplifies the process of managing complex tasks and provides clear visualizations of your workflow's performance, helping you identify and troubleshoot issues quickly. This makes it easier to manage data processing and other automated tasks effectively.
https://github.com/apache/airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows - apache/airflow
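To make "workflows as code" concrete, here is a toy, standard-library-only sketch of tasks wired together with a `>>` dependency operator, in the spirit of how Airflow DAG files chain tasks. It is not Airflow itself: the `Task` class, the `run` function, and the task names are hypothetical, and the sketch has no scheduling, retries, or cycle detection.

```python
class Task:
    """A named unit of work with a list of upstream tasks (illustrative only)."""

    def __init__(self, task_id, fn):
        self.task_id = task_id
        self.fn = fn
        self.upstream = []

    def __rshift__(self, other):
        # `a >> b` records that b runs after a, and returns b so chains work.
        other.upstream.append(self)
        return other


def run(tasks):
    """Run each task once, always after its upstream tasks (depth-first)."""
    done, order = set(), []

    def visit(task):
        if task.task_id in done:
            return
        for up in task.upstream:
            visit(up)
        task.fn()
        done.add(task.task_id)
        order.append(task.task_id)

    for task in tasks:
        visit(task)
    return order


log = []
extract = Task("extract", lambda: log.append("extract"))
transform = Task("transform", lambda: log.append("transform"))
load = Task("load", lambda: log.append("load"))

# Declare the pipeline shape in code: extract, then transform, then load.
extract >> transform >> load

# The list order passed in does not matter; dependencies decide execution order.
order = run([load, extract, transform])
```

Because the pipeline is ordinary code, it can be versioned, reviewed, and unit-tested like any other module, which is the benefit the description above points to.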