GitHub Trends
10.1K subscribers
15.3K links
See what the GitHub community is most excited about today.

A bot automatically fetches new repositories from https://github.com/trending and sends them to the channel.

Author and maintainer: https://github.com/katursis
#typescript #analytics #apache #apache_superset #asf #bi #business_analytics #business_intelligence #data_analysis #data_analytics #data_engineering #data_science #data_visualization #data_viz #flask #python #react #sql_editor #superset

Superset is a powerful business intelligence tool that helps you explore and visualize data easily. It offers a no-code interface for building charts, a robust SQL Editor for advanced queries, and support for nearly any SQL database or data engine. You can create beautiful visualizations, define custom dimensions and metrics quickly, and use a lightweight caching layer to reduce database load. Superset also provides extensible security roles and authentication options, an API for customization, and a cloud-native architecture designed for scale. This makes it easier to analyze and present your data in a user-friendly way, replacing or augmenting proprietary BI tools effectively.

https://github.com/apache/superset
#python #data_analysis #data_science #data_visualization #deep_learning #deploy #gradio #gradio_interface #hacktoberfest #interface #machine_learning #models #python #python_notebook #ui #ui_components

Gradio is a Python package that helps you quickly build and share web demos for your machine learning models or any Python function. You don't need to know JavaScript, CSS, or web hosting to use it. With just a few lines of Python code, you can create a demo and share it via a public link. Gradio offers various tools like the `Interface` class for simple demos, `ChatInterface` for chatbots, and `Blocks` for more complex custom applications. It also allows easy sharing of your demos with others by generating a public URL in seconds. This makes it easy to showcase your work without technical hassle.

https://github.com/gradio-app/gradio
#jupyter_notebook #data_analysis #data_science #data_visualization #pandas #python

This curriculum is designed to help beginners learn data science over 10 weeks with 20 detailed lessons. Each lesson includes pre- and post-lesson quizzes, step-by-step guides, knowledge checks, and assignments to ensure you retain the information. You'll learn about data ethics, statistics, working with different types of data, data visualization, and the entire data science lifecycle. The project-based approach helps you build practical skills while learning. Additionally, there are resources for students and teachers to make the learning process flexible and engaging. This curriculum is beneficial because it provides a structured and interactive way to gain hands-on experience in data science, making it easier to understand and apply these skills in real-world scenarios.

https://github.com/microsoft/Data-Science-For-Beginners
#python #analytics #dagster #data_engineering #data_integration #data_orchestrator #data_pipelines #data_science #etl #metadata #mlops #orchestration #python #scheduler #workflow #workflow_automation

Dagster is a tool that helps you manage and automate your data workflows. You can define your data assets, like tables or machine learning models, using Python functions. Dagster then runs these functions at the right time and keeps your data up-to-date. It offers features like integrated lineage and observability, making it easier to track and manage your data. This tool is useful for every stage of data development, from local testing to production, and it integrates well with other popular data tools. Using Dagster, you can build reusable components, spot data quality issues early, and scale your data pipelines efficiently. This makes your work more productive and helps maintain control over complex data systems.

https://github.com/dagster-io/dagster
#jupyter_notebook #aws #data_science #deep_learning #examples #inference #jupyter_notebook #machine_learning #mlops #reinforcement_learning #sagemaker #training

SageMaker-Core is a new Python SDK for Amazon SageMaker that makes it easier to work with machine learning resources. It provides an object-oriented interface, which means you can manage resources like training jobs, models, and endpoints more intuitively. The SDK simplifies code by allowing resource chaining, eliminating the need to manually specify parameters. It also includes features like auto code completion, comprehensive documentation, and type hints, making it faster and less error-prone to write code. This helps developers customize their ML workloads more efficiently and streamline their development process.

https://github.com/aws/amazon-sagemaker-examples
#python #airflow #apache #apache_airflow #automation #dag #data_engineering #data_integration #data_orchestrator #data_pipelines #data_science #elt #etl #machine_learning #mlops #orchestration #python #scheduler #workflow #workflow_engine #workflow_orchestration

Apache Airflow is a tool that helps you manage and automate workflows. You can write your workflows as code, making them easier to maintain, version, test, and collaborate on. Airflow lets you schedule tasks and monitor their progress through a user-friendly interface. It supports dynamic pipeline generation and is highly extensible and scalable, allowing you to define your own operators and executors.

Using Airflow benefits you by making your workflows more organized, efficient, and reliable. It simplifies the process of managing complex tasks and provides clear visualizations of your workflow's performance, helping you identify and troubleshoot issues quickly. This makes it easier to manage data processing and other automated tasks effectively.

https://github.com/apache/airflow
#python #autogluon #automated_machine_learning #automl #computer_vision #data_science #deep_learning #ensemble_learning #forecasting #gluon #hyperparameter_optimization #machine_learning #natural_language_processing #object_detection #python #pytorch #scikit_learn #structured_data #tabular_data #time_series #transfer_learning

AutoGluon makes machine learning easy and fast. With just a few lines of code, you can train and use high-accuracy models for images, text, time series, and tabular data. This means you can quickly build and deploy powerful machine learning models without needing to write a lot of code. It supports Python 3.8 to 3.11 and works on Linux, macOS, and Windows, making it convenient for various users. This saves time and effort, allowing you to focus on other parts of your project.

https://github.com/autogluon/autogluon
#python #artificial_intelligence #dag #data_science #data_visualization #dataflow #developer_tools #machine_learning #notebooks #pipeline #python #reactive #web_app

Marimo is a powerful tool for Python users that makes working with notebooks much easier and more efficient. Here's what it offers:
- **Reactive**: When you run a cell or interact with UI elements, marimo automatically updates dependent cells, keeping your code and outputs consistent.
- **Reproducible**: Marimo ensures no hidden state and deterministic execution, making your work reliable.
- **Executable**: Notebooks are stored as pure `.py` files, making version control easy.
- **Modern Editor**: It includes features like GitHub Copilot, AI assistants, and more quality-of-life tools.

Using marimo helps you avoid errors, keeps your code organized, and makes sharing and deploying your work simpler.

https://github.com/marimo-team/marimo
#python #automation #data #data_engineering #data_ops #data_science #infrastructure #ml_ops #observability #orchestration #pipeline #prefect #python #workflow #workflow_engine

Prefect is a tool that helps you automate and manage data workflows in Python. It makes it easy to turn your scripts into reliable and flexible workflows that can handle unexpected changes. With Prefect, you can schedule tasks, retry failed operations, and monitor your workflows. You can install it using `pip install -U prefect` and start creating workflows with just a few lines of code. This helps data teams work more efficiently, reduce errors, and save time. You can also use Prefect Cloud for more advanced features and support.
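A minimal sketch of turning a script into a Prefect flow with retries (the task and flow names are illustrative; assumes `prefect` 2.x+):

```python
from prefect import flow, task

@task(retries=2)  # failed runs of this task are retried automatically
def fetch_numbers() -> list:
    return [1, 2, 3]

@task
def total(values) -> int:
    return sum(values)

@flow
def pipeline() -> int:
    # Calling tasks inside a flow records their runs for monitoring.
    return total(fetch_numbers())

if __name__ == "__main__":
    print(pipeline())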

https://github.com/PrefectHQ/prefect
#other #ai #data_science #devops #engineering #federated_learning #machine_learning #ml #mlops #software_engineering

This resource is a comprehensive guide to Machine Learning Operations (MLOps), providing a wide range of tools, articles, courses, and communities to help you manage and deploy machine learning models effectively.

**Key Benefits**:
- **Learning Materials**: Access to numerous books, articles, courses, and talks on MLOps, machine learning, and data science.
- **Practical Guides**: Detailed guides on workflow management, feature stores, model deployment, testing, monitoring, and maintenance.
- **Responsible AI**: Resources on model governance, ethics, and responsible AI practices.

Using these resources, you can improve your skills in designing, training, and running machine learning models efficiently, ensuring they are reliable, scalable, and maintainable in production environments.

https://github.com/visenger/awesome-mlops
#python #cleandata #data_engineering #data_profilers #data_profiling #data_quality #data_science #data_unit_tests #datacleaner #datacleaning #dataquality #dataunittest #eda #exploratory_analysis #exploratory_data_analysis #exploratorydataanalysis #mlops #pipeline #pipeline_debt #pipeline_testing #pipeline_tests

GX Core is a powerful tool for ensuring data quality. It allows you to write simple tests, called "Expectations," to check if your data meets certain standards. This helps teams work together more effectively and keeps everyone informed about the data's quality. You can automatically generate reports, making it easy to share results and preserve your organization's knowledge about its data. To get started, you just need to install GX Core in a Python virtual environment and follow some simple steps. This makes managing data quality much simpler and more efficient.

https://github.com/great-expectations/great_expectations
#python #ai #csv #data #data_analysis #data_science #data_visualization #database #datalake #gpt_4 #llm #pandas #sql #text_to_sql

PandaAI is a tool that lets you ask questions about your data using natural language. It's helpful for both non-technical and technical users. Non-technical users can interact with data more easily, while technical users can save time and effort. You can load your data, save it as a dataframe, and then ask questions like "Which are the top 5 countries by sales?" or "What is the total sales for the top 3 countries?" PandaAI also allows you to visualize charts and work with multiple datasets. It's easy to install using pip or poetry and can be used in Jupyter notebooks, Streamlit apps, or even a secure Docker sandbox. This makes it simpler and more efficient to analyze your data.

https://github.com/sinaptik-ai/pandas-ai
#python #ai #artificial_intelligence #cython #data_science #deep_learning #entity_linking #machine_learning #named_entity_recognition #natural_language_processing #neural_network #neural_networks #nlp #nlp_library #python #spacy #text_classification #tokenization

spaCy is a powerful tool for understanding and processing human language. It helps computers analyze text by breaking it into parts like words, sentences, and entities (like names or places). This makes it useful for tasks such as identifying who is doing what in a sentence or finding specific information from large texts. Using spaCy can save time and improve accuracy compared to manual analysis. It supports many languages and integrates well with advanced models like BERT, making it ideal for real-world applications.
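A small sketch of the text-to-parts breakdown described above (uses a blank English pipeline so no model download is needed; the sample sentence is made up):

```python
import spacy

# A blank pipeline ships with the tokenizer only; trained pipelines such as
# "en_core_web_sm" would add entities, part-of-speech tags, and parsing.
nlp = spacy.blank("en")
doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")

# Each token is one word or punctuation unit identified by spaCy.
tokens = [token.text for token in doc]
print(tokens)
```

With a trained pipeline, the same `doc` object also exposes `doc.ents` for the named entities mentioned in the summary.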

https://github.com/explosion/spaCy
#python #agent #ai #automation #data_mining #data_science #development #llm #research

RD-Agent is a tool that helps automate research and development (R&D) tasks. It can read reports, propose new ideas, and implement them using data. This tool acts like a copilot for researchers, automating repetitive tasks or working independently to suggest better solutions. RD-Agent supports various scenarios, such as finance and medical fields, making it easier to streamline model development and data analysis. By using RD-Agent, users can save time and boost productivity in their R&D work.

https://github.com/microsoft/RD-Agent
#other #automl #chatgpt #data_analysis #data_science #data_visualization #data_visualizations #deep_learning #gpt #gpt_3 #jax #keras #machine_learning #ml #nlp #python #pytorch #scikit_learn #tensorflow #transformer

This is a comprehensive, regularly updated list of 920 top open-source Python machine learning libraries, organized into 34 categories like frameworks, data visualization, NLP, image processing, and more. Each project is ranked by quality using GitHub and package manager metrics, helping you find the best tools for your needs. Popular libraries like TensorFlow, PyTorch, scikit-learn, and Hugging Face transformers are included, along with specialized ones for time series, reinforcement learning, and model interpretability. This resource saves you time by guiding you to high-quality, actively maintained libraries for building, optimizing, and deploying machine learning models efficiently.

https://github.com/ml-tooling/best-of-ml-python
#python #data_mining #data_science #deep_learning #deep_reinforcement_learning #genetic_algorithm #machine_learning #machine_learning_from_scratch

This project offers Python code for many basic machine learning models and algorithms built from scratch, focusing on clear, understandable implementations rather than speed or optimization. You can learn how these algorithms work inside by running examples like polynomial regression, convolutional neural networks, clustering, and genetic algorithms. This hands-on approach helps you deeply understand machine learning concepts and build your own custom models. Using Python makes it easier because of its simple, readable code and flexibility, letting you quickly test and modify algorithms. This can improve your skills and confidence in machine learning development.
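In the same from-scratch spirit, here is what a bare-numpy implementation might look like for linear regression trained by gradient descent (an illustrative example, not code from the repository):

```python
import numpy as np

def fit_linear_regression(X, y, lr=0.05, epochs=500):
    """Fit y ≈ Xw + b by batch gradient descent on mean squared error."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        y_pred = X @ w + b
        error = y_pred - y
        # Gradients of MSE with respect to the weights and the bias.
        w -= lr * (2 / n_samples) * (X.T @ error)
        b -= lr * (2 / n_samples) * error.sum()
    return w, b

# Toy data generated from y = 2x + 1; the fit should recover w≈2, b≈1.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
w, b = fit_linear_regression(X, y)
print(w, b)
```

Writing the update rule out by hand like this, instead of calling a library fit method, is exactly the kind of transparency the repository aims for.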

https://github.com/eriklindernoren/ML-From-Scratch
#html #data_science #education #machine_learning #machine_learning_algorithms #machinelearning #machinelearning_python #microsoft_for_beginners #ml #python #r #scikit_learn #scikit_learn_python

Microsoft's "Machine Learning for Beginners" is a free, 12-week course with 26 lessons designed to teach classic machine learning using Python and Scikit-learn. It includes quizzes, projects, and assignments to help you learn by doing, with lessons themed around global cultures to keep it engaging. You can access solutions, videos, and even R language versions. The course is beginner-friendly, flexible, and helps build practical skills step-by-step, making it easier to understand and apply machine learning concepts in real-world scenarios. This structured approach boosts your learning retention and prepares you for further study or career growth in ML.

https://github.com/microsoft/ML-For-Beginners
#other #artificial_intelligence #artificial_intelligence_projects #awesome #computer_vision #computer_vision_project #data_science #deep_learning #deep_learning_project #machine_learning #machine_learning_projects #nlp #nlp_projects #python

You can access a huge, constantly updated list of over 500 artificial intelligence projects with ready-to-use code covering machine learning, deep learning, computer vision, and natural language processing. This collection includes projects for beginners and advanced users, with links to tutorials, datasets, and real-world applications like chatbots, healthcare, and time series forecasting. Using this resource helps you learn AI by doing practical projects, speeding up your coding skills, and building a strong portfolio for jobs or research. It saves you time searching for quality projects and gives you tested, working code to study and modify.

https://github.com/ashishpatel26/500-AI-Machine-learning-Deep-learning-Computer-vision-NLP-Projects-with-code