Code Stars
1.88K subscribers
8.49K photos
8.78K links
Code Stars provides notifications about GitHub repositories that are gaining a significant number of stars in a short period of time. Be the first to find out about trending repositories that everybody will be talking about soon.
#AI #chatGPT #python
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Language:Python
Total stars: 87
Stars trend:
1 Apr 2024
5am ▌ +4
6am █▍ +11
7am ██▊ +22
8am ██▍ +19
9am █ +8
10am ▎ +2
11am ▏ +1
12pm █▍ +11

#python
#datapipelines, #deeplearning, #documentparser, #documentunderstanding, #informationretrieval, #llm, #llmops, #machinelearning, #nlp, #ocr, #orchestration, #pdftotext, #preprocessing, #rag, #retrievalaugmentedgeneration, #tablestructurerecognition
Unstructured-IO/unstructured
Open-source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Language:HTML
Total stars: 6025
Stars trend:
17 Apr 2024
5pm ▎ +2
6pm ▌ +4
7pm ▍ +3
8pm ▋ +5
9pm ▊ +6
10pm ▋ +5
11pm ▋ +5
18 Apr 2024
12am ▉ +7
1am █▏ +9
2am █▋ +13
3am █▎ +10
4am ██▏ +17

#html
#datapipelines, #deeplearning, #documentimageanalysis, #documentimageprocessing, #documentparser, #documentparsing, #docx, #donut, #informationretrieval, #langchain, #llm, #machinelearning, #ml, #naturallanguageprocessing, #nlp, #ocr, #pdf, #pdftojson, #pdftotext, #preprocessing
pathwaycom/pathway
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Language:Python
Total stars: 2545
Stars trend:
10 Jun 2024
2am ▏ +1
3am +0
4am ▋ +5
5am █▊ +14
6am ▌ +4
7am ▎ +2
8am +0
9am +0
10am ▎ +2
11am ███▍ +27
12pm █▉ +15
1pm ██ +16

#python
#batchprocessing, #dataanalytics, #datapipelines, #dataprocessing, #dataflow, #etl, #etlframework, #iotanalytics, #kafka, #machinelearningalgorithms, #pathway, #python, #realtime, #rust, #streamprocessing, #streaming, #timeseriesanalysis
amphi-ai/amphi-etl
Low-code ETL for structured and unstructured data. Generates Python code you can deploy anywhere.
Language:TypeScript
Total stars: 131
Stars trend:
18 Jun 2024
6pm ▏ +1
7pm +0
8pm +0
9pm +0
10pm +0
11pm +0
19 Jun 2024
12am ▏ +1
1am █▋ +13
2am ███ +24
3am █▊ +14
4am ██▏ +17
5am █▊ +14

#typescript
#data, #datapipelines, #etl, #ragpipeline, #structureddata, #unstructureddata
fmind/mlops-python-package
Kickstart your MLOps initiative with a flexible, robust, and productive Python package.
Language:Jupyter Notebook
Total stars: 483
Stars trend:
4 Jul 2024
6am ▉ +7
7am ████▎ +34
8am ██▎ +18
9am █▎ +10
10am █ +8
11am ▋ +5
12pm ▌ +4
1pm █ +8
2pm ▋ +5

#jupyternotebook
#automation, #datapipelines, #datascience, #machinelearning, #mlflow, #mlops, #pandera, #pydantic, #python
infinyon/fluvio
Lean and mean distributed stream processing system written in Rust and WebAssembly. Alternative to Kafka + Flink in one.
Language:Rust
Total stars: 3033
Stars trend:
2 Sep 2024
4am ▏ +1
5am ▌ +4
6am ▏ +1
7am ▎ +2
8am +0
9am +0
10am ▏ +1
11am █▎ +10
12pm ▌ +4
1pm ███▎ +26
2pm ██▍ +19
3pm ███▏ +25

#rust
#cloudnative, #dataflow, #dataintegration, #datapipelines, #distributedsystems, #eventdrivenarchitecture, #realtime, #rust, #serverless, #stateful, #streamprocessing, #streamprocessingengine, #streaming, #streamingdata, #streamingdatapipelines, #streamingdataprocessing, #webassembly
ucbepic/docetl
A system for complex LLM-powered document processing
Language:Python
Total stars: 482
Stars trend:
28 Sep 2024
7am ▌ +4
8am █▏ +9
9am ▋ +5
10am ▎ +2
11am █▎ +10
12pm ▌ +4
1pm █▏ +9
2pm ▊ +6
3pm █▎ +10
4pm █▏ +9
5pm ▌ +4
6pm ▍ +3

#python
#data, #datapipelines, #elt, #etl, #llm, #python, #workflow
bruin-data/bruin
Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.
Language:Python
Total stars: 201
Stars trend:
17 Dec 2024
10am ▍ +3
11am ▊ +6
12pm ▍ +3
1pm ▌ +4
2pm ▍ +3
3pm █▎ +10
4pm ▊ +6
5pm █▏ +9
6pm █▌ +12
7pm ███ +24

#python
#analytics, #bigquery, #dataanalysis, #datamodeling, #datapipelines, #datatransformation, #python, #snowflake, #sql
apache/airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Language:Python
Total stars: 37799
Stars trend:
19 Dec 2024
8pm ▉ +7
9pm ▉ +7
10pm ▋ +5
11pm ▉ +7
20 Dec 2024
12am ▏ +1
1am █▏ +9
2am ▉ +7
3am ▍ +3
4am █▎ +10
5am ▉ +7
6am ▊ +6
7am █▎ +10

#python
#airflow, #apache, #apacheairflow, #automation, #dag, #dataengineering, #dataintegration, #dataorchestrator, #datapipelines, #datascience, #elt, #etl, #machinelearning, #mlops, #orchestration, #python, #scheduler, #workflow, #workflowengine, #workfloworchestration
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Language:Python
Total stars: 28162
Stars trend:
13 Jan 2025
11pm ▏ +1
14 Jan 2025
12am ▍ +3
1am █▎ +10
2am █▍ +11
3am █▍ +11
4am ▊ +6
5am ▋ +5
6am █▏ +9
7am █▌ +12
8am █▊ +14
9am ▊ +6

#python
#agent, #agents, #aisearch, #chatbot, #chatgpt, #datapipelines, #deeplearning, #documentparser, #documentunderstanding, #genai, #graph, #graphrag, #llm, #nlp, #pdftotext, #preprocessing, #rag, #retrievalaugmentedgeneration, #tablestructurerecognition, #text2sql
pyper-dev/pyper
Concurrent Python made simple
Language:Python
Total stars: 136
Stars trend:
15 Jan 2025
2am █▌ +12
3am ██▊ +22
4am ██▊ +22
5am ██▋ +21
6am ██▏ +17

#python
#asyncio, #concurrency, #data, #datacollection, #dataengineering, #datapipelines, #dataprocessing, #multiprocessing, #parallelcomputing, #python, #threading
feldera/feldera
The Feldera Incremental Computation Engine
Language:Rust
Total stars: 962
Stars trend:
16 Jan 2025
5pm ▊ +6
6pm █▍ +11
7pm █▉ +15
8pm ▉ +7
9pm ▍ +3
10pm ▊ +6
11pm ▍ +3
17 Jan 2025
12am ▍ +3
1am ▍ +3
2am ▉ +7
3am ▊ +6
4am ▉ +7

#rust
#dataanalytics, #datapipelines, #database, #incrementalcomputation, #incrementalviewmaintenance, #ivm, #materializedviews, #realtime, #rust, #sql, #streaming
pathwaycom/pathway
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Language:Python
Total stars: 12635
Stars trend:
25 Jan 2025
12am ▏ +1
1am ▎ +2
2am +0
3am +0
4am +0
5am ▏ +1
6am +0
7am █████ +40
8am ████▎ +34

#python
#batchprocessing, #dataanalytics, #datapipelines, #dataprocessing, #dataflow, #etl, #etlframework, #iotanalytics, #kafka, #machinelearningalgorithms, #pathway, #python, #realtime, #rust, #streamprocessing, #streaming, #timeseriesanalysis
yobix-ai/extractous
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
Language:Rust
Total stars: 777
Stars trend:
29 Jan 2025
10pm █▏ +9
11pm ▌ +4
30 Jan 2025
12am █▎ +10
1am ▋ +5
2am █▏ +9
3am ▊ +6
4am ▉ +7
5am █ +8
6am ▉ +7
7am █ +8
8am ▋ +5
9am █ +8

#rust
#datapipelines, #docx, #etl, #etlpipelines, #extraction, #llm, #machinelearning, #naturallanguageprocessing, #nlp, #ocr, #pdf, #pdfparser, #rag, #rust, #tika, #unstructured, #unstructureddata
fmind/mlops-python-package
Kickstart your MLOps initiative with a flexible, robust, and productive Python package.
Language:Jupyter Notebook
Total stars: 1036
Stars trend:
1 Feb 2025
9am ▎ +2
10am ▍ +3
11am ▎ +2
12pm ▍ +3
1pm █▌ +12
2pm █▎ +10
3pm █▍ +11
4pm █ +8
5pm █▎ +10
6pm ▋ +5
7pm ▊ +6
8pm ▌ +4

#jupyternotebook
#automation, #datapipelines, #datascience, #machinelearning, #mlflow, #mlops, #pandera, #pydantic, #python
infinyon/fluvio
Lean and mean distributed stream processing system written in Rust and WebAssembly. Alternative to Kafka + Flink in one.
Language:Rust
Total stars: 4721
Stars trend:
24 Apr 2025
4pm ▌ +4
5pm ████████▎ +66
6pm █▋ +13
7pm ▎ +2
8pm ▏ +1

#rust
#cloudnative, #dataanalytics, #dataflow, #dataintegration, #datapipelines, #distributedsystems, #eventdrivenarchitecture, #realtime, #rust, #serverless, #stateful, #streamprocessing, #streamprocessingengine, #streaming, #streaminganalytics, #streamingdata, #streamingdatapipelines, #streamingdataprocessing, #webassembly
pathwaycom/pathway
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Language:Python
Total stars: 24782
Stars trend:
17 May 2025
12am ▏ +1
1am +0
2am +0
3am +0
4am +0
5am ▌ +4
6am ▎ +2
7am █▍ +11
8am █████▌ +44
9am ███▋ +29
10am ██▊ +22

#python
#batchprocessing, #dataanalytics, #datapipelines, #dataprocessing, #dataflow, #etl, #etlframework, #iotanalytics, #kafka, #machinelearningalgorithms, #pathway, #python, #realtime, #rust, #streamprocessing, #streaming, #timeseriesanalysis