Code Stars
1.93K subscribers
9.32K photos
9.61K links
Code Stars alerts you to GitHub repos gaining stars rapidly. Stay ahead of the curve and discover trending projects before they go viral! #AI #GitHub #OpenSource #Tech #MachineLearning #Python #Programming #Java #Javascript #React #Docker #Devops
Unstructured-IO/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Language: HTML
Total stars: 6025
Stars trend:
17 Apr 2024
5pm ▎ +2
6pm ▌ +4
7pm ▍ +3
8pm ▋ +5
9pm ▊ +6
10pm ▋ +5
11pm ▋ +5
18 Apr 2024
12am ▉ +7
1am █▏ +9
2am █▋ +13
3am █▎ +10
4am ██▏ +17

#html
#datapipelines, #deeplearning, #documentimageanalysis, #documentimageprocessing, #documentparser, #documentparsing, #docx, #donut, #informationretrieval, #langchain, #llm, #machinelearning, #ml, #naturallanguageprocessing, #nlp, #ocr, #pdf, #pdftojson, #pdftotext, #preprocessing
DS4SD/docling
Get your documents ready for gen AI
Language: Python
Total stars: 18111
Stars trend:
13 Jan 2025
11pm ▌ +4
14 Jan 2025
12am ▏ +1
1am ▍ +3
2am ▎ +2
3am ▋ +5
4am ▊ +6
5am ██▌ +20
6am █▋ +13
7am ▉ +7
8am █▏ +9
9am ▉ +7
10am ▊ +6

#python
#ai, #convert, #documentparser, #documentparsing, #documents, #docx, #html, #markdown, #pdf, #pdfconverter, #pdftojson, #pdftotext, #pptx, #tables, #xlsx
docling-project/docling
Get your documents ready for gen AI
Language: Python
Total stars: 26148
Stars trend:
6 Apr 2025
11am ▏ +1
12pm +0
1pm ▍ +3
2pm ▏ +1
3pm ▌ +4
4pm ▌ +4
5pm ██▏ +17
6pm █ +8
7pm ▊ +6
8pm █▏ +9
9pm █▌ +12
10pm ██ +16

#python
#ai, #convert, #documentparser, #documentparsing, #documents, #docx, #html, #markdown, #pdf, #pdfconverter, #pdftojson, #pdftotext, #pptx, #tables, #xlsx
docling-project/docling
Get your documents ready for gen AI
Language: Python
Total stars: 33012
Stars trend:
29 Jun 2025
6am ▎ +2
7am ▉ +7
8am ▎ +2
9am ▌ +4
10am █ +8
11am █▎ +10
12pm █ +8
1pm █▋ +13
2pm █▎ +10
3pm █▊ +14
4pm █▋ +13
5pm ▌ +4

#python
#ai, #convert, #documentparser, #documentparsing, #documents, #docx, #html, #markdown, #pdf, #pdfconverter, #pdftojson, #pdftotext, #pptx, #tables, #xlsx
PaddlePaddle/PaddleOCR
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 80+ languages.
Language: Python
Total stars: 54086
Stars trend:
16 Sep 2025
8am ▎ +2
9am ▏ +1
10am ▋ +5
11am █▍ +11
12pm █▏ +9
1pm █▌ +12
2pm ██▍ +19
3pm █ +8
4pm █▏ +9
5pm ▌ +4
6pm █▊ +14
7pm ▌ +4

#python
#ai4science, #chineseocr, #documentparsing, #documenttranslation, #kie, #ocr, #pdfextractorrag, #pdfparser, #pdf2markdown, #ppocr, #ppstructure, #rag
opendataloader-project/opendataloader-pdf
Safe, Open, High-Performance — PDF for AI
Language: Java
Total stars: 290
Stars trend:
23 Sep 2025
3pm ██▌ +20
4pm ██ +16
5pm █▋ +13
6pm █▉ +15
7pm █▍ +11
8pm █▋ +13
9pm ▉ +7

#java
#ai, #dataloader, #documentparser, #documentparsing, #documents, #html, #json, #markdown, #ocrrecognition, #pdf, #pdfconverter, #pdftohtml, #pdftojson, #pdftomarkdown, #recognition, #sdk, #tables