deepdoctection/deepdoctection
A Repo For Document AI
Language: Python
Total stars: 460
Stars trend:
26 Apr 2023
9pm █▉ +15
10pm ███████ +56
11pm ███████▏ +57
27 Apr 2023
12am ██████▎ +50
#python
#documentai, #documentimageanalysis, #documentlayoutanalysis, #documentparser, #documentunderstanding, #layoutlm, #nlp, #ocr, #publaynet, #pubtabnet, #python, #pytorch, #tabledetection, #tablerecognition, #tensorflow
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Language: Python
Total stars: 87
Stars trend:
1 Apr 2024
5am ▌ +4
6am █▍ +11
7am ██▊ +22
8am ██▍ +19
9am █ +8
10am ▎ +2
11am ▏ +1
12pm █▍ +11
#python
#datapipelines, #deeplearning, #documentparser, #documentunderstanding, #informationretrieval, #llm, #llmops, #machinelearning, #nlp, #ocr, #orchestration, #pdftotext, #preprocessing, #rag, #retrievalaugmentedgeneration, #tablestructurerecognition
Filimoa/open-parse
PDF Layout Chunking for LLMs
Language: Python
Total stars: 184
Stars trend:
8 Apr 2024
6am ▏ +1
7am ████▎ +34
8am ████▋ +37
9am ███▋ +29
#python
#documentparser, #documentstructure, #documentstructureanalysis, #tabledetection, #tabledetectionusingdeeplearning
Unstructured-IO/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Language: HTML
Total stars: 6025
Stars trend:
17 Apr 2024
5pm ▎ +2
6pm ▌ +4
7pm ▍ +3
8pm ▋ +5
9pm ▊ +6
10pm ▋ +5
11pm ▋ +5
18 Apr 2024
12am ▉ +7
1am █▏ +9
2am █▋ +13
3am █▎ +10
4am ██▏ +17
#html
#datapipelines, #deeplearning, #documentimageanalysis, #documentimageprocessing, #documentparser, #documentparsing, #docx, #donut, #informationretrieval, #langchain, #llm, #machinelearning, #ml, #naturallanguageprocessing, #nlp, #ocr, #pdf, #pdftojson, #pdftotext, #preprocessing
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Language: Python
Total stars: 28162
Stars trend:
13 Jan 2025
11pm ▏ +1
14 Jan 2025
12am ▍ +3
1am █▎ +10
2am █▍ +11
3am █▍ +11
4am ▊ +6
5am ▋ +5
6am █▏ +9
7am █▌ +12
8am █▊ +14
9am ▊ +6
#python
#agent, #agents, #aisearch, #chatbot, #chatgpt, #datapipelines, #deeplearning, #documentparser, #documentunderstanding, #genai, #graph, #graphrag, #llm, #nlp, #pdftotext, #preprocessing, #rag, #retrievalaugmentedgeneration, #tablestructurerecognition, #text2sql
DS4SD/docling
Get your documents ready for gen AI
Language: Python
Total stars: 18111
Stars trend:
13 Jan 2025
11pm ▌ +4
14 Jan 2025
12am ▏ +1
1am ▍ +3
2am ▎ +2
3am ▋ +5
4am ▊ +6
5am ██▌ +20
6am █▋ +13
7am ▉ +7
8am █▏ +9
9am ▉ +7
10am ▊ +6
#python
#ai, #convert, #documentparser, #documentparsing, #documents, #docx, #html, #markdown, #pdf, #pdfconverter, #pdftojson, #pdftotext, #pptx, #tables, #xlsx
docling-project/docling
Get your documents ready for gen AI
Language: Python
Total stars: 26148
Stars trend:
6 Apr 2025
11am ▏ +1
12pm +0
1pm ▍ +3
2pm ▏ +1
3pm ▌ +4
4pm ▌ +4
5pm ██▏ +17
6pm █ +8
7pm ▊ +6
8pm █▏ +9
9pm █▌ +12
10pm ██ +16
#python
#ai, #convert, #documentparser, #documentparsing, #documents, #docx, #html, #markdown, #pdf, #pdfconverter, #pdftojson, #pdftotext, #pptx, #tables, #xlsx
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Language: TypeScript
Total stars: 52811
Stars trend:
18 May 2025
10pm ▏ +1
11pm ▏ +1
19 May 2025
12am ▋ +5
1am █▍ +11
2am █▊ +14
3am █ +8
4am ▍ +3
5am ▊ +6
6am █▉ +15
7am ▉ +7
8am ██▏ +17
9am █ +8
#typescript
#agent, #agents, #aisearch, #chatbot, #chatgpt, #deeplearning, #deepseek, #deepseekr1, #documentparser, #documentunderstanding, #graphrag, #llm, #nlp, #ollama, #pdftotext, #rag, #retrievalaugmentedgeneration, #tablestructurerecognition, #text2sql