ogkalu2/comic-translate
Desktop app for automatically translating comics - BDs, Manga, Manhwa, Fumetti and more in a variety of formats (image, PDF, EPUB, CBR, CBZ, etc.) and in multiple languages.
Language:Python
Total stars: 500
Stars trend:
7 Jul 2024
8pm ▏ +1
9pm +0
10pm ▏ +1
11pm ▎ +2
8 Jul 2024
12am ▎ +2
1am ██▋ +21
2am ██ +16
3am ▉ +7
4am ▊ +6
5am ▉ +7
6am █▊ +14
7am █▎ +10
#python
#anime, #comics, #computervision, #dearpygui, #deeplearning, #gui, #inpainting, #machinetranslation, #manga, #manhua, #manhwa, #neuralnetwork, #ocr, #python, #pytorch, #segmentation, #textdetection, #textsegmentation, #translation, #webtoons
Dicklesworthstone/llm_aided_ocr
Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.
Language:Python
Total stars: 1576
Stars trend:
9 Aug 2024
1pm ▏ +1
2pm +0
3pm +0
4pm ▍ +3
5pm ██████▊ +54
6pm ███████▏ +57
7pm ██████ +48
8pm ████▍ +35
9pm ████▊ +38
#python
#aiassist, #llama2, #llm, #ocr, #ocrcorrection, #tesseract
xushengfeng/eSearch
Screenshot, offline OCR, search and translate, search by image, pin images to the screen, screen recorder, omnidirectional scrolling screenshot, screen translator.
Language:TypeScript
Total stars: 4089
Stars trend:
13 Oct 2024
2pm █▌ +12
3pm █▎ +10
4pm █ +8
5pm ▌ +4
6pm ▎ +2
7pm ▍ +3
8pm ▏ +1
9pm ▌ +4
10pm ▍ +3
11pm ▊ +6
14 Oct 2024
12am █▊ +14
1am ██▉ +23
#typescript
#clipboard, #colorpicker, #crossplatform, #electron, #imageediting, #imageeditor, #livetext, #ocr, #paddleocr, #screencapture, #screenrecorder, #screenshot, #search, #searchphotos
siyuan-note/siyuan
A privacy-first, self-hosted, fully open-source personal knowledge management application, written in TypeScript and Golang.
Language:TypeScript
Total stars: 19204
Stars trend:
14 Oct 2024
2am ▎ +2
3am ▏ +1
4am +0
5am ██▋ +21
6am ██ +16
7am █▏ +9
8am █▍ +11
9am ▌ +4
10am ▋ +5
11am ▋ +5
12pm ▋ +5
#typescript
#anki, #chatgpt, #electron, #evernote, #knowledgebase, #localfirst, #markdown, #notetaking, #notebook, #notesapp, #notion, #obsidian, #ocr, #openai, #pdf, #pkm, #s3, #selfhosted, #webdav
VikParuchuri/tabled
Detect tables and extract them to Markdown and CSV
Language:Python
Total stars: 91
Stars trend:
15 Oct 2024
11am ▊ +6
12pm █▋ +13
1pm █▋ +13
2pm █▎ +10
3pm ▏ +1
4pm █▏ +9
5pm ▍ +3
6pm ▊ +6
7pm ▎ +2
8pm ▊ +6
9pm ▌ +4
10pm ▎ +2
#python
#deeplearning, #ocr, #tables
CatchTheTornado/pdf-extract-api
Document (PDF) extraction and parsing API using state-of-the-art OCR engines and Ollama-supported models. Anonymizes documents, removes PII, and converts any document or picture to structured JSON or Markdown.
Language:Python
Total stars: 250
Stars trend:
3 Nov 2024
2pm ▏ +1
3pm █▊ +14
4pm █▉ +15
5pm ▋ +5
6pm ▍ +3
7pm ▌ +4
8pm ▍ +3
9pm ▌ +4
10pm ▍ +3
11pm ▉ +7
4 Nov 2024
12am ▊ +6
1am █▉ +15
#python
#anonymization, #api, #extract, #json, #llm, #ocr, #ocrpython, #pdf, #pii
getomni-ai/zerox
PDF to Markdown with vision models
Language:Python
Total stars: 8139
Stars trend:
16 Jan 2025
3am ▍ +3
4am ▊ +6
5am +0
6am ▌ +4
7am ▉ +7
8am ▌ +4
9am ▌ +4
10am ▌ +4
11am █▍ +11
12pm █▏ +9
1pm █▎ +10
2pm ██▉ +23
#python
#ocr, #pdf
siyuan-note/siyuan
A privacy-first, self-hosted, fully open-source personal knowledge management application, written in TypeScript and Golang.
Language:TypeScript
Total stars: 26661
Stars trend:
17 Jan 2025
7pm ▏ +1
8pm +0
9pm +0
10pm +0
11pm ▏ +1
18 Jan 2025
12am ███ +24
1am ███▍ +27
2am ███▏ +25
3am ███▍ +27
4am ██▉ +23
#typescript
#anki, #chatgpt, #electron, #evernote, #knowledgebase, #localfirst, #markdown, #notetaking, #notebook, #notesapp, #notion, #obsidian, #ocr, #openai, #pdf, #pkm, #s3, #selfhosted, #webdav
codexu/note-gen
A cross-platform AI note-taking app focused on recording and writing.
Language:TypeScript
Total stars: 265
Stars trend:
19 Jan 2025
9am ▎ +2
10am █▌ +12
11am ▎ +2
12pm █ +8
1pm █▉ +15
2pm █▊ +14
3pm █▋ +13
4pm ▋ +5
5pm ▍ +3
6pm ▏ +1
#typescript
#ai, #app, #chatgpt, #markdown, #nextjs, #notes, #ocr, #openai, #rust, #shadcnui, #tailwindcss, #tauri
yobix-ai/extractous
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
Language:Rust
Total stars: 777
Stars trend:
29 Jan 2025
10pm █▏ +9
11pm ▌ +4
30 Jan 2025
12am █▎ +10
1am ▋ +5
2am █▏ +9
3am ▊ +6
4am ▉ +7
5am █ +8
6am ▉ +7
7am █ +8
8am ▋ +5
9am █ +8
#rust
#datapipelines, #docx, #etl, #etlpipelines, #extraction, #llm, #machinelearning, #naturallanguageprocessing, #nlp, #ocr, #pdf, #pdfparser, #rag, #rust, #tika, #unstructured, #unstructureddata
paperless-ngx/paperless-ngx
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
Language:Python
Total stars: 24407
Stars trend:
31 Jan 2025
6am ▎ +2
7am ▎ +2
8am ▎ +2
9am ▎ +2
10am +0
11am ▉ +7
12pm █▏ +9
1pm █▉ +15
2pm █▋ +13
3pm █▉ +15
4pm █▌ +12
5pm █▋ +13
#python
#angular, #archiving, #django, #dms, #documentmanagement, #documentmanagementsystem, #machinelearning, #ocr, #opticalcharacterrecognition, #pdf
ocrmypdf/OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Language:Python
Total stars: 14952
Stars trend:
2 Feb 2025
6am ▏ +1
7am +0
8am ▏ +1
9am ▏ +1
10am ▍ +3
11am ▎ +2
12pm ▍ +3
1pm █▎ +10
2pm ███▎ +26
3pm █▌ +12
4pm █▋ +13
5pm █▉ +15
#python
#imageprocessing, #ocr, #pdf, #python, #tesseract
Goldziher/kreuzberg
A text extraction library supporting PDFs, images, office documents and more
Language:Python
Total stars: 304
Stars trend:
15 Feb 2025
12am █ +8
1am ▋ +5
2am █ +8
3am ▊ +6
4am ▉ +7
5am ▉ +7
6am ▊ +6
7am ▎ +2
8am █ +8
9am █ +8
10am █▋ +13
#python
#asyncio, #docx, #ocr, #pdf, #textextraction
CatchTheTornado/text-extract-api
Document (PDF, Word, PPTX ...) extraction and parsing API using state-of-the-art OCR engines and Ollama-supported models. Anonymizes documents, removes PII, and converts any document or picture to structured JSON or Markdown.
Language:Python
Total stars: 2248
Stars trend:
15 Feb 2025
6am ▉ +7
7am █▎ +10
8am ▌ +4
9am ▉ +7
10am ▉ +7
11am ▍ +3
12pm ▊ +6
1pm ▋ +5
2pm █ +8
3pm █▎ +10
4pm █ +8
5pm ▍ +3
#python
#anonymization, #api, #extract, #json, #llm, #ocr, #ocrpython, #pdf, #pii
opendatalab/MinerU
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON formats.
Language:Python
Total stars: 27107
Stars trend:
3 Mar 2025
3am █▋ +13
4am ▋ +5
5am ▉ +7
6am █▍ +11
7am █▏ +9
8am ▊ +6
9am ▉ +7
10am █ +8
11am ▊ +6
12pm ▊ +6
1pm █ +8
2pm ▉ +7
#python
#ai4science, #documentanalysis, #extractdata, #layoutanalysis, #ocr, #parser, #pdf, #pdfconverter, #pdfextractorllm, #pdfextractorpretrain, #pdfextractorrag, #pdfparser, #python
oomol-lab/pdf-craft
PDF craft can convert PDF files into various other formats. This project will focus on processing PDF files of scanned books. The project has just started.
Language:Python
Total stars: 1537
Stars trend:
10 Apr 2025
4pm ▏ +1
5pm +0
6pm +0
7pm +0
8pm +0
9pm +0
10pm ▏ +1
11pm ▏ +1
11 Apr 2025
12am ████▊ +38
1am ██████████▊ +86
2am ████████ +64
#python
#ai, #document, #ocr, #pdf
umlx5h/LLPlayer
The media player for language learning, with dual subtitles, AI-generated subtitles, real-time translation, and more!
Language:C#
Total stars: 838
Stars trend:
12 Apr 2025
3am ▎ +2
4am ▎ +2
5am ▍ +3
6am ▏ +1
7am ▌ +4
8am ▏ +1
9am ▏ +1
10am ▍ +3
11am █▎ +10
12pm ████▏ +33
1pm ▍ +3
2pm █▋ +13
#csharp
#asr, #csharp, #fasterwhisper, #flyleaf, #languagelearning, #llm, #mediaplayer, #ocr, #ollama, #player, #video, #videoplayer, #whisper, #wpf, #ytdlp
kotaro-kinoshita/yomitoku
Yomitoku is an AI-powered document image analysis package designed specifically for the Japanese language.
Language:Python
Total stars: 697
Stars trend:
20 Apr 2025
10am ▊ +6
11am █▎ +10
12pm █▍ +11
1pm █▉ +15
2pm █▊ +14
3pm ▌ +4
4pm █ +8
5pm ▌ +4
6pm +0
7pm +0
8pm ▍ +3
9pm ▌ +4
#python
#deeplearning, #layoutanalysis, #ocr, #python, #pytorch
hiroi-sora/Umi-OCR
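OCR software, free and offline. Open-source, free offline OCR software. Supports screenshots and batch image import, PDF document recognition, excluding watermarks and headers/footers, and scanning/generating QR codes. Built-in multi-language libraries.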
Language:Python
Total stars: 33184
Stars trend:
7 May 2025
4am ▎ +2
5am ▌ +4
6am ▎ +2
7am ▍ +3
8am ▉ +7
9am █▎ +10
10am ▊ +6
11am █▎ +10
12pm ▊ +6
1pm ▉ +7
2pm █▏ +9
3pm █▏ +9
#python
#ocr, #ocrpython, #paddleocr, #qml, #qt, #screenshot, #umiocr
clawsoftware/clawPDF
Open Source Virtual (Network) Printer for Windows that allows you to create PDFs, OCR text, and print images, with advanced features usually available only in enterprise solutions.
Language:C#
Total stars: 1043
Stars trend:
19 May 2025
12pm ▍ +3
1pm █████▌ +44
2pm ███████▎ +58
3pm ██████▌ +52
4pm ██▋ +21
#csharp
#imageprocessing, #merge, #networkprinter, #ocr, #pdf, #pdfmerger, #pdfprinter, #print, #printer, #terminalserver, #windows