GitHub repos

clovaai/donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
Language: Python
#computer_vision #document_ai #eccv_2022 #multimodal_pre_trained_model #nlp #ocr
Stars: 98 Issues: 2 Forks: 5
https://github.com/clovaai/donut

OpenGVLab/InternChat
InternChat allows you to interact with ChatGPT by clicking, dragging and drawing using a pointing device.
Language: Python
#chatgpt #click #foundation_model #gpt #gpt_4 #gradio #husky #image_captioning #internimage #langchain #llama #llm #multimodal #ocr #sam #segment_anything #vicuna #video #video_generation #vqa
Stars: 231 Issues: 1 Forks: 10
https://github.com/OpenGVLab/InternChat

GitHub

GitHub - OpenGVLab/InternGPT: InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now…

InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editin...

Danily07/Translumo
Advanced real-time screen translator for games, hardcoded subtitles in videos, static text, etc.
Language: C#
#autotranslate #easyocr #game_translation #mlnet #ocr #translation
Stars: 239 Issues: 5 Forks: 4
https://github.com/Danily07/Translumo

junhoyeo/BetterOCR
🔍 Better text detection by combining multiple OCR engines (EasyOCR, Tesseract) with 🧠 LLM.
Language: Python
#ai #chatgpt #chatgpt_api #easyocr #llm #ocr #openai #openai_api #tesseract #tesseract_ocr
Stars: 154 Issues: 4 Forks: 7
https://github.com/junhoyeo/BetterOCR

reworkd/tarsier
Vision utilities for web interaction agents 👀
Language: Jupyter Notebook
#gpt4v #llms #ocr #playwright #pypi_package #python #selenium #webscraping
Stars: 236 Issues: 3 Forks: 14
https://github.com/reworkd/tarsier

VikParuchuri/texify
OCR model for math that outputs LaTeX and markdown
Language: Python
#deep_learning #latex #markdown #ocr
Stars: 142 Issues: 0 Forks: 7
https://github.com/VikParuchuri/texify

robertknight/ocrs
A modern OCR engine (extracts text from images), written in Rust
Language: Rust
#computer_vision #machine_learning #ocr
Stars: 220 Issues: 3 Forks: 4
https://github.com/robertknight/ocrs

VikParuchuri/tabled
Detect and extract tables to markdown and csv
Language: Python
#deep_learning #ocr #tables
Stars: 245 Issues: 4 Forks: 7
https://github.com/VikParuchuri/tabled

umlx5h/LLPlayer
The media player for language learning, with dual subtitles, AI-generated subtitles, real-time OCR, translation, word lookup, and more!
Language: C#
#asr #csharp #flyleaf #language_learning #media_player #ocr #player #tesseract #video #video_player #whisper #wpf #yt_dlp
Stars: 253 Issues: 5 Forks: 4
https://github.com/umlx5h/LLPlayer

ses4255/Versatile-OCR-Program
Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)
Language: Python
#doclayout #educational_data #exam_ocr #machine_learning #ml_datasets #multi_modal #ocr #openai #paper_ocr #table_parsing
Stars: 250 Issues: 0 Forks: 11
https://github.com/ses4255/Versatile-OCR-Program
