GitHub Trends
See what the GitHub community is most excited about today.

A bot automatically fetches new repositories from https://github.com/trending and sends them to the channel.

Author and maintainer: https://github.com/katursis
#python #ocr #ocr_python #paddleocr #qml #qt #screenshot #umi_ocr

Umi-OCR is free, open-source, offline OCR (Optical Character Recognition) software. Here are the key points:
- **Free**: The software is completely free to use, with all code available openly.
- **Convenient**: It comes with efficient OCR engines and supports multiple languages.
- **Flexible**: It includes screenshot OCR, batch OCR, PDF recognition, QR code scanning and generation, and formula recognition.

This software is easy to use, supports various file formats, and has features like ignoring regions in images to exclude unwanted text. It also supports multiple languages and themes, making it highly customizable. Overall, Umi-OCR is a powerful tool for anyone needing to extract text from images or documents efficiently.

https://github.com/hiroi-sora/Umi-OCR
#python #chineseocr #crnn #db #ocr #ocrlite

PaddleOCR is a toolkit for Optical Character Recognition (OCR) that helps developers build and use advanced models. It supports cutting-edge algorithms and models for tasks such as text recognition, table recognition, and formula recognition. The tool offers low-code development capabilities through simple Python APIs and graphical interfaces. This allows developers to quickly integrate and customize models for different tasks, including automated office work, financial risk control, healthcare, education, and more. It also supports deployment on various hardware, such as NVIDIA GPUs, Kunlun chips, and others.

https://github.com/PaddlePaddle/PaddleOCR
#typescript #clipboard #color_picker #cross_platform #electron #image_editing #image_editor #live_text #ocr #paddleocr #screen_capture #screen_recorder #screenshot #search #search_photos

eSearch is a powerful tool that helps you capture, edit, and search content on your screen. It works on Windows, Linux, and macOS. With eSearch, you can take screenshots, recognize text using OCR (even offline), translate text, and search images. You can also record your screen, add annotations, and use various editing tools like cropping, blurring, and more.

The benefit to you is that eSearch makes it easy to manage and interact with the content on your screen in multiple ways, saving you time and effort. It's especially useful for tasks like capturing and translating text from images or videos, which can be very handy for work or study.

https://github.com/xushengfeng/eSearch
#python #ocr #pdf

Zerox OCR is a simple tool to convert documents into Markdown format using AI. Here’s how it helps you: you pass in your file, and Zerox OCR returns the content in Markdown format, which you can easily read and use.

This tool saves time and effort by automating the process of extracting text from complex documents, making it easier to work with the content digitally.

https://github.com/getomni-ai/zerox
#python #chineseocr #crnn #dbnet #easyocr #ocr #onnxocr #onnxruntime #openvino #paddleocr #rapidocr

RapidOCR is a free, open-source tool that quickly recognizes text from images. It is very fast, supports multiple languages like Chinese and English, and works on various platforms including Linux, Windows, and Mac. You can use it offline, which is convenient. The tool is easy to install and use, and it even allows you to customize it for specific needs. This makes it beneficial for users who need quick and accurate text recognition without relying on internet connectivity.

https://github.com/RapidAI/RapidOCR
#python #ai4science #document_analysis #extract_data #layout_analysis #ocr #parser #pdf #pdf_converter #pdf_extractor_llm #pdf_extractor_pretrain #pdf_extractor_rag #pdf_parser #python

MinerU is a tool that converts PDFs into machine-readable formats like Markdown or JSON. Here are the key benefits and features:
- **Clean Text**: MinerU removes headers, footers, and other unnecessary elements to ensure the text is semantically coherent and in human-readable order, even for complex layouts.
- **Structure Preservation**: It extracts images, image descriptions, tables, and table titles.
- **Formula and Table Conversion**: It recognizes formulas and tables and converts them to LaTeX or HTML format.
- **OCR and Output Formats**: It supports OCR, multiple output formats, and various visualization results.
- **GPU and CPU Compatibility**: It works in both CPU and GPU environments and is compatible with Windows, Linux, and Mac.

You can try MinerU through an online demo, a quick CPU demo, or by using a GPU for faster processing. For detailed usage, refer to the command line options, API integration, and deployment guides provided.

https://github.com/opendatalab/MinerU
#javascript #deep_learning #javascript #ocr #tesseract #webassembly

Tesseract.js is a JavaScript library that helps you extract text from images in almost any language. It works in both browsers and on servers using Node.js. You can easily install it using a script tag, webpack, or npm. Here’s how it benefits you: it allows you to convert images into text quickly and accurately, supporting multiple languages and formats. This can be very useful for tasks like scanning documents, recognizing text in videos, and more. The library is also efficient, with smaller file sizes and lower memory usage, making it faster to use.

https://github.com/naptha/tesseract.js
#python #image_processing #ocr #pdf #python #tesseract

OCRmyPDF is a tool that makes scanned PDF files searchable and editable. It adds a text layer to the PDF, so you can search for words or copy and paste text from the document. It supports many languages, fixes misrotated or crooked pages, and optimizes the file size. The tool works on various operating systems like Linux, Windows, and macOS, and it uses multiple CPU cores to speed up the process. This makes it easier to work with scanned documents and keeps your files organized and searchable.

https://github.com/ocrmypdf/OCRmyPDF
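
The features described above (language selection, fixing rotated or crooked pages, file-size optimization, multiple CPU cores) map onto OCRmyPDF command-line options. The following is a minimal sketch that only assembles the command; the helper name `build_ocrmypdf_command` and the file names are hypothetical, and running the result assumes `ocrmypdf` is installed on your system.

```python
import shlex
from typing import List, Optional


def build_ocrmypdf_command(
    input_pdf: str,
    output_pdf: str,
    languages: Optional[List[str]] = None,
    jobs: Optional[int] = None,
    optimize: int = 1,
) -> List[str]:
    """Assemble an argument list for the ocrmypdf CLI (does not run it)."""
    # --rotate-pages fixes misrotated pages, --deskew straightens crooked ones,
    # --optimize controls how aggressively the output file size is reduced.
    cmd = ["ocrmypdf", "--rotate-pages", "--deskew", "--optimize", str(optimize)]
    if languages:
        # Multiple languages are joined with '+', for example eng+deu.
        cmd += ["-l", "+".join(languages)]
    if jobs is not None:
        # Number of CPU cores to use in parallel.
        cmd += ["--jobs", str(jobs)]
    cmd += [input_pdf, output_pdf]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_ocrmypdf_command(
        "scan.pdf", "scan_ocr.pdf", languages=["eng", "deu"], jobs=4
    )
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Building the argument list separately from running it makes the command easy to inspect or log before any file is touched.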
#kotlin #aes_256 #android #background_removal #clean_architecture #crop #djvu #edit_photo #exif #f_droid #filter_image #image_manipulation #jetpack_compose #jxl #kotlin #material_you #ocr_recognition #pdf #psd #qrcode_scanner #watermark

Image Toolbox is a powerful and versatile image editing tool that lets you do many things with your photos. You can crop, apply over 230 different filters, edit EXIF data, remove backgrounds, and even convert images to PDFs. It also allows you to add stickers and text, extract text from images in over 120 languages, and encrypt files with AES-256 encryption. You can resize images using various scaling algorithms, convert between multiple image formats, and create collages. The app also supports GIF, WEBP, APNG, and JXL conversions, document scanning, QR code scanning and creation, and more. It has a simple interface but offers many advanced features, making it useful for both photographers and developers.

https://github.com/T8RIN/ImageToolbox
#typescript #anki #chatgpt #deepseek #electron #evernote #knowledge_base #local_first #markdown #note_taking #notes_app #notion #obsidian #ocr #ollama #openai #pdf #s3 #self_hosted #webdav

SiYuan is a privacy-first personal knowledge management tool. It allows you to organize your thoughts and notes in a secure way, even offline. You can use features like block-level references, Markdown editing, and mathematical formulas. It also supports AI tools and has apps for Android, iOS, and HarmonyOS. SiYuan is open source and free for most features, making it a great choice for managing your personal knowledge securely.

https://github.com/siyuan-note/siyuan
#javascript #linux #macos #ocr #pot #pot_app #recognize #tauri #translate #translation #tts #windows

Pot is a cross-platform translation tool that lets you quickly translate text by selecting it and using a shortcut, typing text to translate, or using OCR to translate text from screenshots. It supports many translation engines like OpenAI, Google, DeepL, and more, plus offline options. You can also add plugins to extend its features and use it on Windows, macOS, and Linux. Pot offers an API for integration with other software and works well even on Wayland systems. This makes translating easier, faster, and more flexible, helping you understand and work with multiple languages efficiently.

https://github.com/pot-app/pot-desktop
#python #document_analysis #layout_analysis #ocr #parser #pdf #pdf_converter #pdf_parser #python #vlm_ocr

Dolphin is an AI tool that can analyze and understand complex document images, like pages with text, tables, formulas, and pictures. It works in two steps: first, it figures out the layout and reading order of the page; then, it parses each element using element-specific prompts. This makes it fast and accurate for turning document images into structured data like JSON or Markdown. You can use pre-trained models and simple code to process single pages, PDFs, or specific elements. This saves time and effort when extracting information from complicated documents.

https://github.com/bytedance/Dolphin
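
The two-step flow described above (layout and reading order first, then per-element parsing with prompts) can be illustrated with a toy sketch. This is not Dolphin's actual code or API: every name here (`Element`, `PROMPTS`, `analyze_layout`, `parse_page`, `echo_model`) is hypothetical, and the "model" is a stand-in function that simply echoes its input.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Element:
    kind: str           # e.g. "text", "table", "formula"
    reading_order: int  # position in the page's reading order
    region: str         # stand-in for a cropped image region


# Hypothetical per-element prompts; a real system would define its own.
PROMPTS: Dict[str, str] = {
    "text": "Read the text in this region.",
    "table": "Parse this table.",
    "formula": "Transcribe this formula.",
}


def analyze_layout(page: List[Element]) -> List[Element]:
    """Step 1 stand-in: put the page elements into reading order."""
    return sorted(page, key=lambda e: e.reading_order)


def parse_page(
    page: List[Element], model: Callable[[str, str], str]
) -> List[Dict[str, str]]:
    """Step 2 stand-in: parse each element with the prompt for its kind."""
    results = []
    for element in analyze_layout(page):
        # Unknown kinds fall back to the plain-text prompt.
        prompt = PROMPTS.get(element.kind, PROMPTS["text"])
        results.append(
            {"type": element.kind, "content": model(prompt, element.region)}
        )
    return results


def echo_model(prompt: str, region: str) -> str:
    """Placeholder model: returns the region text unchanged."""
    return region.strip()


if __name__ == "__main__":
    page = [
        Element("table", 2, " a | b "),
        Element("text", 1, " Title "),
    ]
    print(json.dumps(parse_page(page, echo_model), indent=2))
```

Separating layout analysis from element parsing means each element can be handled independently once the reading order is known.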