Trending Repository: Dolphin
Description: The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
Repository URL: https://github.com/bytedance/Dolphin
Readme: https://github.com/bytedance/Dolphin#readme
Statistics:
Stars: 6.3K
Watchers: 53
Forks: 516
Programming Languages: Python - Shell
Related Topics:
#python #pdf #parser #ocr #pdf_converter #document_analysis #pdf_parser #layout_analysis #vlm_ocr
==================================
By: https://t.me/DataScienceM
Trending Repository: MinerU
Description: Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
Repository URL: https://github.com/opendatalab/MinerU
Website: https://opendatalab.github.io/MinerU/
Readme: https://github.com/opendatalab/MinerU#readme
Statistics:
Stars: 45.7K
Watchers: 183
Forks: 3.8K
Programming Languages: Python - Dockerfile
Related Topics:
#python #pdf #parser #ocr #pdf_converter #extract_data #document_analysis #pdf_parser #layout_analysis #ai4science #pdf_extractor_rag #pdf_extractor_llm #pdf_extractor_pretrain
==================================
By: https://t.me/DataScienceM
Trending Repository: opendataloader-pdf
Description: PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.
Repository URL: https://github.com/opendataloader-project/opendataloader-pdf
Website: https://opendataloader.org
Readme: https://github.com/opendataloader-project/opendataloader-pdf#readme
Statistics:
Stars: 4.7k
Watchers: 18
Forks: 355
Programming Languages: Java - Python - MDX - JavaScript - TypeScript - Shell
Related Topics:
#html #markdown #pdf #json #ocr #ai #accessibility #a11y #pdf_converter #tables #ocr_recognition #pdf_parser #rag #bounding_box #eaa #pdf_extraction #tagged_pdf #document_parsing #pdf_accessibility #pdf_ua
==================================
By: https://t.me/DataScienceM