Data Science Jupyter Notebooks
Explore the world of Data Science through Jupyter Notebooks: insights, tutorials, and tools to boost your data journey. Code, analyze, and visualize smarter with every post.
🔥 Trending Repository: PaddleOCR

📝 Description: Awesome multilingual OCR and Document Parsing toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)

🔗 Repository URL: https://github.com/PaddlePaddle/PaddleOCR

🌐 Website: https://www.paddleocr.ai

📖 Readme: https://github.com/PaddlePaddle/PaddleOCR#readme

📊 Statistics:
🌟 Stars: 53.9K stars
👀 Watchers: 470
🍴 Forks: 8.6K forks

💻 Programming Languages: Python - C++ - Shell - Java - CMake - Cuda

🏷️ Related Topics:
#ocr #db #kie #crnn #document_translation #ocrlite #chineseocr #pp_ocr #document_parsing #pp_structure #pdf2markdown #chatocr


==================================
🧠 By: https://t.me/DataScienceM
🔥 Trending Repository: Dolphin

📝 Description: The official repo for "Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting", ACL, 2025.

🔗 Repository URL: https://github.com/bytedance/Dolphin

📖 Readme: https://github.com/bytedance/Dolphin#readme

📊 Statistics:
🌟 Stars: 6.3K stars
👀 Watchers: 53
🍴 Forks: 516 forks

💻 Programming Languages: Python - Shell

🏷️ Related Topics:
#python #pdf #parser #ocr #pdf_converter #document_analysis #pdf_parser #layout_analysis #vlm_ocr


==================================
🔥 Trending Repository: PDFMathTranslate

📝 Description: PDF scientific paper translation with preserved formats: AI-based full-text bilingual translation of PDF documents that fully preserves the layout, supports services such as Google/DeepL/Ollama/OpenAI, and provides CLI/GUI/MCP/Docker/Zotero

🔗 Repository URL: https://github.com/Byaidu/PDFMathTranslate

🌐 Website: https://pdf2zh.com

📖 Readme: https://github.com/Byaidu/PDFMathTranslate#readme

📊 Statistics:
🌟 Stars: 28.2K stars
👀 Watchers: 104
🍴 Forks: 2.5K forks

💻 Programming Languages: Python

🏷️ Related Topics:
#python #pdf #latex #translation #math #mcp #japanese #english #openai #translate #document #chinese #edit #modify #russian #korean #zotero #obsidian #pdf2zh


==================================
🔥 Trending Repository: MinerU

📝 Description: Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.

🔗 Repository URL: https://github.com/opendatalab/MinerU

🌐 Website: https://opendatalab.github.io/MinerU/

📖 Readme: https://github.com/opendatalab/MinerU#readme

📊 Statistics:
🌟 Stars: 45.7K stars
👀 Watchers: 183
🍴 Forks: 3.8K forks

💻 Programming Languages: Python - Dockerfile

🏷️ Related Topics:
#python #pdf #parser #ocr #pdf_converter #extract_data #document_analysis #pdf_parser #layout_analysis #ai4science #pdf_extractor_rag #pdf_extractor_llm #pdf_extractor_pretrain


==================================
🔥 Trending Repository: ragflow

📝 Description: RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs

🔗 Repository URL: https://github.com/infiniflow/ragflow

🌐 Website: https://ragflow.io

📖 Readme: https://github.com/infiniflow/ragflow#readme

📊 Statistics:
🌟 Stars: 69.3K stars
👀 Watchers: 310
🍴 Forks: 7.5K forks

💻 Programming Languages: Python - TypeScript - Less - Shell - HTML - CSS

🏷️ Related Topics:
#agent #ai #deep_learning #mcp #multi_agent #openai #document_parser #ai_search #rag #document_understanding #llm #agentic #retrieval_augmented_generation #ollama #deepseek #graphrag #agentic_workflow #agentic_ai #deepseek_r1 #deep_research


==================================
🔥 Trending Repository: ConvertX

📝 Description: 💾 Self-hosted online file converter. Supports 1000+ formats ⚙️

🔗 Repository URL: https://github.com/C4illin/ConvertX

📖 Readme: https://github.com/C4illin/ConvertX#readme

📊 Statistics:
🌟 Stars: 10.4K stars
👀 Watchers: 24
🍴 Forks: 533 forks

💻 Programming Languages: TypeScript - JavaScript - Dockerfile - CSS

🏷️ Related Topics:
#converter #typescript #document_conversion #convert #conversion #pdf_converter #self_hosted #file_converter #file_conversion #hacktoberfest #bun #tailwindcss #elysia


==================================