PaddlePaddle / PaddleOCR

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

📊 Project Info

Language: Python
Stars: ⭐ 79,840
Forks: 10,598
Today: +141
Ranking: #4
Collection: Overall
Trending Date: June 4, 2026
Last Push: June 4, 2026

🏷️ Topics

ai4science, chineseocr, document-parsing, document-translation, kie, ocr, paddleocr-vl, pdf-extractor-rag, pdf-parser, pdf2markdown, pp-ocr, pp-structure, rag
