PaddlePaddle Launches PP-OCRv6 Multilingual OCR Models

• PaddlePaddle released PP-OCRv6 in three tiers ranging from 1.5M to 34.5M parameters.
• The small and medium variants support 50 languages, including Simplified and Traditional Chinese, English, and Japanese.
• PP-OCRv6_medium reaches 86.2% detection Hmean, 4.6 percentage points above PP-OCRv5_server.

PaddlePaddle released PP-OCRv6 on June 22, 2026, the latest update to its universal optical character recognition (OCR) model family. The release provides three model tiers, ranging from 1.5M to 34.5M parameters, for real-world text detection and recognition across documents, industrial labels, and scene text. The medium and small variants support 50 languages: Simplified Chinese, Traditional Chinese, English, Japanese, and 46 Latin-script languages.

On official multi-scenario benchmarks, the PP-OCRv6_medium model reached 86.2% detection Hmean and 83.2% recognition accuracy. Compared with its predecessor, PP-OCRv5_server, that is a gain of 4.6 percentage points in text detection and 5.1 percentage points in recognition. Each of the three tiers targets a different deployment: PP-OCRv6_tiny (1.5M params) is aimed at edge devices and latency-sensitive demos; PP-OCRv6_small (7.7M params) at mobile and desktop applications; and PP-OCRv6_medium (34.5M params) at server-side pipelines and high-accuracy document ingestion.
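To make the figures above concrete, here is a minimal Python sketch. It encodes the three tiers as reported, picks the largest tier that fits a parameter budget, and works out the PP-OCRv5_server scores implied by the stated gains. The helper names (`Tier`, `largest_tier_within`, `hmean`) are hypothetical and not part of any PaddleOCR API. The Hmean function uses the standard definition, the harmonic mean of detection precision and recall. The implied baselines are derived by subtraction and are not quoted from the release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    params_millions: float
    target: str


# Parameter counts and use cases as reported in the article.
TIERS = [
    Tier("PP-OCRv6_tiny", 1.5, "edge devices and latency-sensitive demos"),
    Tier("PP-OCRv6_small", 7.7, "mobile and desktop applications"),
    Tier("PP-OCRv6_medium", 34.5,
         "server-side pipelines and high-accuracy document ingestion"),
]


def largest_tier_within(budget_millions: float) -> Tier:
    """Return the largest tier whose parameter count fits the budget.

    Hypothetical helper for illustration only.
    """
    candidates = [t for t in TIERS if t.params_millions <= budget_millions]
    if not candidates:
        raise ValueError(f"no tier fits within {budget_millions}M parameters")
    return max(candidates, key=lambda t: t.params_millions)


def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the detection Hmean, i.e. F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Baselines implied by the reported gains (derived here, not quoted).
implied_v5_detection = round(86.2 - 4.6, 1)    # percentage points
implied_v5_recognition = round(83.2 - 5.1, 1)  # percentage points

if __name__ == "__main__":
    print(largest_tier_within(10).name)
    print(implied_v5_detection, implied_v5_recognition)
```

A parameter budget is only one possible selection criterion; latency and memory on the target device would matter just as much in practice, and the article reports no such numbers.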

All three tiers share the PPLCNetV4 backbone, which keeps the model family architecturally consistent. The text detection module now uses RepLKFPN (a lightweight large-kernel feature pyramid network) to handle small, rotated, or dense text. For recognition, the model uses EncoderWithLightSVTR, which combines local context modeling with global attention to process complex characters and noisy image regions. These changes are intended to raise accuracy while keeping inference efficient across edge, mobile, and server deployments.
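The article does not describe how RepLKFPN is built. The "Rep" prefix often indicates structural re-parameterization, a technique used in other large-kernel networks where a small-kernel branch trained in parallel is folded into the large kernel for inference. Assuming that is the idea here, which is an assumption and not a confirmed detail of PP-OCRv6, the following 1D sketch in plain Python shows why the merge is exact: convolution is linear, so two parallel branches equal one branch with summed kernels. The function names and kernel values are made up for illustration.

```python
def conv1d_same(signal, kernel):
    """Cross-correlate `signal` with an odd-length `kernel`.

    Zero padding keeps the output the same length as the input.
    """
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(padded[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(len(signal))
    ]


def merge_kernels(large, small):
    """Fold a small odd-length kernel into the centre of a larger one."""
    offset = (len(large) - len(small)) // 2
    merged = list(large)
    for j, weight in enumerate(small):
        merged[offset + j] += weight
    return merged


# Illustrative values only; not weights from any released model.
signal = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5]
large_kernel = [0.1, 0.2, 0.3, 0.4, 0.3, 0.2, 0.1]  # 7-tap "large" branch
small_kernel = [0.5, -1.0, 0.5]                     # 3-tap "small" branch

# Training-time view: two parallel branches whose outputs are added.
two_branch = [
    a + b
    for a, b in zip(conv1d_same(signal, large_kernel),
                    conv1d_same(signal, small_kernel))
]

# Inference-time view: a single large kernel with the small one folded in.
single_branch = conv1d_same(signal, merge_kernels(large_kernel, small_kernel))

max_diff = max(abs(a - b) for a, b in zip(two_branch, single_branch))
```

If the assumption holds, the practical benefit is that the deployed detector runs a single convolution per layer while still having been trained with the extra small-kernel branch.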

Developers can integrate PP-OCRv6 through PaddlePaddle, Transformers, or ONNX Runtime backends. PaddleOCR 3.7 offers a unified interface for selecting these inference engines, allowing users to deploy models in formats such as Paddle inference, ONNX, and safetensors. The library provides structured JSON output and visualization images, enabling direct integration into downstream systems like document parsing, search extraction, RAG (retrieval-augmented generation, a technique for providing LLMs with external data), and agentic workflows. The model assets and documentation are available via the Hugging Face Hub, with an online demo provided for immediate evaluation.
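As a rough illustration of how the structured JSON output could feed a downstream system such as a RAG index, the sketch below filters recognized lines by confidence and packs them into a plain text record. The article does not specify the output schema; the field names `input_path`, `rec_texts`, and `rec_scores` are assumptions modeled on earlier PaddleOCR releases, and the sample data, the functions `extract_lines` and `to_chunk`, and the 0.5 threshold are hypothetical.

```python
import json

# Hypothetical OCR result. The field names are assumptions, not taken from
# the PP-OCRv6 release; adjust them to the schema the library actually emits.
SAMPLE_RESULT_JSON = """
{
  "input_path": "invoice_page_1.png",
  "rec_texts": ["Invoice No. 10482", "Total: 1,250.00", "~~#@"],
  "rec_scores": [0.98, 0.95, 0.31]
}
"""


def extract_lines(result: dict, min_score: float = 0.5) -> list[str]:
    """Keep recognized lines whose confidence is at least `min_score`."""
    texts = result.get("rec_texts", [])
    scores = result.get("rec_scores", [])
    return [text for text, score in zip(texts, scores) if score >= min_score]


def to_chunk(result: dict, min_score: float = 0.5) -> dict:
    """Build a plain record that a search index or RAG store could ingest."""
    return {
        "source": result.get("input_path", "unknown"),
        "text": "\n".join(extract_lines(result, min_score)),
    }


result = json.loads(SAMPLE_RESULT_JSON)
chunk = to_chunk(result)

if __name__ == "__main__":
    print(chunk["source"])
    print(chunk["text"])
```

Dropping low-confidence lines before indexing is a design choice, not a requirement: it keeps noise such as the garbled third line out of retrieval results, at the cost of occasionally discarding faint but valid text.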
