Mistral AI Launches Mistral OCR 4

Mistral AI
Wednesday, June 24, 2026
  • Mistral AI released Mistral OCR 4 with bounding boxes, block classification, and inline confidence scores.
  • The model supports 170 languages and achieves an 85.20 score on the OlmOCRBench.
  • API pricing is $4 per 1,000 pages, with a $2 rate available for batch processing.

Mistral AI released Mistral OCR 4, an updated document parsing model designed for structured content extraction, on June 23, 2026. Alongside the extracted text, the model returns bounding boxes, block classification (identifying titles, tables, equations, and signatures), and inline confidence scores for individual words and pages. It supports 170 languages across 10 language groups, and the company says it maintains high accuracy for rare or low-resource languages where competing systems often underperform. Organizations can deploy the model in a single container in self-hosted environments to meet data residency and compliance requirements.
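One way a downstream pipeline might use the word-level confidence scores is to flag uncertain words for human review before ingestion. The following is a minimal sketch only: the article does not describe the response schema, so the `Word` record, its field names, the 0.0 to 1.0 confidence scale, and the 0.90 threshold are all hypothetical assumptions rather than the actual Mistral OCR 4 output format.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical word record; NOT the actual Mistral OCR 4 schema."""
    text: str
    confidence: float  # assumed to lie in the range 0.0 to 1.0


def flag_low_confidence(words: list[Word], threshold: float = 0.90) -> list[Word]:
    """Return the words whose confidence falls below the threshold."""
    return [w for w in words if w.confidence < threshold]


# Illustrative data, invented for this sketch.
page_words = [Word("Invoice", 0.99), Word("t0tal", 0.41), Word("due", 0.95)]
needs_review = flag_low_confidence(page_words)
```

The threshold is a tuning choice: a stricter value sends more words to review, while a looser one lets more uncertain text through to retrieval or agent workflows.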

In performance evaluations, the company reported that independent annotators preferred OCR 4 over leading document-AI systems with an average win rate of 72%. The model scored 85.20 on the public OlmOCRBench and 93.07 on OmniDocBench. Mistral AI noted that these benchmarks often contain artifacts, such as incorrect ground-truth labels and math notation mismatches, that can affect aggregate scoring. The model is intended as an ingestion engine for downstream systems like Retrieval-Augmented Generation (RAG, a technique that retrieves external data to ground model responses) and automated agentic workflows, rather than as a decision-making model for high-stakes fields like legal or medical advice.

Developers can access the model via API or through Mistral Studio. Pricing is set at $4 per 1,000 pages, with a 50% discount for Batch-API usage that brings the cost to $2 per 1,000 pages. The Document AI feature, which provides structured JSON output shaped to specific schemas, is priced at $5 per 1,000 pages. The model is also available through platforms including Amazon SageMaker and Microsoft Foundry. Integration with the open-source Mistral Search Toolkit allows developers to incorporate the model's structured outputs, such as citation-ready inputs, directly into enterprise search and retrieval pipelines.
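The per-page rates quoted above translate into a simple cost estimate. The sketch below is illustrative only: the function and tier names are hypothetical, the rates are the three figures reported in the article, and it assumes billing scales linearly with page count. The article does not say whether the batch discount also applies to Document AI, so no such tier is modeled.

```python
from decimal import Decimal

# USD per 1,000 pages, as reported in the article.
RATES_PER_1000 = {
    "standard": Decimal("4"),
    "batch": Decimal("2"),        # 50% Batch-API discount on the standard rate
    "document_ai": Decimal("5"),  # structured JSON output shaped to a schema
}


def estimate_cost(pages: int, tier: str = "standard") -> Decimal:
    """Estimate the cost in USD, assuming strictly linear per-page billing."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    cost = RATES_PER_1000[tier] * pages / 1000
    return cost.quantize(Decimal("0.01"))  # round to whole cents


standard_cost = estimate_cost(250_000)        # 250,000 pages at $4 per 1,000
batch_cost = estimate_cost(250_000, "batch")  # same volume at the batch rate
```

Under these assumptions, a 250,000-page archive would cost $1,000 at the standard rate and $500 through the Batch API.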
