Mistral Small 4 is the first Mistral model to unify three flagship capabilities in a single system: strong reasoning from Magistral, multimodal understanding from Pixtral, and agentic coding from Devstral. Built on a 119B-parameter Mixture-of-Experts architecture that activates just 6.5B parameters per token, it offers a 256K-token context window and configurable reasoning effort, ranging from lightweight instant responses to deep step-by-step analysis. Fully open source, it delivers a 40% reduction in latency and a 3× throughput improvement over Mistral Small 3, making it a versatile and efficient choice for coding, analysis, and vision tasks.
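
To put the quoted figures in perspective, the arithmetic can be sketched in a few lines of Python. Only the 119B, 6.5B, 40%, and 3× values come from the description above; the function names and the baseline latency and throughput numbers are hypothetical and chosen purely for illustration.

```python
# Figures quoted in the description (in billions of parameters).
TOTAL_PARAMS_B = 119.0
ACTIVE_PARAMS_B = 6.5


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's parameters activated for a single token."""
    return active_b / total_b


def reduced_latency(baseline_ms: float, reduction: float = 0.40) -> float:
    """Latency after applying a fractional reduction to a baseline."""
    return baseline_ms * (1.0 - reduction)


def improved_throughput(baseline_tps: float, factor: float = 3.0) -> float:
    """Throughput after applying a multiplicative improvement to a baseline."""
    return baseline_tps * factor


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"Active parameters per token: {share:.1%} of the total")

    # Hypothetical baselines, NOT measured values for any real model.
    baseline_latency_ms = 100.0
    baseline_tokens_per_s = 50.0
    print(f"Latency: {baseline_latency_ms} ms -> {reduced_latency(baseline_latency_ms):.1f} ms")
    print(f"Throughput: {baseline_tokens_per_s} tok/s -> {improved_throughput(baseline_tokens_per_s):.1f} tok/s")
```

Roughly one parameter in eighteen is active for any given token, which is the usual reason a Mixture-of-Experts model can be large in total size while keeping per-token compute low.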