Llama 4 Maverick is Meta's high-capacity multimodal language model built on a Mixture-of-Experts architecture with 128 experts and 17 billion active parameters per forward pass (400B total). It supports text input in 12 languages plus image input, offers a 1-million-token context window, and uses early fusion for native multimodality. Instruction-tuned for assistant-like interaction and image reasoning, Maverick is optimized for vision-language tasks requiring advanced multimodal understanding and high throughput.
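A minimal sketch of a multimodal request against a Maverick deployment, assuming an OpenAI-compatible chat-completions endpoint; the `base_url` and the model identifier shown are illustrative placeholders, not values confirmed by this description:

```python
# Hedged sketch: sending a combined text + image prompt to Llama 4 Maverick
# via an OpenAI-compatible API. Endpoint and model name are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-provider.com/v1",  # hypothetical hosting endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="meta-llama/Llama-4-Maverick-17B-128E-Instruct",  # assumed identifier
    messages=[
        {
            "role": "user",
            # Text and image parts are passed together in a single user message.
            "content": [
                {"type": "text", "text": "Describe what is happening in this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```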