Llama 4 Maverick

Llama 4 Maverick is Meta's natively multimodal mixture-of-experts (MoE) model, with 400B total parameters of which 17B are active. It uses early fusion of text and vision tokens and was codistilled with online RL to handle complex visual-reasoning tasks.

Parameters

400B

Context

1,000,000 tokens

Released

April 5, 2025

Leaderboards

Average score combining domain-specific Autobench scores; higher is better.

Performance vs. Industry Average

Intelligence

Llama 4 Maverick scores below average on intelligence: 2.3, against an average of 2.9.

Price

Llama 4 Maverick is cheaper than average: $0.03 per 1M tokens, against an average of $0.75 per 1M tokens.
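To make the per-token price concrete, the sketch below estimates what a workload would cost at the two listed rates. It assumes a single flat rate applied to all tokens; the page does not say whether the figure covers input tokens, output tokens, or a blend. The function name and the 50M-token workload are hypothetical, chosen only for illustration.

```python
# Listed prices from the comparison above (USD per 1M tokens).
MAVERICK_USD_PER_1M = 0.03
AVERAGE_USD_PER_1M = 0.75


def estimate_cost_usd(total_tokens: int, price_per_1m: float) -> float:
    """Estimate cost in USD, assuming one flat rate for every token."""
    return total_tokens / 1_000_000 * price_per_1m


if __name__ == "__main__":
    workload_tokens = 50_000_000  # hypothetical workload size
    maverick = estimate_cost_usd(workload_tokens, MAVERICK_USD_PER_1M)
    average = estimate_cost_usd(workload_tokens, AVERAGE_USD_PER_1M)
    print(f"Llama 4 Maverick: ${maverick:.2f}")
    print(f"Average model:    ${average:.2f}")
```

Under these assumptions a 50M-token workload comes to about $1.50 at the listed Maverick rate and about $37.50 at the average rate.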

Latency

Llama 4 Maverick has lower-than-average latency: 41.27s on average, against an industry average of 44.25s.

P99 Latency

Llama 4 Maverick has a lower-than-average P99 latency: at P99 it takes 76.10s to receive the first token (TTFT), against an average of 126.46s.

Context Window

Llama 4 Maverick has a larger context window than average: 1,000k tokens, against an average of 406k tokens.
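The comparisons above each quote a raw value next to an average, which makes the size of each gap hard to judge at a glance. As a rough illustration, the sketch below converts every pair into a signed percentage difference from the average. The numbers are copied from this page; the dictionary keys and function names are hypothetical labels, not part of any leaderboard API.

```python
# (model value, industry average) pairs copied from the comparisons above.
METRICS = {
    "intelligence": (2.3, 2.9),
    "price_usd_per_1m_tokens": (0.03, 0.75),
    "avg_latency_s": (41.27, 44.25),
    "p99_ttft_s": (76.10, 126.46),
    "context_window_k_tokens": (1000, 406),
}


def percent_vs_average(value: float, average: float) -> float:
    """Signed percentage difference of `value` relative to `average`."""
    return (value - average) / average * 100.0


def summarize(metrics: dict) -> dict:
    """Map each metric name to its percentage difference, rounded to 0.1."""
    return {
        name: round(percent_vs_average(value, average), 1)
        for name, (value, average) in metrics.items()
    }


if __name__ == "__main__":
    for name, pct in summarize(METRICS).items():
        print(f"{name}: {pct:+.1f}% vs. average")
```

Read this way, the price gap is the largest (about 96% below average), the context window is roughly 146% above average, and the average-latency advantage is comparatively small at under 7%. Note that a negative percentage is favorable for price and latency but unfavorable for intelligence.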