Llama 4 Maverick
Llama 4 Maverick is Meta's natively multimodal mixture-of-experts (MoE) model with 400B total parameters, 17B of them active. It uses early fusion of text and vision tokens and was codistilled using online RL to handle complex visual-reasoning tasks.
Leaderboards
Average score combining domain-specific Autobench scores; higher is better.
Performance vs. Industry Average
Intelligence
Llama 4 Maverick scores below average on intelligence, with a score of 2.3 against an average of 2.9.
Price
Llama 4 Maverick is cheaper than average, at $0.03 per 1M tokens against an average of $0.75 per 1M tokens.
Latency
Llama 4 Maverick has lower latency than average, with an average latency of 41.27s against an industry average of 44.25s.
P99 Latency
Llama 4 Maverick has a lower P99 latency than average: at P99, its time to first token (TTFT) is 76.10s, against an average of 126.46s.
Context Window
Llama 4 Maverick has a larger context window than average, at 1000k (1M) tokens against an average of 406k tokens.
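
The comparisons above are easier to weigh against each other as relative differences. The following is a minimal sketch that computes the signed percent difference from the industry average for each metric, using only the figures quoted on this page. The dictionary keys and function names are illustrative and not part of any Autobench API.

```python
# Figures copied from the comparisons above: (model value, industry average).
# The metric names are illustrative labels, not official identifiers.
METRICS = {
    "intelligence": (2.3, 2.9),
    "price_usd_per_1m_tokens": (0.03, 0.75),
    "avg_latency_s": (41.27, 44.25),
    "p99_ttft_s": (76.10, 126.46),
    "context_window_k_tokens": (1000, 406),
}


def percent_vs_average(value, average):
    """Return the signed percent difference of value relative to average."""
    return (value - average) / average * 100.0


def summarize(metrics=METRICS):
    """Map each metric name to its percent difference, rounded to one decimal."""
    return {
        name: round(percent_vs_average(value, average), 1)
        for name, (value, average) in metrics.items()
    }


if __name__ == "__main__":
    for name, pct in summarize().items():
        print(f"{name}: {pct:+.1f}% vs. average")
```

A negative result is favorable for price and latency, but unfavorable for intelligence and context window, so the sign has to be read per metric.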