Llama 4 Maverick 17B Instruct
An FP8-quantized Llama 4 Maverick 17B model optimized for deployment efficiency and speed.
Parameters: 170 B
Context: 128,000 tokens
Released: Apr 1, 2025
Leaderboards
QUALITY
Average score combining domain-specific Autobench scores; higher is better
[Leaderboard chart: 26 models, scores ranging from 3.49 to 4.51]
PRICE
USD cents per average answer; lower is better
[Leaderboard chart: 26 models, prices ranging from 0.02 to 9.13 cents per answer]
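The price metric is quoted in USD cents per average answer, which takes a small conversion to turn into a budget figure. A minimal sketch of that conversion (the volume and price figures below are hypothetical examples, not measurements from this page):

```python
def monthly_cost_usd(cents_per_answer: float,
                     answers_per_day: int,
                     days: int = 30) -> float:
    """Convert a per-answer price in USD cents into an estimated
    monthly spend in USD for a given daily answer volume."""
    return cents_per_answer * answers_per_day * days / 100.0

# Hypothetical example: 0.14 cents/answer at 10,000 answers/day
print(round(monthly_cost_usd(0.14, 10_000), 2))  # 420.0
```

Because the metric is an average per answer, real spend will vary with prompt and answer length; treat the result as a rough planning number.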
LATENCY
Average latency in seconds; lower is better
[Leaderboard chart: 29 models, average latencies ranging from 5.29 s to 119.17 s]
Performance vs. Industry Average
Context Window
Llama 4 Maverick 17B Instruct has a context window of 128k tokens, smaller than the industry average of 246k tokens.
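A 128k-token window still needs room for the model's output, so a quick pre-flight check is useful before sending long prompts. A minimal sketch, assuming the common ~4-characters-per-token heuristic for English text (an approximation, not this model's actual tokenizer; use the real tokenizer for exact counts):

```python
# 128k-token context window, as listed for Llama 4 Maverick 17B Instruct.
CONTEXT_WINDOW = 128_000

def fits_context(text: str, reserved_for_output: int = 4_096) -> bool:
    """Rough check that a prompt plus reserved output budget fits the
    context window, using a ~4 chars/token heuristic for English text."""
    est_tokens = len(text) / 4  # heuristic estimate, not a real token count
    return est_tokens + reserved_for_output <= CONTEXT_WINDOW

print(fits_context("hello world " * 1_000))   # short prompt fits: True
print(fits_context("x" * 1_000_000))          # ~250k tokens does not: False
```

The `reserved_for_output` budget is a hypothetical default; size it to the longest answer you expect, since prompt and completion share the same window.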