
DeepSeek V3

An efficient Mixture-of-Experts (MoE) model with 671B total parameters, trained with FP8 mixed precision, that achieves strong benchmark results

Parameters: 671B
Context: 128,000 tokens
Released: Dec 26, 2024

Leaderboards

Performance vs. Industry Average

Context Window

DeepSeek V3 has a context window of 128k tokens, smaller than the industry average of 246k tokens.
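
To put the comparison in proportion, the two figures quoted above can be turned into a ratio. The snippet below is a minimal sketch, not part of the leaderboard itself: the function name `context_ratio` and the constant names are hypothetical, and the inputs are the rounded "k" values from this page, so the result is approximate.

```python
def context_ratio(model_tokens: int, average_tokens: int) -> float:
    """Return the model's context window as a fraction of the average."""
    if average_tokens <= 0:
        raise ValueError("average_tokens must be positive")
    return model_tokens / average_tokens


# Rounded figures quoted on this page (assumed to mean thousands of tokens).
DEEPSEEK_V3_CONTEXT = 128_000
INDUSTRY_AVERAGE_CONTEXT = 246_000

ratio = context_ratio(DEEPSEEK_V3_CONTEXT, INDUSTRY_AVERAGE_CONTEXT)
print(f"DeepSeek V3 context window is {ratio:.0%} of the average")
```

With these inputs the context window comes out at roughly half of the quoted average.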

DeepSeek V3 - AutoBench