GLM 4.6

GLM-4.6 is an open-weight model with 355B total parameters, of which 32B are active per token. It uses a Mixture-of-Experts (MoE) architecture to deliver state-of-the-art performance in reasoning, coding, and multimodal tasks.

Thinking Mode
Parameters
355B
Context
202,752 tokens

Leaderboards

Performance vs. Industry Average

Intelligence

GLM 4.6 has an intelligence score of 4.1, which the leaderboard rates as higher than average. The average is also shown as 4.1, so the gap is smaller than the displayed precision.

Price

GLM 4.6 is cheaper than average ($4.58 per 1M tokens), at $1.25 per 1M tokens.

Latency

GLM 4.6 has a higher mean latency than average (116.45s), at 187.43s.

P99 Latency

GLM 4.6 has a higher P99 latency than average (339.37s): at the 99th percentile, time to first token (TTFT) is 630.49s.

Context Window

GLM 4.6 has a smaller context window than average (351k tokens), at 203k tokens.
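
To put the comparisons above on one scale, the short sketch below divides each GLM 4.6 figure by the corresponding average. The numbers are copied from this page; the dictionary keys and the helper name `ratio_to_average` are illustrative choices, not part of any AutoBench API. Intelligence is left out because both values are displayed as 4.1.

```python
# Figures copied from the comparisons above: (GLM 4.6 value, industry average).
# Key names are illustrative, not taken from any AutoBench schema.
metrics = {
    "price_usd_per_1m_tokens": (1.25, 4.58),
    "avg_latency_s": (187.43, 116.45),
    "p99_ttft_s": (630.49, 339.37),
    "context_window_k_tokens": (203, 351),
}


def ratio_to_average(value: float, average: float) -> float:
    """Return value / average; 1.0 means exactly at the average."""
    return value / average


ratios = {name: ratio_to_average(v, avg) for name, (v, avg) in metrics.items()}

for name, r in ratios.items():
    print(f"{name}: {r:.2f}x the average")
```

On these figures, price comes out at roughly 0.27x the average, mean latency at about 1.61x, P99 TTFT at about 1.86x, and context window at about 0.58x.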

GLM 4.6 - AutoBench