Nemotron 3 Super 120B A12B

Nemotron 3 Super is a hybrid Mamba-Transformer model with 120B total parameters, of which 12B are active. It uses LatentMoE and Multi-Token Prediction (MTP) to improve compute efficiency, and it targets complex RAG and IT ticket automation.

Thinking Mode
Parameters: 120B
Context: 262,000 tokens
Released: Nov 3, 2026

Leaderboards

Performance vs. Industry Average

Intelligence

Nemotron 3 Super 120B A12B scores slightly below average on intelligence: 2.7, versus an average of 2.8.

Price

Nemotron 3 Super 120B A12B is cheaper than average, at $0.07 per 1M tokens versus an average of $0.67 per 1M tokens.
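To make the per-token price concrete, the following is a minimal sketch that estimates the cost of processing one full context window at the quoted rates. The function name `estimate_cost_usd` is hypothetical, and the page does not say whether the price applies to input tokens, output tokens, or a blend, so a single flat rate is assumed.

```python
# Prices quoted in the text above, in USD per 1M tokens.
# Assumption: one flat rate for all tokens (the page does not
# distinguish input from output pricing).
MODEL_PRICE_USD_PER_1M = 0.07
AVERAGE_PRICE_USD_PER_1M = 0.67


def estimate_cost_usd(tokens: int, price_per_1m: float = MODEL_PRICE_USD_PER_1M) -> float:
    """Estimate the cost in USD of processing `tokens` tokens at a flat rate."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return tokens / 1_000_000 * price_per_1m


# Cost of one full 262,000-token context at each rate.
full_context_tokens = 262_000
print(f"This model: ${estimate_cost_usd(full_context_tokens):.4f}")
print(f"Average:    ${estimate_cost_usd(full_context_tokens, AVERAGE_PRICE_USD_PER_1M):.4f}")
```

Under the flat-rate assumption, cost scales linearly with token count, so the same function can be reused for a whole batch by passing the total number of tokens.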

Latency

Nemotron 3 Super 120B A12B has higher latency than average: 71.87s on average, versus an industry average of 45.95s.

P99 Latency

Nemotron 3 Super 120B A12B has a higher P99 latency than average: at P99, time to first token (TTFT) is 245.44s, versus an average of 131.50s.

Context Window

Nemotron 3 Super 120B A12B has a smaller context window than average: 262k tokens, versus an average of 401k tokens.
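The comparisons in this section can be put on a common scale by dividing each of the model's figures by the corresponding average. The sketch below does that using only the numbers quoted above; the dictionary keys and the `ratio_to_average` helper are illustrative names, not part of any benchmark API.

```python
# (model value, industry average) pairs, copied from the figures in this section.
METRICS = {
    "intelligence_score": (2.7, 2.8),
    "price_usd_per_1m_tokens": (0.07, 0.67),
    "avg_latency_s": (71.87, 45.95),
    "p99_ttft_s": (245.44, 131.50),
    "context_window_k_tokens": (262, 401),
}


def ratio_to_average(model_value: float, average_value: float) -> float:
    """Return the model's value as a multiple of the average."""
    if average_value == 0:
        raise ValueError("average_value must be non-zero")
    return model_value / average_value


ratios = {name: ratio_to_average(m, avg) for name, (m, avg) in METRICS.items()}

for name, r in ratios.items():
    print(f"{name}: {r:.2f}x the average")
```

A ratio below 1 is favorable for price and latency but unfavorable for intelligence score and context window, so the ratios should be read per metric rather than summed into a single score.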