MODELS

Explore the capabilities, specifications, and prices of all available models.

Alibaba

Qwen 3 14B

Released
Apr 29, 2025
Parameters
14 B
Context
32,768 tokens

Qwen 3 model with 14B parameters offering excellent performance-to-size efficiency

Read more

Qwen 3 30B-A3B

Released
Apr 29, 2025
Parameters
30 B
Context
32,768 tokens

MoE Qwen 3 model with 30B total parameters, activating 3B for efficient inference

Read more
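
A note on reading MoE specs like the two Qwen entries above: total parameters set the memory footprint (every expert must be resident), while active parameters set per-token compute. A minimal back-of-envelope sketch, assuming FP8 (1 byte per parameter) storage and the rough 2-FLOPs-per-active-parameter rule; both are simplifying assumptions, not vendor figures:

```python
# Back-of-envelope sizing for a 30B-total / 3B-active MoE model.
# Assumptions (not vendor figures): 1 byte/param (FP8) for weights,
# ~2 FLOPs per active parameter per generated token.
total_params = 30e9
active_params = 3e9

weight_memory_gb = total_params / 1e9      # all experts stay in memory
flops_per_token = 2 * active_params        # only routed experts compute

print(f"weights: ~{weight_memory_gb:.0f} GB at FP8")
print(f"compute: ~{flops_per_token / 1e9:.0f} GFLOPs/token "
      f"(a dense 30B would need ~{2 * total_params / 1e9:.0f})")
```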

Qwen3 235B A22B Thinking 2507

Thinking Mode
Released
Jul 25, 2025
Parameters
235 B
Context
262,144 tokens

Qwen3 235B Thinking is a MoE model (235B total, 22B active) optimized for complex reasoning. It generates thinking traces for deep problem solving.

Read more

Qwen Plus

Released
Sep 1, 2024
Parameters
N/A
Context
1,000,000 tokens

Qwen API model with 1M token context support for extensive document processing

Read more
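
With a 1M-token window, the practical question is whether a document fits before you send it. A rough pre-flight check; the 4-characters-per-token ratio is a crude heuristic rather than Qwen's actual tokenizer, so use the provider's tokenizer for anything near the limit:

```python
# Crude check that a document fits in a 1M-token context window.
CONTEXT_LIMIT = 1_000_000
CHARS_PER_TOKEN = 4  # heuristic only; real tokenizers vary by language

def fits_in_context(text: str, reserve_for_output: int = 8_192) -> bool:
    estimated_tokens = len(text) / CHARS_PER_TOKEN
    return estimated_tokens + reserve_for_output <= CONTEXT_LIMIT

document = open("report.txt").read()  # hypothetical input file
print("fits" if fits_in_context(document) else "needs chunking")
```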

Qwen3 235B A22B 2507

Released
Jul 21, 2025
Parameters
235 B
Context
262,144 tokens

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass.

Read more

Qwen3 30B A3B Instruct 2507

Released
Jul 29, 2025
Parameters
30.5 B
Context
262,144 tokens

Qwen3 30B Instruct is a MoE model (30.5B total, 3.3B active). It offers strong instruction following and multilingual capabilities.

Read more

Qwen3 Next 80B A3B Thinking

Thinking Mode
Released
Sep 11, 2025
Parameters
80 B
Context
262,144 tokens

Qwen3 Next 80B Thinking is a reasoning-first MoE model (80B total, 3B active). It specializes in hard multi-step problems and agentic planning.

Read more

Amazon

Nova Lite v1

Released
Dec 3, 2024
Parameters
N/A
Context
300,000 tokens

Amazon Nova Lite 1.0 is a low-cost multimodal model with a 300k context. It is optimized for speed and for processing image and video inputs.

Read more

Nova Pro v1

Released
Dec 3, 2024
Parameters
N/A
Context
300,000 tokens

Amazon Nova Pro 1.0 is a balanced multimodal model offering accuracy and speed. It handles extensive context and is suitable for general tasks.

Read more

Nova Premier v1

Released
Mar 31, 2025
Parameters
N/A
Context
1,000,000 tokens

Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models.

Read more

Nova 2 Lite v1

Thinking Mode
Released
Dec 2025
Parameters
N/A
Context
1,000,000 tokens

Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos to generate text.

Read more

Anthropic

Claude 3.5 Haiku

Released
Oct 22, 2024
Parameters
N/A
Context
200,000 tokens

Claude 3.5 Haiku is Anthropic's fastest and most cost-effective model, featuring 200k context. It excels in coding, data extraction, and real-time tasks, matching Claude 3 Opus in many benchmarks.

Read more

Claude Sonnet 4

Released
May 22, 2025
Parameters
N/A
Context
200,000 tokens

Latest generation Sonnet with best-in-class performance for complex agents and coding tasks

Read more

Claude Opus 4.1

Released
Aug 5, 2025
Parameters
N/A
Context
200,000 tokens

Exceptional reasoning model for specialized complex tasks requiring advanced analytical capabilities

Read more

Claude 3.7 Sonnet

Thinking Mode
Released
Feb 24, 2025
Parameters
N/A
Context
200,000 tokens

Hybrid reasoning model with extended thinking mode for complex problem-solving and quick responses

Read more
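
Extended thinking is switched on per request in the Anthropic API via a token budget for the reasoning phase. A minimal sketch with the official Python SDK; the snapshot id and budget values are illustrative, so check Anthropic's docs for current ones:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-7-sonnet-20250219",  # assumed snapshot id
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 8000},  # reasoning budget
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)

# The response interleaves thinking blocks with the final answer.
for block in response.content:
    if block.type == "thinking":
        print("[thinking]", block.thinking[:200], "...")
    elif block.type == "text":
        print(block.text)
```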

Claude 3.5 Haiku (20241022)

Released
Oct 22, 2024
Parameters
N/A
Context
200,000 tokens

Updated Haiku model from October 2024 with enhanced accuracy and performance

Read more

Claude Haiku 4.5

Thinking Mode
Released
Oct 15, 2025
Parameters
N/A
Context
200,000 tokens

Claude Haiku 4.5 is Anthropic's fastest, most efficient model, delivering near-frontier intelligence. It matches Sonnet 4's performance in reasoning and coding, optimized for real-time applications.

Read more

Claude Sonnet 4.5

Thinking Mode
Released
Sep 29, 2025
Parameters
N/A
Context
1,000,000 tokens

Claude Sonnet 4.5 is Anthropic's most advanced model for real-world agents and coding. It features a 1M token context, state-of-the-art coding performance, and enhanced agentic capabilities.

Read more

Claude Opus 4.5

Thinking Mode
Released
Nov 24, 2025
Parameters
N/A
Context
200,000 tokens

Claude Opus 4.5 is Anthropic's frontier reasoning model, optimized for complex software engineering and long-horizon tasks. It supports extended thinking and multimodal capabilities.

Read more

DeepSeek

DeepSeek V3 0324

Released
Mar 24, 2025
Parameters
671 B
Context
128,000 tokens

DeepSeek V3 0324 is a cost-effective MoE model with 671B parameters. It excels in coding and problem-solving, offering a budget-friendly alternative with strong performance.

Read more

DeepSeek R1 0528

Thinking Mode
Released
May 28, 2025
Parameters
671 B
Context
163,840 tokens

DeepSeek R1 0528 is an open-source model with 671B parameters (37B active). It offers performance on par with proprietary reasoning models, featuring fully open reasoning tokens.

Read more
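
Because the reasoning tokens are open, DeepSeek's API returns them as a separate field instead of hiding them. A sketch against the OpenAI-compatible endpoint; the deepseek-reasoner id and reasoning_content field follow DeepSeek's published docs, but verify before relying on them:

```python
from openai import OpenAI

# DeepSeek serves an OpenAI-compatible API; the reasoner endpoint
# returns the chain of thought in `reasoning_content`.
client = OpenAI(api_key="sk-...", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "How many primes are below 100?"}],
)

message = response.choices[0].message
print("reasoning:", message.reasoning_content[:200], "...")  # open thinking trace
print("answer:", message.content)
```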

DeepSeek R1

Thinking Mode
Released
Jan 20, 2025
Parameters
671 B
Context
128,000 tokens

Advanced DeepSeek reasoning model with RL training, comparable to OpenAI o1 in performance

Read more

DeepSeek V3

Released
Dec 26, 2024
Parameters
671 B
Context
128,000 tokens

Efficient MoE model with 671B parameters trained with FP8, achieving strong benchmark results

Read more

DeepSeek V3.2 Exp

Released
Sep 29, 2025
Parameters
685 B
Context
163,840 tokens

DeepSeek V3.2 Exp is an experimental model featuring DeepSeek Sparse Attention (DSA) for high efficiency. It delivers long-context handling up to 163k tokens with reduced inference costs.

Read more

DeepSeek V3.1

Released
Aug 21, 2025
Parameters
671 B
Context
163,840 tokens

DeepSeek-V3.1 is a hybrid reasoning model (671B params, 37B active) supporting thinking and non-thinking modes. It improves on V3 with better tool use, code generation, and reasoning efficiency.

Read more

DeepSeek V3.2

Released
Dec 1, 2025
Parameters
685 B
Context
163,840 tokens

DeepSeek V3.2 is the latest DeepSeek release, featuring DeepSeek Sparse Attention (DSA) for high efficiency. It delivers long-context handling up to 163k tokens with reduced inference costs.

Read more

DeepSeek 3.2 Speciale

Thinking Mode
Released
Dec 1, 2025
Parameters
685 B
Context
163,840 tokens

DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasoning and agentic performance.

Read more

Google

Gemini 2.5 Flash

Thinking Mode
Released
Jun 17, 2025
Parameters
N/A
Context
1,048,576 tokens

Gemini 2.5 Flash is Google's workhorse model for high-frequency tasks. It features a 1M context window, optimized for speed and efficiency in reasoning and multimodal processing.

Read more

Gemini 2.5 Flash-Lite

Thinking Mode
Released
Jul 22, 2025
Parameters
N/A
Context
1,000,000 tokens

Gemini 2.5 Flash-Lite is a lightweight reasoning model optimized for ultra-low latency. It offers a 1M context window and is designed for cost-effective, high-throughput applications.

Read more

Gemini 2.5 Pro

Thinking Mode
Released
Jun 17, 2025
Parameters
N/A
Context
1,000,000 tokens

Gemini 2.5 Pro is Google's best reasoning model, featuring a 1M token context window. It uses a sparse MoE architecture to excel in complex reasoning, coding, and multimodal tasks.

Read more
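
For reference, a minimal call with Google's google-genai Python SDK; the model string is an assumption, and the client reads its API key (e.g. GEMINI_API_KEY) from the environment:

```python
from google import genai

client = genai.Client()  # picks up the API key from the environment

response = client.models.generate_content(
    model="gemini-2.5-pro",  # assumed model id
    contents="Summarize the trade-offs of sparse MoE architectures in two sentences.",
)
print(response.text)
```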

Gemma 3 27B IT

Released
Mar 12, 2025
Parameters
27 B
Context
131,072 tokens

Gemma 3 27B is an open-source multimodal model by Google. It supports vision-language inputs, 128k context, and offers improved math and reasoning capabilities.

Read more

Gemini 2.0 Flash

Released
Dec 11, 2024
Parameters
N/A
Context
1,000,000 tokens

Stable Gemini 2.0 Flash release with consistent performance and behavior

Read more

Gemini 2.5 Pro Preview

Thinking Mode
Released
Mar 25, 2025
Parameters
N/A
Context
1,000,000 tokens

Preview version of Gemini 2.5 Pro with advanced reasoning capabilities released in March 2025

Read more

Gemini 3 Pro Preview

Thinking Mode
Released
Nov 18, 2025
Parameters
N/A
Context
1,048,576 tokens

Gemini 3 Pro Preview is Google's flagship frontier model. It offers high-precision multimodal reasoning across text, audio, video, and code, with a 1M token context.

Read more

Gemini 3 Flash Preview

Thinking Mode
Released
N/A
Parameters
N/A
Context
1,048,576 tokens

Gemini 3 Flash Preview is a high-speed thinking model that delivers near-Pro-level reasoning and tool-use performance with substantially lower latency than larger Gemini variants.

Read more

Meta

Llama 4 Maverick

Released
Apr 5, 2025
Parameters
400 B
Context
1,048,576 tokens

Llama 4 Maverick is a 400B parameter (17B active) MoE model. It supports multimodal inputs and 1M context, optimized for assistant-like behavior.

Read more

Llama 4 Scout 17B 16E Instruct

Released
Apr 1, 2025
Parameters
17 B
Context
128,000 tokens

Llama 4 Scout variant with 17B parameters and mixture-of-experts architecture for efficiency

Read more

Llama 3.3 70B Instruct

Released
Dec 6, 2024
Parameters
70 B
Context
128,000 tokens

Llama 3.3 model with 70B parameters offering improved performance over the 3.1 version

Read more

Llama 4 Maverick 17B Instruct

Released
Apr 1, 2025
Parameters
17 B
Context
128,000 tokens

FP8-quantized 17B Llama 4 Maverick model optimized for deployment efficiency and speed

Read more

Llama 4 Scout

Released
Apr 5, 2025
Parameters
109 B
Context
327,680 tokens

Llama 4 Scout is a 109B parameter (17B active) MoE model. It is designed for efficiency and visual reasoning with a 328k context.

Read more

Microsoft

Phi 4

Thinking Mode
Released
Dec 12, 2024
Parameters
14 B
Context
16,384 tokens

Phi-4 is a 14B parameter model by Microsoft. It excels in complex reasoning and limited memory environments, trained on high-quality synthetic data.

Read more

Phi 3 Mini 128K Instruct

Released
Apr 23, 2024
Parameters
3.8 B
Context
128,000 tokens

Phi-3 Mini is a 3.8B lightweight model with 128k context. It offers state-of-the-art performance for its size, suitable for edge devices.

Read more

Minimax

MiniMax M2

Thinking Mode
Released
Oct 27, 2025
Parameters
230 B
Context
204,800 tokens

MiniMax M2 is a 230B (10B active) MoE model. It is highly efficient, designed for coding and agentic workflows with low latency.

Read more

Mistral AI

Mistral Large 2411

Released
Nov 1, 2024
Parameters
123 B
Context
128,000 tokens

Updated Mistral Large from November 2024 with improved performance and capabilities

Read more

Magistral Small 2506

Released
Jun 10, 2025
Parameters
24 B
Context
40,000 tokens

Magistral Small 2506 is a 24B parameter model by Mistral AI. It is optimized for multilingual reasoning and instruction following.

Read more

Mistral Small 24B Instruct

Released
Jan 1, 2025
Parameters
24 B
Context
32,768 tokens

Compact 24B parameter Mistral model optimized for cost-effective instruction following

Read more

Magistral Medium 2506

Released
N/A
Parameters
N/A
Context
N/A

No description available.

Read more

Mistral Small 3.2 24B Instruct

Released
N/A
Parameters
N/A
Context
N/A

No description available.

Read more

Mistral Large 2512

Thinking Mode
Released
Dec 1, 2025
Parameters
675 B
Context
262,144 tokens

Mistral Large 3 2512 is Mistral's flagship MoE model (675B total, 41B active). It offers top-tier performance in reasoning and coding.

Read more

Ministral 8B 2512

Released
Dec 2025
Parameters
N/A
Context
262,144 tokens

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient small language model with vision capabilities.

Read more

Mistral Medium 3.1

Released
Aug 2025
Parameters
N/A
Context
131,072 tokens

Mistral Medium 3.1 is an updated version of Mistral Medium 3, a high-performance, enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost.

Read more

Moonshot AI

Kimi K2 Instruct

Released
Jul 11, 2025
Parameters
1 T
Context
256,000 tokens

Kimi K2 Instruct is a large open-weight model (1T params, 32B active) by Moonshot AI. It offers strong performance in instruction following and general tasks.

Read more

Kimi K2 0905

Released
N/A
Parameters
N/A
Context
N/A

No description available.

Read more

Kimi K2 Thinking

Thinking Mode
Released
Nov 6, 2025
Parameters
1 T
Context
256,000 tokens

Kimi K2 Thinking is a reasoning variant capable of autonomous long-horizon tasks. It can execute hundreds of sequential tool calls.

Read more
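
"Hundreds of sequential tool calls" in practice means a loop: send the conversation, execute whatever tool calls come back, append the results, and repeat until the model stops asking. A schematic loop against any OpenAI-compatible endpoint; the base URL, model id, and single get_time tool are placeholders:

```python
import json
from datetime import datetime, timezone
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="...")  # placeholder

tools = [{
    "type": "function",
    "function": {
        "name": "get_time",
        "description": "Current UTC time.",
        "parameters": {"type": "object", "properties": {}},
    },
}]

messages = [{"role": "user", "content": "What time is it in UTC?"}]

# Agentic loop: keep going while the model keeps requesting tools.
while True:
    response = client.chat.completions.create(
        model="kimi-k2-thinking",  # assumed model id
        messages=messages,
        tools=tools,
    )
    message = response.choices[0].message
    messages.append(message)
    if not message.tool_calls:
        break
    for call in message.tool_calls:
        result = datetime.now(timezone.utc).isoformat()  # run the (only) tool
        messages.append({"role": "tool", "tool_call_id": call.id,
                         "content": json.dumps(result)})

print(message.content)
```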

Nvidia

Llama 3.1 Nemotron Ultra 253B v1

Released
Apr 7, 2025
Parameters
253 B
Context
128,000 tokens

NVIDIA-tuned 253B Llama 3.1 model optimized for enterprise applications and instruction following

Read more

Llama 3.3 Nemotron Super 49B v1

Released
Mar 18, 2025
Parameters
49 B
Context
128,000 tokens

NVIDIA-optimized 49B Llama 3.3 model providing an excellent performance-to-size ratio

Read more

Llama 3.1 Nemotron 70B Instruct

Released
Nov 1, 2024
Parameters
70 B
Context
128,000 tokens

NVIDIA-tuned 70B Llama 3.1 model with enhanced instruction following and helpfulness

Read more

Llama 3.3 Nemotron Super 49B v1.5

Thinking Mode
Released
N/A
Parameters
49 B
Context
131,072 tokens

Llama 3.3 Nemotron Super 49B is a reasoning model derived from Llama 3.3 70B. It is post-trained for agentic workflows, RAG, and tool calling.

Read more

Nemotron Nano 9B v2

Released
May 9, 2025
Parameters
9 B
Context
131,072 tokens

Nemotron Nano 9B v2 is a compact 9B model by NVIDIA. It is a unified model for reasoning and non-reasoning tasks, trained from scratch.

Read more

Llama 3.1 Nemotron Ultra 253B v1

Thinking Mode
Released
Apr 7, 2025
Parameters
253 B
Context
131,072 tokens

Llama 3.1 Nemotron Ultra 253B is a derivative of Llama 3.1 405B, optimized for reasoning and chat. It offers a balance of accuracy and efficiency.

Read more

Nemotron 3 Nano 30B A3B

Thinking Mode
Released
N/A
Parameters
30 B
Context
256,000 tokens

NVIDIA Nemotron 3 Nano 30B A3B is a small MoE language model offering high compute efficiency and accuracy for building specialized agentic AI systems.

Read more

Olmo

Olmo 3.1 32B Think

Thinking Mode
Released
N/A
Parameters
32 B
Context
65,536 tokens

A large-scale, 32-billion-parameter model designed for deep reasoning, complex multi-step logic, and advanced instruction following.

Read more

OpenAI

GPT-4o Mini

Released
Jul 18, 2024
Parameters
N/A
Context
128,000 tokens

Smaller, faster, and more affordable version of GPT-4o, ideal for high-volume applications requiring good intelligence

Read more

GPT-4.1

Released
Apr 14, 2025
Parameters
N/A
Context
1,047,576 tokens

Enhanced iteration of GPT-4 with improved reasoning, coding, and multimodal capabilities

Read more

o3

Thinking Mode
Released
Dec 20, 2024
Parameters
N/A
Context
200,000 tokens

Most advanced OpenAI reasoning model with multimodal capabilities and agentic tool use for complex analysis

Read more

o4-mini

Thinking Mode
Released
Apr 16, 2025
Parameters
N/A
Context
200,000 tokens

Lightweight reasoning model balancing speed and intelligence for everyday complex tasks

Read more

GPT-5

Thinking Mode
Released
Aug 7, 2025
Parameters
N/A
Context
400,000 tokens

GPT-5 is OpenAI's latest flagship model, designed as an adaptive system. It features dynamic reasoning depth, 400k context, and improvements in accuracy and multimodal integration.

Read more
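
OpenAI exposes the adaptive reasoning depth as a per-request knob on the Responses API. A minimal sketch; the model id and effort level are assumptions to verify against current docs:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    model="gpt-5",                   # assumed model id
    reasoning={"effort": "medium"},  # trade latency for reasoning depth
    input="Plan a 3-step migration from REST to gRPC for a payments service.",
)
print(response.output_text)
```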

GPT-5 Mini

Thinking Mode
Released
Aug 7, 2025
Parameters
N/A
Context
400,000 tokens

GPT-5 Mini is a compact version of GPT-5 for lightweight reasoning. It offers low latency and cost, suitable for high-frequency tasks.

Read more

GPT-5 Nano

Thinking Mode
Released
Aug 7, 2025
Parameters
N/A
Context
400,000 tokens

GPT-5 Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments.

Read more

GPT-OSS 120B

Thinking Mode
Released
Aug 5, 2025
Parameters
117 B
Context
131,072 tokens

GPT-OSS-120B is an open-weight MoE model from OpenAI (117B params, 5.1B active). It is optimized for single-GPU deployment and excels in reasoning and agentic tasks.

Read more
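
Since the weights are open, the checkpoint can be pulled straight from the Hugging Face Hub and run locally. A sketch with transformers; the repo id matches the published release, but hardware fit depends on your GPU and quantization:

```python
from transformers import pipeline

# device_map="auto" spreads layers across available GPUs
# if a single card cannot hold the model.
pipe = pipeline(
    "text-generation",
    model="openai/gpt-oss-120b",
    torch_dtype="auto",
    device_map="auto",
)

output = pipe(
    [{"role": "user", "content": "Explain mixture-of-experts routing briefly."}],
    max_new_tokens=256,
)
print(output[0]["generated_text"][-1]["content"])
```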

GPT-4.1 Mini

Released
Apr 14, 2025
Parameters
N/A
Context
1,047,576 tokens

Compact GPT-4.1 variant optimized for efficiency while maintaining strong performance

Read more

o3-mini

Thinking Mode
Released
Jan 31, 2025
Parameters
N/A
Context
200,000 tokens

January 2025 release of o3-mini with enhanced STEM capabilities and developer features

Read more

o4-mini

Thinking Mode
Released
Apr 16, 2025
Parameters
N/A
Context
200,000 tokens

April 2025 o4-mini release with improved reasoning efficiency and balanced performance

Read more

GPT-5.1

Thinking Mode
Released
Nov 12, 2025
Parameters
N/A
Context
400,000 tokens

GPT-5.1 offers stronger general-purpose reasoning and instruction adherence than GPT-5. It features adaptive computation and a natural conversational style.

Read more

GPT-5.2 Pro

Thinking Mode
Released
Dec 10, 2025
Parameters
N/A
Context
400,000 tokens

GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and long context performance over GPT-5 Pro. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy in high-stakes use cases.

Read more

GPT-5.2

Thinking Mode
Released
Dec 10, 2025
Parameters
N/A
Context
400,000 tokens

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long-context performance compared to GPT-5.1. It uses adaptive reasoning to allocate computation dynamically.

Read more

GPT-OSS 20B

Thinking Mode
Released
Aug 5, 2025
Parameters
19.5 B
Context
131,072 tokens

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimized for lower-latency inference and deployability on consumer or single-GPU hardware.

Read more

xAI

Grok 3 Mini

Thinking Mode
Released
Oct 6, 2025
Parameters
N/A
Context
131,072 tokens

Grok 3 Mini is a lightweight, fast reasoning model from xAI. It is designed for logic-based tasks and offers accessible thinking traces.

Read more

Grok 4

Thinking Mode
Released
Jul 9, 2025
Parameters
314 B
Context
256,000 tokens

Grok 4 is xAI's general-purpose reasoning model with 314B parameters (MoE). It features real-time data integration and strong performance in general tasks.

Read more

Grok-2

Released
Dec 12, 2024
Parameters
N/A
Context
131,072 tokens

Grok 2 version from December 2024 with incremental improvements and optimizations

Read more

Grok-3 Beta

Thinking Mode
Released
Mar 1, 2025
Parameters
N/A
Context
131,072 tokens

Beta version of Grok 3 with extended reasoning for complex problem-solving tasks

Read more

Grok 4.1 Fast

Thinking Mode
Released
Invalid Date
Parameters
N/A
Context
2,000,000 tokens

Grok 4.1 Fast is an agentic tool-calling model with a 2M context window. It is optimized for customer support, deep research, and real-world workflows.

Read more
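
xAI's API is OpenAI-compatible, so the 2M window is used through the standard client pointed at a different base URL. A sketch with an assumed model id:

```python
from openai import OpenAI

client = OpenAI(base_url="https://api.x.ai/v1", api_key="xai-...")  # xAI endpoint

response = client.chat.completions.create(
    model="grok-4.1-fast",  # assumed model id
    messages=[
        {"role": "system", "content": "You are a support agent."},
        {"role": "user", "content": "Summarize this ticket history: ..."},
    ],
)
print(response.choices[0].message.content)
```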

Grok 4.1 Fast Thinking

Thinking Mode
Released
Invalid Date
Parameters
N/A
Context
2,000,000 tokens

Grok 4.1 Fast Thinking is the reasoning-enabled variant of Grok 4.1 Fast. It provides extended thought processes for complex problem-solving within a 2M context.

Read more

Zhipu AI

GLM 4.5

Thinking Mode
Released
Jul 28, 2025
Parameters
355 B
Context
128,000 tokens

GLM-4.5 is an open-weight model with 355B parameters (32B active). It uses a MoE architecture to deliver state-of-the-art performance in reasoning, coding, and agentic tasks.

Read more

GLM 4.5 Air

Thinking Mode
Released
Jul 28, 2025
Parameters
106 B
Context
128,000 tokens

GLM-4.5 Air is an efficient MoE model with 106B parameters (12B active). It is optimized for agentic applications, tool use, and speed.

Read more

GLM 4.6

Thinking Mode
Released
Sep 30, 2025
Parameters
355 B
Context
202,752 tokens

GLM-4.6 improves on GLM-4.5 with a longer 200k context window and stronger coding and agentic performance, built on the same 355B-parameter (32B active) MoE architecture.

Read more