MODELS

Explore the capabilities, specifications, and prices of all available models.

Alibaba

Qwen 3 14B

Released
Apr 29, 2025
Parameters
14 B
Context
32,768 tokens

Qwen 3 model with 14B parameters offering excellent performance-to-size efficiency

Read more

Qwen 3 30B-A3B

Released
Apr 29, 2025
Parameters
30 B
Context
32,768 tokens

MoE Qwen 3 model with 30B total parameters, activating 3B for efficient inference

Read more
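
A note on reading MoE specs like the two Qwen entries above: total parameters set the memory footprint (every expert must be resident), while active parameters set per-token compute. A minimal back-of-envelope sketch, assuming FP8 (1 byte per parameter) storage and the rough 2-FLOPs-per-active-parameter rule; both are simplifying assumptions, not vendor figures:

```python
# Back-of-envelope sizing for a 30B-total / 3B-active MoE model.
# Assumptions (not vendor figures): 1 byte/param (FP8) for weights,
# ~2 FLOPs per active parameter per generated token.
total_params = 30e9
active_params = 3e9

weight_memory_gb = total_params / 1e9      # all experts stay in memory
flops_per_token = 2 * active_params        # only routed experts compute

print(f"weights: ~{weight_memory_gb:.0f} GB at FP8")
print(f"compute: ~{flops_per_token / 1e9:.0f} GFLOPs/token "
      f"(a dense 30B would need ~{2 * total_params / 1e9:.0f})")
```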

Qwen3 235B A22B Thinking 2507

Thinking Mode
Released
Jul 25, 2025
Parameters
235 B
Context
262,144 tokens

Qwen3 235B Thinking is a MoE model (235B total, 22B active) optimized for complex reasoning. It generates thinking traces for deep problem solving.

Read more

Qwen Plus

Released
Sep 1, 2024
Parameters
N/A
Context
1,000,000 tokens

Qwen API model with 1M token context support for extensive document processing

Read more
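
With a 1M-token window, the practical question is whether a document fits before you send it. A rough pre-flight check; the 4-characters-per-token ratio is a crude heuristic rather than Qwen's actual tokenizer, so use the provider's tokenizer for anything near the limit:

```python
# Crude check that a document fits in a 1M-token context window.
CONTEXT_LIMIT = 1_000_000
CHARS_PER_TOKEN = 4  # heuristic only; real tokenizers vary by language

def fits_in_context(text: str, reserve_for_output: int = 8_192) -> bool:
    estimated_tokens = len(text) / CHARS_PER_TOKEN
    return estimated_tokens + reserve_for_output <= CONTEXT_LIMIT

document = open("report.txt").read()  # hypothetical input file
print("fits" if fits_in_context(document) else "needs chunking")
```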

Qwen3 235B A22B 2507

Released
Jul 21, 2025
Parameters
235 B
Context
262,144 tokens

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass.

Read more

Qwen3 30B A3B Instruct 2507

Released
Jul 29, 2025
Parameters
30.5 B
Context
262,144 tokens

Qwen3 30B Instruct is a MoE model (30.5B total, 3.3B active). It offers strong instruction following and multilingual capabilities.

Read more

Qwen3 Next 80B A3B Thinking

Thinking Mode
Released
Sep 11, 2025
Parameters
80 B
Context
262,144 tokens

Qwen3 Next 80B Thinking is a reasoning-first MoE model (80B total, 3B active). It specializes in hard multi-step problems and agentic planning.

Read more

Amazon

Nova Lite v1

Released
Dec 3, 2024
Parameters
N/A
Context
300,000 tokens

Amazon Nova Lite 1.0 is a low-cost multimodal model with a 300k context. It is optimized for speed and for processing image and video inputs.

Read more

Nova Pro v1

Released
Dec 3, 2024
Parameters
N/A
Context
300,000 tokens

Amazon Nova Pro 1.0 is a balanced multimodal model offering accuracy and speed. It handles extensive context and is suitable for general tasks.

Read more

Nova Premier v1

Released
Mar 31, 2025
Parameters
N/A
Context
1,000,000 tokens

Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models.

Read more

Nova 2 Lite v1

Thinking Mode
Released
Dec 2025
Parameters
N/A
Context
1,000,000 tokens

Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos to generate text.

Read more

Anthropic

Claude 3.5 Haiku

Released
Oct 22, 2024
Parameters
N/A
Context
200,000 tokens

Claude 3.5 Haiku is Anthropic's fastest and most cost-effective model, featuring 200k context. It excels in coding, data extraction, and real-time tasks, matching Claude 3 Opus in many benchmarks.

Read more

Claude Sonnet 4

Released
May 22, 2025
Parameters
N/A
Context
200,000 tokens

Latest generation Sonnet with best-in-class performance for complex agents and coding tasks

Read more

Claude Opus 4.1

Released
Aug 5, 2025
Parameters
N/A
Context
200,000 tokens

Exceptional reasoning model for specialized complex tasks requiring advanced analytical capabilities

Read more

Claude 3.7 Sonnet

Thinking Mode
Released
Feb 24, 2025
Parameters
N/A
Context
200,000 tokens

Hybrid reasoning model with extended thinking mode for complex problem-solving and quick responses

Read more
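
Extended thinking is switched on per request in the Anthropic API via a token budget for the reasoning phase. A minimal sketch with the official Python SDK; the snapshot id and budget values are illustrative, so check Anthropic's docs for current ones:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-7-sonnet-20250219",  # assumed snapshot id
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 8000},  # reasoning budget
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)

# The response interleaves thinking blocks with the final answer.
for block in response.content:
    if block.type == "thinking":
        print("[thinking]", block.thinking[:200], "...")
    elif block.type == "text":
        print(block.text)
```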

Claude 3.5 Haiku (20241022)

Released
Oct 22, 2024
Parameters
N/A
Context
200,000 tokens

Updated Haiku model from October 2024 with enhanced accuracy and performance

Read more

Claude Haiku 4.5

Thinking Mode
Released
Oct 15, 2025
Parameters
N/A
Context
200,000 tokens

Claude Haiku 4.5 is Anthropic's fastest, most efficient model, delivering near-frontier intelligence. It matches Sonnet 4's performance in reasoning and coding, optimized for real-time applications.

Read more

Claude Sonnet 4.5

Thinking Mode
Released
Sep 29, 2025
Parameters
N/A
Context
1,000,000 tokens

Claude Sonnet 4.5 is Anthropic's most advanced model for real-world agents and coding. It features a 1M token context, state-of-the-art coding performance, and enhanced agentic capabilities.

Read more

Claude Opus 4.5

Thinking Mode
Released
Nov 24, 2025
Parameters
N/A
Context
200,000 tokens

Claude Opus 4.5 is Anthropic's frontier reasoning model, optimized for complex software engineering and long-horizon tasks. It supports extended thinking and multimodal capabilities.

Read more

DeepSeek

DeepSeek V3 0324

Released
Mar 24, 2025
Parameters
671 B
Context
128,000 tokens

DeepSeek V3 0324 is a cost-effective MoE model with 671B parameters. It excels in coding and problem-solving, offering a budget-friendly alternative with strong performance.

Read more

DeepSeek R1 0528

Thinking Mode
Released
May 28, 2025
Parameters
671 B
Context
163,840 tokens

DeepSeek R1 0528 is an open-source model with 671B parameters (37B active). It offers performance on par with proprietary reasoning models, featuring fully open reasoning tokens.

Read more
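
Because the reasoning tokens are open, DeepSeek's API returns them as a separate field instead of hiding them. A sketch against the OpenAI-compatible endpoint; the deepseek-reasoner id and reasoning_content field follow DeepSeek's published docs, but verify before relying on them:

```python
from openai import OpenAI

# DeepSeek serves an OpenAI-compatible API; the reasoner endpoint
# returns the chain of thought in `reasoning_content`.
client = OpenAI(api_key="sk-...", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "How many primes are below 100?"}],
)

message = response.choices[0].message
print("reasoning:", message.reasoning_content[:200], "...")  # open thinking trace
print("answer:", message.content)
```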

DeepSeek R1

Thinking Mode
Released
Jan 20, 2025
Parameters
671 B
Context
128,000 tokens

Advanced DeepSeek reasoning model with RL training, comparable to OpenAI o1 in performance

Read more

DeepSeek V3

Released
Dec 26, 2024
Parameters
671 B
Context
128,000 tokens

Efficient MoE model with 671B parameters trained with FP8, achieving strong benchmark results

Read more

DeepSeek V3.2 Exp

Released
Sep 29, 2025
Parameters
685 B
Context
163,840 tokens

DeepSeek V3.2 Exp is an experimental model featuring DeepSeek Sparse Attention (DSA) for high efficiency. It delivers long-context handling up to 163k tokens with reduced inference costs.

Read more

DeepSeek V3.1

Released
Aug 21, 2025
Parameters
671 B
Context
163,840 tokens

DeepSeek-V3.1 is a hybrid reasoning model (671B params, 37B active) supporting thinking and non-thinking modes. It improves on V3 with better tool use, code generation, and reasoning efficiency.

Read more

DeepSeek V3.2

Released
Dec 1, 2025
Parameters
685 B
Context
163,840 tokens

DeepSeek V3.2 is the latest DeepSeek release, featuring DeepSeek Sparse Attention (DSA) for high efficiency. It delivers long-context handling up to 163k tokens with reduced inference costs.

Read more

DeepSeek 3.2 Speciale

Thinking Mode
Released
Dec 1, 2025
Parameters
685 B
Context
163,840 tokens

DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasoning and agentic performance.

Read more

Google

Gemini 2.5 Flash

Thinking Mode
Released
Jun 17, 2025
Parameters
N/A
Context
1,048,576 tokens

Gemini 2.5 Flash is Google's workhorse model for high-frequency tasks. It features a 1M context window, optimized for speed and efficiency in reasoning and multimodal processing.

Read more

Gemini 2.5 Flash-Lite

Thinking Mode
Released
Jul 22, 2025
Parameters
N/A
Context
1,000,000 tokens

Gemini 2.5 Flash-Lite is a lightweight reasoning model optimized for ultra-low latency. It offers a 1M context window and is designed for cost-effective, high-throughput applications.

Read more

Gemini 2.5 Pro

Thinking Mode
Released
Jun 17, 2025
Parameters
N/A
Context
1,000,000 tokens

Gemini 2.5 Pro is Google's best reasoning model, featuring a 1M token context window. It uses a sparse MoE architecture to excel in complex reasoning, coding, and multimodal tasks.

Read more
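
For reference, a minimal call with Google's google-genai Python SDK; the model string is an assumption, and the client reads its API key (e.g. GEMINI_API_KEY) from the environment:

```python
from google import genai

client = genai.Client()  # picks up the API key from the environment

response = client.models.generate_content(
    model="gemini-2.5-pro",  # assumed model id
    contents="Summarize the trade-offs of sparse MoE architectures in two sentences.",
)
print(response.text)
```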

Gemma 3 27B IT

Released
Mar 12, 2025
Parameters
27 B
Context
131,072 tokens

Gemma 3 27B is an open-source multimodal model by Google. It supports vision-language inputs, 128k context, and offers improved math and reasoning capabilities.

Read more

Gemini 2.0 Flash

Released
Dec 11, 2024
Parameters
N/A
Context
1,000,000 tokens

Stable Gemini 2.0 Flash release with consistent performance and behavior

Read more

Gemini 2.5 Pro Preview

Thinking Mode
Released
Mar 25, 2025
Parameters
N/A
Context
1,000,000 tokens

Preview version of Gemini 2.5 Pro with advanced reasoning capabilities released in March 2025

Read more

Gemini 3 Pro Preview

Thinking Mode
Released
Nov 18, 2025
Parameters
N/A
Context
1,048,576 tokens

Gemini 3 Pro Preview is Google's flagship frontier model. It offers high-precision multimodal reasoning across text, audio, video, and code, with a 1M token context.

Read more

Gemini 3 Flash Preview

Thinking Mode
Released
N/A
Parameters
N/A
Context
1,048,576 tokens

Gemini 3 Flash Preview is a high-speed thinking model that delivers near-Pro-level reasoning and tool-use performance with substantially lower latency than larger Gemini variants.

Read more

Meta

Llama 4 Maverick

Released
Apr 5, 2025
Parameters
400 B
Context
1,048,576 tokens

Llama 4 Maverick is a 400B parameter (17B active) MoE model. It supports multimodal inputs and 1M context, optimized for assistant-like behavior.

Read more

Llama 4 Scout 17B 16E Instruct

Released
Apr 1, 2025
Parameters
17 B
Context
128,000 tokens

Llama 4 Scout variant with 17B parameters and mixture-of-experts architecture for efficiency

Read more

Llama 3.3 70B Instruct

Released
Dec 6, 2024
Parameters
70 B
Context
128,000 tokens

Llama 3.3 model with 70B parameters offering improved performance over the 3.1 version

Read more

Llama 4 Maverick 17B Instruct

Released
Apr 1, 2025
Parameters
17 B
Context
128,000 tokens

FP8-quantized 17B Llama 4 Maverick model optimized for deployment efficiency and speed

Read more

Llama 4 Scout

Released
Apr 5, 2025
Parameters
109 B
Context
327,680 tokens

Llama 4 Scout is a 109B parameter (17B active) MoE model. It is designed for efficiency and visual reasoning with a 328k context.

Read more

Microsoft

Phi 4

Thinking Mode
Released
Dec 12, 2024
Parameters
14 B
Context
16,384 tokens

Phi-4 is a 14B parameter model by Microsoft. It excels in complex reasoning and limited memory environments, trained on high-quality synthetic data.

Read more

Phi 3 Mini 128K Instruct

Released
Apr 23, 2024
Parameters
3.8 B
Context
128,000 tokens

Phi-3 Mini is a 3.8B lightweight model with 128k context. It offers state-of-the-art performance for its size, suitable for edge devices.

Read more

Minimax

MiniMax M2

Thinking Mode
Released
Oct 27, 2025
Parameters
230 B
Context
204,800 tokens

MiniMax M2 is a 230B (10B active) MoE model. It is highly efficient, designed for coding and agentic workflows with low latency.

Read more

Mistral AI

Mistral Large 2411

Released
Nov 1, 2024
Parameters
123 B
Context
128,000 tokens

Updated Mistral Large from November 2024 with improved performance and capabilities

Read more

Magistral Small 2506

Released
Jun 10, 2025
Parameters
24 B
Context
40,000 tokens

Magistral Small 2506 is a 24B parameter model by Mistral AI. It is optimized for multilingual reasoning and instruction following.

Read more

Mistral Small 24B Instruct

Released
Jan 1, 2025
Parameters
24 B
Context
32,768 tokens

Compact 24B parameter Mistral model optimized for cost-effective instruction following

Read more

Magistral Medium 2506

Released
N/A
Parameters
N/A
Context
N/A

No description available.

Read more

Mistral Small 3.2 24B Instruct

Released
N/A
Parameters
N/A
Context
N/A

No description available.

Read more

Mistral Large 2512

Thinking Mode
Released
Dec 1, 2025
Parameters
675 B
Context
262,144 tokens

Mistral Large 3 2512 is Mistral's flagship MoE model (675B total, 41B active). It offers top-tier performance in reasoning and coding.

Read more

Ministral 8B 2512

Released
Dec 2025
Parameters
N/A
Context
262,144 tokens

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient small language model with vision capabilities.

Read more

Mistral Medium 3.1

Released
Aug 2025
Parameters
N/A
Context
131,072 tokens

Mistral Medium 3.1 is an updated version of Mistral Medium 3, a high-performance, enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost.

Read more

Moonshot AI

Kimi K2 Instruct

Released
Jul 11, 2025
Parameters
1 T
Context
256,000 tokens

Kimi K2 Instruct is a large open-weight model (1T params, 32B active) by Moonshot AI. It offers strong performance in instruction following and general tasks.

Read more

Kimi K2 0905

Released
N/A
Parameters
N/A
Context
N/A

No description available.

Read more

Kimi K2 Thinking

Thinking Mode
Released
Nov 6, 2025
Parameters
1 T
Context
256,000 tokens

Kimi K2 Thinking is a reasoning variant capable of autonomous long-horizon tasks. It can execute hundreds of sequential tool calls.

Read more
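
"Hundreds of sequential tool calls" in practice means a loop: send the conversation, execute whatever tool calls come back, append the results, and repeat until the model stops asking. A schematic loop against any OpenAI-compatible endpoint; the base URL, model id, and single get_time tool are placeholders:

```python
import json
from datetime import datetime, timezone
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="...")  # placeholder

tools = [{
    "type": "function",
    "function": {
        "name": "get_time",
        "description": "Current UTC time.",
        "parameters": {"type": "object", "properties": {}},
    },
}]

messages = [{"role": "user", "content": "What time is it in UTC?"}]

# Agentic loop: keep going while the model keeps requesting tools.
while True:
    response = client.chat.completions.create(
        model="kimi-k2-thinking",  # assumed model id
        messages=messages,
        tools=tools,
    )
    message = response.choices[0].message
    messages.append(message)
    if not message.tool_calls:
        break
    for call in message.tool_calls:
        result = datetime.now(timezone.utc).isoformat()  # run the (only) tool
        messages.append({"role": "tool", "tool_call_id": call.id,
                         "content": json.dumps(result)})

print(message.content)
```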

Nvidia

Llama 3.1 Nemotron Ultra 253B v1

Released
Apr 7, 2025
Parameters
253 B
Context
128,000 tokens

NVIDIA-tuned 253B Llama 3.1 model optimized for enterprise applications and instruction following

Read more

Llama 3.3 Nemotron Super 49B v1

Released
Mar 18, 2025
Parameters
49 B
Context
128,000 tokens

NVIDIA-optimized 49B Llama 3.3 model providing an excellent performance-to-size ratio

Read more

Llama 3.1 Nemotron 70B Instruct

Released
Nov 1, 2024
Parameters
70 B
Context
128,000 tokens

NVIDIA-tuned 70B Llama 3.1 model with enhanced instruction following and helpfulness

Read more

Llama 3.3 Nemotron Super 49B v1.5

Thinking Mode
Released
N/A
Parameters
49 B
Context
131,072 tokens

Llama 3.3 Nemotron Super 49B is a reasoning model derived from Llama 3.3 70B. It is post-trained for agentic workflows, RAG, and tool calling.

Read more

Nemotron Nano 9B v2

Released
May 9, 2025
Parameters
9 B
Context
131,072 tokens

Nemotron Nano 9B v2 is a compact 9B model by NVIDIA. It is a unified model for reasoning and non-reasoning tasks, trained from scratch.

Read more

Llama 3.1 Nemotron Ultra 253B v1

Thinking Mode
Released
Apr 7, 2025
Parameters
253 B
Context
131,072 tokens

Llama 3.1 Nemotron Ultra 253B is a derivative of Llama 3.1 405B, optimized for reasoning and chat. It offers a balance of accuracy and efficiency.

Read more

Nemotron 3 Nano 30B A3B

Thinking Mode
Released
N/A
Parameters
30 B
Context
256,000 tokens

NVIDIA Nemotron 3 Nano 30B A3B is a small MoE language model offering high compute efficiency and accuracy for building specialized agentic AI systems.

Read more

Olmo

Olmo 3.1 32B Think

Thinking Mode
Released
N/A
Parameters
32 B
Context
65,536 tokens

A large-scale, 32-billion-parameter model designed for deep reasoning, complex multi-step logic, and advanced instruction following.

Read more

OpenAI

GPT-4o Mini

Released
Jul 18, 2024
Parameters
N/A
Context
128,000 tokens

Smaller, faster, and more affordable version of GPT-4o, ideal for high-volume applications requiring good intelligence

Read more

GPT-4.1

Released
Apr 14, 2025
Parameters
N/A
Context
1,047,576 tokens

Enhanced iteration of GPT-4 with improved reasoning, coding, and multimodal capabilities

Read more

o3

Thinking Mode
Released
Dec 20, 2024
Parameters
N/A
Context
200,000 tokens

Most advanced OpenAI reasoning model with multimodal capabilities and agentic tool use for complex analysis

Read more

o4-mini

Thinking Mode
Released
Apr 16, 2025
Parameters
N/A
Context
200,000 tokens

Lightweight reasoning model balancing speed and intelligence for everyday complex tasks

Read more

GPT-5

Thinking Mode
Released
Aug 7, 2025
Parameters
N/A
Context
400,000 tokens

GPT-5 is OpenAI's latest flagship model, designed as an adaptive system. It features dynamic reasoning depth, 400k context, and improvements in accuracy and multimodal integration.

Read more
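
OpenAI exposes the adaptive reasoning depth as a per-request knob on the Responses API. A minimal sketch; the model id and effort level are assumptions to verify against current docs:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    model="gpt-5",                   # assumed model id
    reasoning={"effort": "medium"},  # trade latency for reasoning depth
    input="Plan a 3-step migration from REST to gRPC for a payments service.",
)
print(response.output_text)
```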

GPT-5 Mini

Thinking Mode
Released
Aug 7, 2025
Parameters
N/A
Context
400,000 tokens

GPT-5 Mini is a compact version of GPT-5 for lightweight reasoning. It offers low latency and cost, suitable for high-frequency tasks.

Read more

GPT-5 Nano

Thinking Mode
Released
Aug 7, 2025
Parameters
N/A
Context
400,000 tokens

GPT-5 Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments.

Read more

GPT-OSS 120B

Thinking Mode
Released
Aug 5, 2025
Parameters
117 B
Context
131,072 tokens

GPT-OSS-120B is an open-weight MoE model from OpenAI (117B params, 5.1B active). It is optimized for single-GPU deployment and excels in reasoning and agentic tasks.

Read more
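
Since the weights are open, the checkpoint can be pulled straight from the Hugging Face Hub and run locally. A sketch with transformers; the repo id matches the published release, but hardware fit depends on your GPU and quantization:

```python
from transformers import pipeline

# device_map="auto" spreads layers across available GPUs
# if a single card cannot hold the model.
pipe = pipeline(
    "text-generation",
    model="openai/gpt-oss-120b",
    torch_dtype="auto",
    device_map="auto",
)

output = pipe(
    [{"role": "user", "content": "Explain mixture-of-experts routing briefly."}],
    max_new_tokens=256,
)
print(output[0]["generated_text"][-1]["content"])
```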

GPT-4.1 Mini

Released
Apr 14, 2025
Parameters
N/A
Context
1,047,576 tokens

Compact GPT-4.1 variant optimized for efficiency while maintaining strong performance

Read more

o3-mini

Thinking Mode
Released
Jan 31, 2025
Parameters
N/A
Context
200,000 tokens

January 2025 release of o3-mini with enhanced STEM capabilities and developer features

Read more

o4-mini

Thinking Mode
Released
Apr 16, 2025
Parameters
N/A
Context
200,000 tokens

April 2025 o4-mini release with improved reasoning efficiency and balanced performance

Read more

GPT-5.1

Thinking Mode
Released
Nov 12, 2025
Parameters
N/A
Context
400,000 tokens

GPT-5.1 offers stronger general-purpose reasoning and instruction adherence than GPT-5. It features adaptive computation and a natural conversational style.

Read more

GPT-5.2 Pro

Thinking Mode
Released
Dec 10, 2025
Parameters
N/A
Context
400,000 tokens

GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and long context performance over GPT-5 Pro. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy in high-stakes use cases.

Read more

GPT-5.2

Thinking Mode
Released
Dec 10, 2025
Parameters
N/A
Context
400,000 tokens

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long-context performance compared to GPT-5.1. It uses adaptive reasoning to allocate computation dynamically.

Read more

GPT-OSS 20B

Thinking Mode
Released
Aug 5, 2025
Parameters
19.5 B
Context
131,072 tokens

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimized for lower-latency inference and deployability on consumer or single-GPU hardware.

Read more

xAI

Grok 3 Mini

Thinking Mode
Released
Oct 6, 2025
Parameters
N/A
Context
131,072 tokens

Grok 3 Mini is a lightweight, fast reasoning model from xAI. It is designed for logic-based tasks and offers accessible thinking traces.

Read more

Grok 4

Thinking Mode
Released
Jul 9, 2025
Parameters
314 B
Context
256,000 tokens

Grok 4 is xAI's general-purpose reasoning model with 314B parameters (MoE). It features real-time data integration and strong performance in general tasks.

Read more

Grok-2

Released
Dec 12, 2024
Parameters
N/A
Context
131,072 tokens

Grok 2 version from December 2024 with incremental improvements and optimizations

Read more

Grok-3 Beta

Thinking Mode
Released
Mar 1, 2025
Parameters
N/A
Context
131,072 tokens

Beta version of Grok 3 with extended reasoning for complex problem-solving tasks

Read more

Grok 4.1 Fast

Thinking Mode
Released
Invalid Date
Parameters
N/A
Context
2,000,000 tokens

Grok 4.1 Fast is an agentic tool-calling model with a 2M context window. It is optimized for customer support, deep research, and real-world workflows.

Read more
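
xAI's API is OpenAI-compatible, so the 2M window is used through the standard client pointed at a different base URL. A sketch with an assumed model id:

```python
from openai import OpenAI

client = OpenAI(base_url="https://api.x.ai/v1", api_key="xai-...")  # xAI endpoint

response = client.chat.completions.create(
    model="grok-4.1-fast",  # assumed model id
    messages=[
        {"role": "system", "content": "You are a support agent."},
        {"role": "user", "content": "Summarize this ticket history: ..."},
    ],
)
print(response.choices[0].message.content)
```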

Grok 4.1 Fast Thinking

Thinking Mode
Released
Invalid Date
Parameters
N/A
Context
2,000,000 tokens

Grok 4.1 Fast Thinking is the reasoning-enabled variant of Grok 4.1 Fast. It provides extended thought processes for complex problem-solving within a 2M context.

Read more

Zhipu AI

GLM 4.5

Thinking Mode
Released
Jul 28, 2025
Parameters
355 B
Context
128,000 tokens

GLM-4.5 is an open-weight model with 355B parameters (32B active). It uses a MoE architecture to deliver state-of-the-art performance in reasoning, coding, and agentic tasks.

Read more

GLM 4.5 Air

Thinking Mode
Released
Jul 28, 2025
Parameters
106 B
Context
128,000 tokens

GLM-4.5 Air is an efficient MoE model with 106B parameters (12B active). It is optimized for agentic applications, tool use, and speed.

Read more

GLM 4.6

Thinking Mode
Released
Sep 30, 2025
Parameters
355 B
Context
202,752 tokens

GLM-4.6 improves on GLM-4.5 with a longer 200k context window and stronger coding and agentic performance, built on the same 355B-parameter (32B active) MoE architecture.

Read more