Qwen 3 14B
Qwen 3 model with 14B parameters offering excellent performance-to-size efficiency.
MoE Qwen 3 model with 30B total parameters, activating 3B for efficient inference.
Qwen3 235B Thinking is a MoE model (235B total, 22B active) optimized for complex reasoning. It generates thinking traces for deep problem solving.
Qwen API model with 1M token context support for extensive document processing.
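As a sketch of how a long-context entry like this might be used for document processing, the snippet below sends an entire file in a single request through an OpenAI-compatible chat endpoint. The base URL, API key variable, and model id are illustrative assumptions, not values from this listing.

```python
# Minimal sketch: long-document summarization via an OpenAI-compatible endpoint.
# The endpoint, key variable, and model id below are placeholders (assumptions),
# not values taken from this catalog entry.
import os
from openai import OpenAI

client = OpenAI(
    base_url=os.environ["PROVIDER_BASE_URL"],  # placeholder: your provider's endpoint
    api_key=os.environ["PROVIDER_API_KEY"],    # placeholder: your provider's key
)

# With a 1M-token context window, an entire report can usually fit in one request.
with open("large_report.txt", encoding="utf-8") as f:
    document = f.read()

response = client.chat.completions.create(
    model="qwen-long-context",  # placeholder model id; use the id your provider lists
    messages=[
        {"role": "system", "content": "You summarize long documents accurately."},
        {"role": "user", "content": f"Summarize the key findings:\n\n{document}"},
    ],
)
print(response.choices[0].message.content)
```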
Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass.
Qwen3 30B Instruct is a MoE model (30.5B total, 3.3B active). It offers strong instruction following and multilingual capabilities.
Qwen3 Next 80B Thinking is a reasoning-first MoE model (80B total). It specializes in hard multi-step problems and agentic planning.
Amazon Nova Lite 1.0 is a low-cost multimodal model with 300k context. It is optimized for speed and processing image/video inputs.
Amazon Nova Pro 1.0 is a balanced multimodal model offering accuracy and speed. It handles extensive context and is suitable for general tasks.
Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models.
Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos to generate text.
Claude 3.5 Haiku is Anthropic's fastest and most cost-effective model, featuring 200k context. It excels in coding, data extraction, and real-time tasks, matching Claude 3 Opus in many benchmarks.
Latest generation Sonnet with best-in-class performance for complex agents and coding tasks.
Exceptional reasoning model for specialized complex tasks requiring advanced analytical capabilities.
Hybrid reasoning model with extended thinking mode for complex problem-solving and quick responses.
Updated Haiku model from October 2024 with enhanced accuracy and performance.
Claude Haiku 4.5 is Anthropic's fastest, most efficient model, delivering near-frontier intelligence. It matches Sonnet 4's performance in reasoning and coding and is optimized for real-time applications.
Claude Sonnet 4.5 is Anthropic's most advanced model for real-world agents and coding. It features a 1M token context, state-of-the-art coding performance, and enhanced agentic capabilities.
Claude Opus 4.5 is Anthropic's frontier reasoning model, optimized for complex software engineering and long-horizon tasks. It supports extended thinking and multimodal capabilities.
DeepSeek V3 0324 is a cost-effective MoE model with 671B parameters. It excels in coding and problem-solving, offering a budget-friendly alternative with strong performance.
DeepSeek R1 0528 is an open-source model with 671B parameters (37B active). It offers performance on par with proprietary reasoning models, featuring fully open reasoning tokens.
Advanced DeepSeek reasoning model with RL training, comparable to OpenAI o1 in performance.
Efficient MoE model with 671B parameters trained with FP8, achieving strong benchmark results.
DeepSeek V3.2 Exp is an experimental model featuring DeepSeek Sparse Attention (DSA) for high efficiency. It delivers long-context handling up to 128k tokens with reduced inference costs.
DeepSeek-V3.1 is a hybrid reasoning model (671B params, 37B active) supporting thinking and non-thinking modes. It improves on V3 with better tool use, code generation, and reasoning efficiency.
DeepSeek V3.2 is the latest direct DeepSeek model featuring DeepSeek Sparse Attention (DSA) for high efficiency. It delivers long-context handling up to 163k tokens with reduced inference costs.
DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasoning and agentic performance.
Gemini 2.5 Flash is Google's workhorse model for high-frequency tasks. It features a 1M context window, optimized for speed and efficiency in reasoning and multimodal processing.
Gemini 2.5 Flash-Lite is a lightweight reasoning model optimized for ultra-low latency. It offers a 1M context window and is designed for cost-effective, high-throughput applications.
Gemini 2.5 Pro is Google's best reasoning model, featuring a 1M token context window. It uses a sparse MoE architecture to excel in complex reasoning, coding, and multimodal tasks.
Gemma 3 27B is an open-source multimodal model by Google. It supports vision-language inputs, 128k context, and offers improved math and reasoning capabilities.
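Because this entry advertises vision-language input, a request can mix text and image content. The sketch below assumes the model is served behind an OpenAI-compatible endpoint that accepts the standard image_url content part; the model id, endpoint, and image URL are placeholders, not values from this listing.

```python
# Minimal sketch: vision-language request to an open multimodal model such as
# Gemma 3 27B, assuming an OpenAI-compatible server (e.g. a local inference
# server). Model id, endpoint, and image URL are placeholders (assumptions).
import os
from openai import OpenAI

client = OpenAI(
    base_url=os.environ["PROVIDER_BASE_URL"],  # placeholder endpoint
    api_key=os.environ["PROVIDER_API_KEY"],    # placeholder key
)

response = client.chat.completions.create(
    model="gemma-3-27b-it",  # placeholder id; check your provider's model list
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe the main trend in this chart."},
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
    max_tokens=200,
)
print(response.choices[0].message.content)
```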
A pinned Gemini 2.0 Flash version with stable performance and consistent behavior.
Preview version of Gemini 2.5 Pro, released in March 2025, with advanced reasoning capabilities.
Gemini 3 Pro Preview is Google's flagship frontier model. It offers high-precision multimodal reasoning across text, audio, video, and code, with a 1M token context.
Gemini 3 Flash Preview is a high-speed thinking model that delivers near-Pro-level reasoning and tool-use performance with substantially lower latency than larger Gemini variants.
Llama 4 Maverick is a 400B parameter (17B active) MoE model. It supports multimodal inputs and 1M context, optimized for assistant-like behavior.
Llama 4 Scout variant with 17B active parameters and a mixture-of-experts architecture for efficiency.
Llama 3.3 model with 70B parameters offering improved performance over the 3.1 version.
FP8-quantized Llama 4 Maverick (17B active parameters) optimized for deployment efficiency and speed.
Llama 4 Scout is a 109B parameter (17B active) MoE model. It is designed for efficiency and visual reasoning with a 328k context.
Phi-4 is a 14B parameter model by Microsoft. It excels in complex reasoning and limited memory environments, trained on high-quality synthetic data.
Phi-3 Mini is a 3.8B lightweight model with 128k context. It offers state-of-the-art performance for its size, suitable for edge devices.
MiniMax M2 is a 230B (10B active) MoE model. It is highly efficient, designed for coding and agentic workflows with low latency.
Updated Mistral Large from November 2024 with improved performance and capabilities.
Magistral Small 2506 is a 24B parameter model by Mistral AI. It is optimized for multilingual reasoning and instruction following.
Compact 24B parameter Mistral model optimized for cost-effective instruction following.
Mistral Large 3 2512 is Mistral's flagship MoE model (675B total, 41B active). It offers top-tier performance in reasoning and coding.
A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities.
Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost.
Kimi K2 Instruct is a large open-weight model (1T params, 32B active) by Moonshot AI. It offers strong performance in instruction following and general tasks.
Kimi K2 Thinking is a reasoning variant capable of autonomous long-horizon tasks. It can execute hundreds of sequential tool calls.
NVIDIA-tuned 253B Llama 3.1 model optimized for enterprise applications and instruction following.
NVIDIA-optimized 49B Llama 3.3 model providing an excellent performance-to-size ratio.
NVIDIA-tuned 70B Llama 3.1 model with enhanced instruction following and helpfulness.
Llama 3.3 Nemotron Super 49B is a reasoning model derived from Llama 3.3 70B. It is post-trained for agentic workflows, RAG, and tool calling.
Nemotron Nano 9B v2 is a compact 9B model by NVIDIA. It is a unified model for reasoning and non-reasoning tasks, trained from scratch.
Llama 3.1 Nemotron Ultra 253B is a derivative of Llama 3.1 405B, optimized for reasoning and chat. It offers a balance of accuracy and efficiency.
NVIDIA Nemotron 3 Nano 30B A3B is a small MoE language model offering high compute efficiency and accuracy for developers building specialized agentic AI systems.
A large-scale, 32-billion-parameter model designed for deep reasoning, complex multi-step logic, and advanced instruction following.
Smaller, faster, and more affordable version of GPT-4o, ideal for high-volume applications requiring good intelligence.
Enhanced iteration of GPT-4 with improved reasoning, coding, and multimodal capabilities.
Most advanced OpenAI reasoning model with multimodal capabilities and agentic tool use for complex analysis.
Lightweight reasoning model balancing speed and intelligence for everyday complex tasks.
GPT-5 is OpenAI's latest flagship model, designed as an adaptive system. It features dynamic reasoning depth, 400k context, and improvements in accuracy and multimodal integration.
GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments.
GPT-5 Mini is a compact version of GPT-5 for lightweight reasoning. It offers low latency and cost, suitable for high-frequency tasks.
GPT-OSS-120B is an open-weight MoE model from OpenAI (117B params, 5.1B active). It is optimized for single-GPU deployment and excels in reasoning and agentic tasks.
Compact GPT-4.1 variant optimized for efficiency while maintaining strong performance.
January 2025 release of o3-mini with enhanced STEM capabilities and developer features.
April 2025 o4-mini release with improved reasoning efficiency and balanced performance.
GPT-5.1 offers stronger general-purpose reasoning and instruction adherence than GPT-5. It features adaptive computation and a natural conversational style.
GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and long-context performance over GPT-5 Pro. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy in high-stakes use cases.
GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long-context performance compared to GPT-5.1. It uses adaptive reasoning to allocate computation dynamically.
gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimized for lower-latency inference and deployability on consumer or single-GPU hardware.
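To make the single-GPU deployability point concrete, here is a minimal local-inference sketch with Hugging Face transformers. It assumes the weights are published on the Hugging Face Hub as openai/gpt-oss-20b and that the host GPU has enough memory for the checkpoint; adjust the id, dtype, and device mapping for your setup.

```python
# Minimal local-inference sketch for an open-weight model such as gpt-oss-20b.
# Assumptions (not from the catalog entry): the Hub id below is correct and the
# checkpoint fits on the available GPU; requires transformers and accelerate.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openai/gpt-oss-20b"  # assumed Hugging Face Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # keep the checkpoint's native precision
    device_map="auto",   # place layers on the available GPU(s)
)

messages = [{"role": "user", "content": "Summarize mixture-of-experts routing in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```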
Grok 3 Mini is a lightweight, fast reasoning model from xAI. It is designed for logic-based tasks and offers accessible thinking traces.
Grok 4 is xAI's general-purpose reasoning model with 314B parameters (MoE). It features real-time data integration and strong performance in general tasks.
Grok 2 version from December 2024 with incremental improvements and optimizations.
Beta version of Grok 3 with extended reasoning for complex problem-solving tasks.
Grok 4.1 Fast is an agentic tool-calling model with a 2M context window. It is optimized for customer support, deep research, and real-world workflows.
Grok 4.1 Fast Thinking is the reasoning-enabled variant of Grok 4.1 Fast. It provides extended thought processes for complex problem-solving within a 2M context.
GLM-4.5 is an open-weight model with 355B parameters (32B active). It uses MoE architecture to deliver state-of-the-art performance in reasoning, coding, and multimodal tasks.
GLM-4.5 Air is an efficient MoE model with 106B parameters (12B active). It is optimized for agentic applications, tool use, and speed.
GLM-4.6 is an open-weight model with 355B parameters (32B active). It uses MoE architecture to deliver state-of-the-art performance in reasoning, coding, and multimodal tasks.