Open-source models you can fine-tune with Ertas.

Code Llama
Meta
Meta's specialized code generation model family built on Llama 2, available in 7B, 13B, 34B, and 70B sizes with variants optimized for code completion, instruction following, and Python development.

Command R
Cohere
Cohere's enterprise-focused model family in 35B and 104B sizes, purpose-built for retrieval-augmented generation (RAG) with native citation support, tool use, and multilingual capability across 10+ languages.
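
To show what native citation support looks like in practice, here is a minimal sketch using the `cohere` Python SDK's chat endpoint (parameter and field names follow the v1 SDK and may differ in newer versions; the API key and documents are placeholders):

```python
import cohere

co = cohere.Client("YOUR_API_KEY")  # v1 SDK client; assumes a valid API key

# Pass source documents alongside the query; Command R grounds its
# answer in them and returns span-level citations.
response = co.chat(
    model="command-r",
    message="What mechanism produces the aurora borealis?",
    documents=[
        {
            "title": "Aurora primer",
            "snippet": "Auroras occur when charged solar particles "
                       "collide with gases in the upper atmosphere.",
        },
    ],
)

print(response.text)
for citation in response.citations:
    # Each citation maps a span of the answer back to its source document.
    print(citation.start, citation.end, citation.document_ids)
```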

DeepSeek-R1
DeepSeek
DeepSeek's dedicated reasoning model, trained with reinforcement learning to produce extended chains of thought, available as distilled dense checkpoints from 1.5B to 70B and as the full 671B mixture-of-experts model.
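
DeepSeek-R1 emits its chain of thought inside `<think>` tags before the final answer, so downstream code usually splits the two; a minimal parsing sketch (the tag format follows DeepSeek's published chat template):

```python
import re

def split_reasoning(output: str) -> tuple[str, str]:
    """Split an R1-style completion into (reasoning, final answer)."""
    match = re.search(r"<think>(.*?)</think>", output, flags=re.DOTALL)
    if match is None:
        return "", output.strip()
    return match.group(1).strip(), output[match.end():].strip()

reasoning, answer = split_reasoning(
    "<think>17 and 19 are both prime, so there are two.</think>"
    "There are two primes between 16 and 20."
)
print(answer)  # -> There are two primes between 16 and 20.
```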

DeepSeek-V3
DeepSeek
DeepSeek's flagship 671-billion parameter mixture-of-experts model with 37B active parameters per token, delivering frontier-level general performance at remarkably efficient inference costs.

Falcon
TII Abu Dhabi
The Technology Innovation Institute's open-weight model family in 7B, 40B, and 180B sizes, trained on the massive RefinedWeb dataset and pioneering the use of high-quality filtered web data for LLM training.

Gemma 3
Google
Google's latest open-weight model family built on Gemini technology, available in 1B, 4B, 12B, and 27B sizes; the 4B and larger models add native vision-language capability and a 128K token context window.

InternLM2
Shanghai AI Lab
Shanghai AI Laboratory's multilingual model series in 7B and 20B sizes, featuring strong Chinese-English capabilities, long-context support, and excellent performance on reasoning and tool-use benchmarks.

Llama 3.1
Meta
Meta's third-generation open-weight large language model family, delivering state-of-the-art performance across reasoning, code generation, and multilingual tasks in 8B, 70B, and 405B parameter configurations.

Llama 4
Meta
Meta's fourth-generation open-weight model family featuring a mixture-of-experts architecture, with Scout (109B total, 17B active) for efficient deployment and Maverick (400B total, 17B active) for high-capability tasks.

Mistral 7B
Mistral AI
Mistral AI's foundational 7-billion parameter model that punches well above its weight class, featuring sliding window attention and grouped-query attention for efficient long-context inference.
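
To make sliding window attention concrete, here is a toy mask construction in PyTorch (an illustrative sketch, not Mistral's implementation): each token attends only to itself and the previous window - 1 positions, so attention cost scales linearly with sequence length instead of quadratically.

```python
import torch

def sliding_window_causal_mask(seq_len: int, window: int) -> torch.Tensor:
    """Boolean mask: True where a query position may attend to a key.

    Token i sees tokens in [max(0, i - window + 1), i]: causal, but
    capped at `window` positions (Mistral 7B uses window = 4096).
    """
    i = torch.arange(seq_len).unsqueeze(1)  # query positions
    j = torch.arange(seq_len).unsqueeze(0)  # key positions
    return (j <= i) & (j > i - window)

mask = sliding_window_causal_mask(seq_len=6, window=3)
print(mask.int())  # row 5 attends only to positions 3, 4, 5
```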

Mixtral
Mistral AI
Mistral AI's mixture-of-experts models that route each token through 2 of 8 expert networks, with the 8x7B variant delivering 70B-class performance at roughly the inference cost of a 13B dense model.
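
A toy top-2-of-8 router shows why only a fraction of the parameters is active for any given token (a simplified PyTorch sketch of MoE gating, not Mixtral's actual code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    """Toy mixture-of-experts layer: 8 experts, 2 active per token."""

    def __init__(self, dim: int = 64, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList([nn.Linear(dim, dim) for _ in range(n_experts)])
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, dim)
        gate_logits = self.router(x)                      # (tokens, n_experts)
        weights, chosen = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)              # renormalize over the top 2
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                hit = chosen[:, k] == e                   # tokens routed to expert e
                if hit.any():
                    out[hit] += weights[hit, k].unsqueeze(-1) * expert(x[hit])
        return out

moe = Top2MoE()
y = moe(torch.randn(4, 64))  # each token passes through only 2 of 8 experts
```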

Neural Chat
Intel
Intel's 7-billion parameter conversational model fine-tuned from Mistral 7B, optimized for Intel hardware and demonstrating strong chat performance with particular focus on CPU inference efficiency.

OLMo
Allen AI
Allen Institute for AI's fully open language model family in 1B, 7B, and 13B sizes, with completely open training data, code, weights, and evaluation — setting the standard for reproducible AI research.

OpenChat 3.5
OpenChat
A 7-billion parameter model fine-tuned from Mistral 7B using Conditioned Reinforcement Learning Fine-Tuning (C-RLFT), achieving GPT-3.5-level performance through a novel mixed-quality data training approach.
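
The core idea of C-RLFT is to treat coarse data-quality labels as a conditioning signal: each training example is prefixed with its source class and weighted accordingly, and inference always conditions on the high-quality class. A schematic sketch (the prefix strings and weights here are illustrative, not OpenChat's exact template):

```python
# Schematic view of C-RLFT data conditioning: each example carries a
# coarse quality label, which becomes a prompt prefix plus a loss weight.
QUALITY_PREFIX = {
    "expert": "GPT4 Correct User:",  # illustrative; not OpenChat's exact template
    "generic": "GPT3 User:",
}
CLASS_WEIGHT = {"expert": 1.0, "generic": 0.3}  # illustrative reward weights

def build_example(source_class: str, prompt: str, answer: str) -> tuple[str, float]:
    """Return the conditioned training text and its loss weight."""
    text = (f"{QUALITY_PREFIX[source_class]} {prompt}<|end_of_turn|>"
            f"Assistant: {answer}")
    return text, CLASS_WEIGHT[source_class]

text, weight = build_example("expert", "Summarize MoE routing.", "...")
# At inference time the model is always conditioned on the high-quality prefix.
```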

Phi-3
Microsoft
Microsoft's family of compact yet capable language models available in 3.8B, 7B, and 14B sizes, designed for on-device and edge deployment with surprisingly strong performance on reasoning and instruction-following tasks.

Phi-4
Microsoft
Microsoft's 14-billion parameter small language model that emphasizes reasoning quality through synthetic data training, achieving performance competitive with models several times its size on math and logic benchmarks.

Qwen2.5
Alibaba
Alibaba's comprehensive open-weight model family spanning seven sizes from 0.5B to 72B parameters, with particularly strong multilingual and coding capabilities across 29+ languages.

Qwen3
Alibaba
Alibaba's latest-generation model family featuring both dense and mixture-of-experts architectures, with sizes from 0.6B to 235B and built-in hybrid thinking modes for adaptive reasoning depth.
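
Hybrid thinking is toggled per request through the chat template; a minimal sketch assuming the Hugging Face transformers tokenizer for Qwen3, whose template accepts an enable_thinking flag:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
messages = [{"role": "user", "content": "How many primes are below 20?"}]

# enable_thinking=True lets the model reason in a <think> block first;
# False switches it to direct answers for latency-sensitive requests.
prompt = tok.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)
```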

SmolLM
HuggingFace
HuggingFace's family of ultra-compact language models in 135M, 360M, and 1.7B sizes, trained on the high-quality Cosmopedia synthetic dataset and designed for on-device AI applications with minimal resource requirements.

SOLAR
Upstage
Upstage's 10.7-billion parameter model created through depth up-scaling, a novel technique that stacks trimmed copies of a pretrained model's layers to achieve larger-model quality at efficient inference cost.
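
Depth up-scaling is easy to sketch: duplicate a pretrained stack of layers, trim the seam, and continually pretrain the deeper result. Per the SOLAR paper, two 32-layer copies each lose 8 layers at the join to yield 48 layers; an illustrative sketch over a plain list standing in for transformer blocks:

```python
import copy

def depth_up_scale(layers: list, drop: int = 8) -> list:
    """Stack two copies of a layer stack, trimming `drop` layers at the seam.

    With 32 input layers and drop=8 this yields 24 + 24 = 48 layers,
    matching SOLAR 10.7B; the result is then continually pretrained.
    """
    a, b = copy.deepcopy(layers), copy.deepcopy(layers)
    return a[: len(a) - drop] + b[drop:]  # first 24 of copy A + last 24 of copy B

base = [f"block_{i}" for i in range(32)]  # stand-ins for transformer blocks
print(len(depth_up_scale(base)))          # -> 48
```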

StarCoder2
BigCode / HuggingFace
An open-access code generation model trained on permissively licensed source code, available in 3B, 7B, and 15B sizes with transparent training data governance and strong multi-language programming support.

TinyLlama
TinyLlama Team
A compact 1.1-billion parameter model trained on 3 trillion tokens — far more data than typical for its size — delivering surprisingly capable performance for edge deployment, mobile applications, and resource-constrained environments.

Vicuna
LMSYS
LMSYS's instruction-tuned model family in 7B, 13B, and 33B sizes, fine-tuned from Llama on ShareGPT conversations and widely recognized for pioneering open-source chatbot evaluation methodology.

Yi
01.AI
01.AI's bilingual Chinese-English model family available in 6B, 9B, and 34B sizes, known for strong performance on both Chinese and English benchmarks with excellent instruction-following capabilities.

Zephyr
HuggingFace
HuggingFace's 7-billion parameter model fine-tuned from Mistral 7B using distilled direct preference optimization (dDPO), demonstrating that alignment techniques can produce highly capable chat models without human preference data.
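
The objective behind dDPO is the standard DPO loss; the "distilled" part is that preference pairs are ranked by an AI judge rather than human annotators. A minimal per-pair sketch in PyTorch:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp: torch.Tensor,
             policy_rejected_logp: torch.Tensor,
             ref_chosen_logp: torch.Tensor,
             ref_rejected_logp: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO loss for one preference pair.

    Inputs are summed log-probs of the chosen/rejected responses under
    the policy being trained and a frozen reference model.
    """
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    return -F.logsigmoid(beta * (policy_margin - ref_margin))

loss = dpo_loss(torch.tensor(-12.0), torch.tensor(-15.0),
                torch.tensor(-13.0), torch.tensor(-14.0))  # ~0.598
```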