Gemma 4 (e2b / e4b)
Quality at 2B-4B scale: Best in class
Gemma 4's edge variants are the strongest open-weight small models of 2026. The e2b (~2B effective parameters) is approximately 1.5GB at Q4_K_M, small enough for phones, embedded devices, and any system with 4GB+ of memory, and it accepts image input despite its size. The e4b (~4B effective parameters) improves quality further while remaining laptop-deployable. Both are released under Apache 2.0, the first Gemma generation to use this license, which makes commercial deployment straightforward. For mobile chat, on-device assistants, and camera-based AI applications, no other open-weight family currently matches the e2b at the 2B scale.
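As a rough sanity check on the size figures above, the sketch below estimates a quantized weight file from parameter count and bits per weight, then checks it against a device's memory. The ~4.8 bits-per-weight value for Q4_K_M and the 1.5GB runtime headroom are assumptions for illustration, not published Gemma 4 numbers, and the helper names are hypothetical.

```python
# Back-of-the-envelope sizing for a quantized model.
# ASSUMPTION: Q4_K_M averages roughly 4.8 bits per weight; the exact
# figure varies by model and by which tensors stay at higher precision.
ASSUMED_Q4_K_M_BITS_PER_WEIGHT = 4.8

# ASSUMPTION: memory set aside for the OS, KV cache, and activations.
ASSUMED_HEADROOM_GB = 1.5


def quantized_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (10^9 bytes) for a quantized model."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits_on_device(model_gb: float, device_ram_gb: float,
                   headroom_gb: float = ASSUMED_HEADROOM_GB) -> bool:
    """True if the weights plus the assumed headroom fit in device memory."""
    return model_gb + headroom_gb <= device_ram_gb


if __name__ == "__main__":
    for name, params_b in [("e2b", 2.0), ("e4b", 4.0)]:
        size = quantized_size_gb(params_b, ASSUMED_Q4_K_M_BITS_PER_WEIGHT)
        print(f"{name}: ~{size:.1f} GB estimated, "
              f"fits in 4 GB: {fits_on_device(size, 4.0)}")
```

With these assumed values, the 2B effective count works out to about 1.2GB, somewhat under the ~1.5GB quoted above. Real files can be larger because some tensors are typically stored at higher precision and the effective parameter count may not cover every stored weight, so treat the estimate as a lower bound.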
Strengths
- e2b at ~1.5GB fits on phones and any 4GB+ memory device
- Native multimodal: even the 2B variant accepts image input
- Apache 2.0 license (new in Gemma 4), which permits commercial use
- First-class MLX support for Apple Silicon deployment
Trade-offs
- Doesn't match larger models (8B+) on complex reasoning tasks
- Multimodal support adds some inference complexity vs text-only models