🇺🇸 United States · Google Gemma — Open-Weights Small Models
Status: 🟩 COMPLETE 🟦 LIVING Section: 10 — AI and LLMs
| Vendor | Google DeepMind |
| Country/origin | 🇺🇸 United States / 🇬🇧 United Kingdom (Google DeepMind) |
| Recommended for AUS? | ✅ Yes — open-weights; can run locally (max privacy); permissive licence |
| Privacy summary | When run locally: no data sent anywhere. When used via Google Cloud / Vertex AI: standard Google enterprise data handling |
| Free tier | ✅ Completely free — open-weights download; only compute costs |
| Paid tiers | Free model; paid only if using via cloud API (Google Cloud, Hugging Face Inference, etc.) |
| First released | Gemma 1: February 2024; Gemma 2: June 2024; Gemma 3: 2025; ongoing |
| Last reviewed | June 2026 |
| Official site | https://ai.google.dev/gemma |
What it is
Gemma is Google’s family of open-weights language models — small, efficient AI models that Google releases publicly so anyone can download, run locally, fine-tune, and build products with them. Gemma is Google’s answer to Meta’s Llama in the open-weights AI race.
“Open-weights” means: the trained model file is downloadable. You can run it on your own laptop (if powerful enough), on your own servers, or in any cloud. You don’t have to use Google’s API or send data to Google. See open-weights-vs-closed for the full distinction.
The Gemma family (mid-2026):
| Model | Parameters | Best for |
|---|---|---|
| Gemma 2 2B | 2 billion | Phones, edge devices, very fast inference |
| Gemma 2 9B | 9 billion | Laptops with decent GPU; production inference |
| Gemma 2 27B | 27 billion | Workstations; near-frontier quality at smaller size |
| Gemma 3 4B / 12B / 27B | Various | Latest generation; multimodal (vision); 128K context |
| CodeGemma | 2B / 7B | Code completion and generation |
| RecurrentGemma | 2B / 9B | Long contexts; recurrent architecture |
| PaliGemma | 3B | Vision + language tasks |
| MedGemma | 27B | Healthcare research (medical text and images) |
| ShieldGemma | 2B / 9B | Content moderation classifier |
Why Gemma matters
Gemma models are notable for several reasons:
-
High quality at small size: Gemma 2 9B punches well above its weight class, performing competitively with much larger models on many benchmarks. This makes it practical for local deployment.
-
Strong multilingual support: Gemma 3 supports 140+ languages — much broader coverage than many open-weights models.
-
Permissive licence: Gemma is released under a Google licence that allows commercial use with some restrictions (significantly more permissive than some alternatives).
-
Multimodal (Gemma 3): Vision capabilities added in 2025; can process images alongside text.
-
Strong tooling integration: Excellent support in Hugging Face, Ollama, llama.cpp, and Google’s own AI tooling.
How to use Gemma (Australian users)
Run locally (private, free)
The most common way for personal use:
- Install Ollama (https://ollama.com) — the easiest tool for running open-weights models locally
- In a terminal:
ollama pull gemma2:9b(orgemma2:2bfor less powerful machines) - Run:
ollama run gemma2:9band chat away
Hardware requirements:
- Gemma 2 2B: Runs on any laptop with 8GB+ RAM
- Gemma 2 9B: Comfortable with 16GB+ RAM (Apple Silicon excellent here)
- Gemma 2 27B: Requires 32GB+ RAM or dedicated GPU with 24GB+ VRAM
Via Hugging Face
- Go to https://huggingface.co/google → search for Gemma
- Accept the licence (requires Hugging Face account)
- Download the model files or use Hugging Face Inference API
Via Google Cloud Vertex AI
For enterprise or production use, Gemma is available on Vertex AI with managed hosting.
In LM Studio or Jan.ai
GUI-based tools for running models locally, including Gemma. Easier for non-technical users than command-line Ollama.
How Gemma compares to other open-weights models
| Model | Size | Quality | Multimodal | Licence | Best for |
|---|---|---|---|---|---|
| Gemma 3 27B | 27B | Very good | ✅ (vision) | Google permissive | Multilingual; vision |
| Llama 3.3 70B | 70B | Excellent | ❌ | Meta community licence | Larger; broader knowledge |
| Mistral Small 3.1 | 24B | Very good | ✅ (vision) | Apache 2.0 (some) | EU origin; efficient |
| Phi-4 (Microsoft) | 14B | Excellent | ❌ | MIT | Tiny but capable |
| Qwen 2.5 ⛔ | Various | Excellent | ✅ | Apache 2.0 | Chinese; avoid |
| DeepSeek R2 ⛔ | Various | Excellent | ❌ | Open | Chinese; avoid |
For most Australian use cases needing an open-weights model:
- Gemma 2 9B for a laptop-friendly capable model
- Llama 3.3 70B if you have heavy hardware and need broader knowledge
- Phi-4 for the smallest footprint with excellent quality
Gemma’s specialised variants
Beyond the core Gemma models, Google has released task-specific variants:
CodeGemma
Specialised for code generation, completion, and explanation. Smaller than general models but optimised specifically for programming tasks.
PaliGemma
Vision-language model. Takes images and answers questions about them. Useful for document understanding, image captioning, and visual reasoning.
MedGemma
Trained on medical literature and able to process medical text and images. Designed for healthcare research; not for direct patient care use.
ShieldGemma
A content moderation classifier. Helps developers filter unsafe content from AI applications.
RecurrentGemma
Uses a different architecture (recurrent rather than attention-only) that enables longer context windows with constant memory use.
Licence considerations
Gemma is released under the Google Gemma Terms of Use — not standard open-source. Key points:
- ✅ Commercial use allowed
- ✅ Modification and fine-tuning allowed
- ✅ Distribution of modified versions allowed
- ⚠️ Required to comply with Google’s Prohibited Use Policy
- ⚠️ Required to provide notice that you used Gemma
- ⚠️ Some downstream uses (military applications, etc.) prohibited
The licence is more permissive than Llama’s community licence in some respects (no 100M user threshold) but more restrictive in others (specific use prohibitions).
For commercial use of Gemma in Australian products, review the current Gemma terms at ai.google.dev/gemma/terms.
Gotchas
- Smaller models have real capability limits. Gemma 2B and 9B are useful for many tasks but will fall short of Claude 3.5 Sonnet or GPT-4o on complex reasoning, very long documents, or specialised knowledge. Match model size to task.
- Open-weights licences ≠ MIT/Apache 2.0. The Gemma terms are more restrictive than standard open-source licences. Read them before commercial deployment.
- Running local models requires hardware. A modern MacBook handles Gemma 9B fine; older laptops or low-RAM machines struggle. Verify your hardware can run the size you want.
- Performance optimisation matters. Different inference engines (Ollama, llama.cpp, vLLM, MLX on Mac) have very different performance characteristics. Apple Silicon users should consider MLX for best performance.
- Multilingual quality varies. While Gemma 3 supports 140+ languages, quality is best in English, then major European/Asian languages. Quality for low-resource languages including most Australian Indigenous languages is limited.
- Not a frontier model. Gemma is excellent for an open-weights model of its size, but it’s not competing with the latest Claude/GPT-4o/Gemini Ultra on overall capability. Use frontier APIs when you need maximum capability.
Use cases where Gemma shines
- Privacy-sensitive applications: Medical notes, legal documents, personal journaling — process locally without sending data anywhere
- Offline applications: Field work, remote areas, embedded systems
- High-volume applications: Where API costs would be prohibitive
- Custom fine-tuned models: Take Gemma and fine-tune for your specific domain
- Education and research: Free to use; well-documented for learning
- Edge deployment: Phones, embedded devices, IoT — Gemma 2B works on constrained hardware
See also
- llama — Meta’s open-weights alternative; broader name recognition
- open-weights-vs-closed — when to choose open vs closed
- google-deepmind — the lab behind Gemma
- gemini-models — Google’s closed-source frontier models
- hugging-face — primary Gemma download source
- ai-hardware-overview — what hardware you need
Sources
- Google Gemma documentation: ai.google.dev/gemma
- Gemma 1 technical paper (2024)
- Gemma 2 technical report (June 2024)
- Gemma 3 model card and release notes (2025)
- Gemma Terms of Use: ai.google.dev/gemma/terms
- Hugging Face Gemma model cards
- Independent benchmarks: ArtificialAnalysis.ai (2024–2026)