AI API Cheat Sheet: Quick Reference for Developers
Status: 🟩 COMPLETE 🟦 LIVING
Section: cheat-sheets
Tags: cheat-sheet, api, developer, reference
How to read this
Quick reference for the main AI APIs developers use. Each section shows the SDK installation, basic call pattern, and common gotchas.
For setup guides, see the pages listed under See also.
OpenAI (ChatGPT / GPT models)
Install
pip install openai
# or
npm install openai
Basic call (Python)
from openai import OpenAI
client = OpenAI(api_key="sk-...")
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)
print(response.choices[0].message.content)
Common models
- gpt-4o: flagship; multimodal
- gpt-4o-mini: cheap; fast
- o3: best reasoning
- o3-mini: cheaper reasoning
- gpt-4o-realtime-preview: voice agent
Pricing (USD per million tokens)
- gpt-4o-mini: 0.60 out
- gpt-4o: 10 out
- o3: 40 out
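As a rough illustration of how the per-million prices translate into the cost of a single response, here is a minimal sketch. It uses only the output prices listed above; the table name and function name are hypothetical, and input-token cost is not included.

```python
# Output prices in USD per million tokens, copied from the list above.
# Assumption: these figures may be out of date; check the provider's pricing page.
OUTPUT_PRICE_PER_MILLION = {
    "gpt-4o-mini": 0.60,
    "gpt-4o": 10.0,
    "o3": 40.0,
}


def estimate_output_cost(model: str, output_tokens: int) -> float:
    """Return the estimated output cost in USD (input tokens not included)."""
    price = OUTPUT_PRICE_PER_MILLION[model]
    return output_tokens / 1_000_000 * price


if __name__ == "__main__":
    # A 500-token reply from gpt-4o at 10 USD per million output tokens.
    print(f"{estimate_output_cost('gpt-4o', 500):.4f} USD")
```

The same arithmetic applies to the other providers' price lists in this sheet.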
Key parameters
- temperature (0-2): 0 = deterministic; 1 = balanced; 2 = creative
- max_tokens: cap response length
- top_p: nucleus sampling
- stream=True: stream tokens as generated
- tools=[...]: function calling
- response_format={"type": "json_object"}: force JSON output
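With response_format set to JSON, the reply still arrives as a string in message.content and has to be parsed. A reply cut short by max_tokens may not be valid JSON, so a defensive parse can help. The helper below is a hypothetical sketch using only the standard library, not part of any SDK.

```python
import json
from typing import Any, Optional


def parse_json_reply(text: Optional[str]) -> Optional[Any]:
    """Parse a model reply that is expected to be JSON.

    Returns the decoded value, or None when the reply is empty or
    is not valid JSON (for example, a truncated response).
    """
    if not text:
        return None
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


if __name__ == "__main__":
    print(parse_json_reply('{"city": "Paris"}'))
    print(parse_json_reply('{"city": "Par'))
```

Returning None keeps the caller in charge of whether to retry, raise, or fall back.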
Gotchas
- Models change names; check current docs
- o3 doesn't support the system message (use the developer role)
- Costs add up fast with verbose responses; set max_tokens
- Free $5 credit on new accounts; expires in 3 months
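The o3 gotcha above can be handled in one place by rewriting roles before the request is sent. This is a minimal sketch: the function name and the prefix list are assumptions made for illustration, so adjust the prefixes to whatever the current docs say.

```python
from typing import Dict, List

Message = Dict[str, str]

# Assumption: model names starting with these prefixes expect the
# "developer" role instead of "system". Verify against current docs.
DEVELOPER_ROLE_PREFIXES = ("o3",)


def adapt_roles(model: str, messages: List[Message]) -> List[Message]:
    """Return a copy of messages with "system" renamed to "developer" where needed."""
    if not model.startswith(DEVELOPER_ROLE_PREFIXES):
        return [dict(m) for m in messages]
    return [
        {**m, "role": "developer"} if m.get("role") == "system" else dict(m)
        for m in messages
    ]


if __name__ == "__main__":
    msgs = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ]
    print(adapt_roles("o3", msgs))
```

The input list is never mutated, so the same messages can be reused across models.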
Anthropic (Claude)
Install
pip install anthropic
# or
npm install @anthropic-ai/sdk
Basic call (Python)
import anthropic
client = anthropic.Anthropic(api_key="sk-ant-...")
message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude!"}
    ]
)
print(message.content[0].text)
Common models (mid-2026)
- claude-haiku-4-5: fast; cheap
- claude-sonnet-4-6: best value; most popular
- claude-opus-4-7: most capable; reasoning
Pricing (USD per million tokens)
- Haiku: 4 out
- Sonnet: 15 out
- Opus: 75 out
Key parameters
- max_tokens: REQUIRED (unlike OpenAI)
- temperature (0-1): 0 = deterministic; 1 = creative
- system="...": system prompt (separate from messages)
- stream=True: streaming
- tools=[...]: tool use (function calling)
- thinking={"type": "enabled"}: extended thinking on Sonnet+
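Because the system prompt is a separate parameter rather than a message, OpenAI-style message lists need a small conversion before they can be passed to the Anthropic client. A minimal sketch follows; the helper name is hypothetical, and the result is meant to be combined with a model name in client.messages.create.

```python
from typing import Any, Dict, List


def to_anthropic_kwargs(
    messages: List[Dict[str, str]], max_tokens: int = 1024
) -> Dict[str, Any]:
    """Split OpenAI-style messages into Anthropic-style keyword arguments.

    System messages are joined into the separate `system` string;
    all other messages are passed through unchanged.
    """
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    chat = [m for m in messages if m["role"] != "system"]
    kwargs: Dict[str, Any] = {"max_tokens": max_tokens, "messages": chat}
    if system_parts:
        kwargs["system"] = "\n\n".join(system_parts)
    return kwargs


if __name__ == "__main__":
    openai_style = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ]
    print(to_anthropic_kwargs(openai_style))
```

Setting max_tokens here also covers the requirement that it is always present.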
Extended thinking
message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    thinking={"type": "enabled", "budget_tokens": 2048},
    messages=[...]
)
Gotchas
- max_tokens is mandatory
- Context window is 200K tokens
- API requires phone verification
- Pro subscription ≠ API credits (separate billing)
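One more thing to watch with extended thinking: the response content can contain thinking blocks ahead of the text block, so message.content[0].text may not be the answer. A safer pattern is to collect only the blocks whose type is "text". The sketch below is hypothetical helper code that uses stand-in objects so it runs without the SDK.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def extract_text(blocks: Iterable[Any]) -> str:
    """Join the text of all content blocks whose type is "text"."""
    return "".join(
        block.text for block in blocks if getattr(block, "type", None) == "text"
    )


if __name__ == "__main__":
    # Stand-ins for the content blocks of a response with thinking enabled.
    fake_content = [
        SimpleNamespace(type="thinking", thinking="(model reasoning)"),
        SimpleNamespace(type="text", text="Hello!"),
    ]
    print(extract_text(fake_content))
```

With a real response, the call would be extract_text(message.content).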
Google AI (Gemini)
Install
pip install google-generativeai
# or
npm install @google/generative-ai
Basic call (Python)
import google.generativeai as genai
genai.configure(api_key="AIza...")
model = genai.GenerativeModel('gemini-2.5-flash')
response = model.generate_content("Hello, Gemini!")
print(response.text)
Common models
- gemini-2.5-flash: fast; cheap; 1M context
- gemini-2.5-pro: best quality; 2M context
- gemini-2.5-flash-8b: smallest; cheapest
- gemini-2.5-flash-thinking: reasoning mode
Pricing (USD per million tokens)
- Flash: 0.30 out
- Pro: 10 out
- Flash-8b: 0.15 out
Key parameters
- generation_config={"temperature": 0.7, "max_output_tokens": 1024}
- safety_settings={...}: content moderation
- tools=[...]: function calling
- stream=True: streaming
Multimodal (with image)
import PIL.Image
img = PIL.Image.open("path/to/image.jpg")
response = model.generate_content(["Describe this image", img])
Gotchas
- Free tier has rate limits (15 req/min Flash; 2 req/min Pro)
- AI Studio vs Vertex AI β different products, different APIs
- Model names change β verify current
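To stay under a free-tier limit such as 15 requests per minute, calls can be spaced out on the client side. The class below is a minimal sketch of a fixed-interval throttle; its name is hypothetical, and the clock and sleep functions are injectable only so the behaviour can be checked without waiting.

```python
import time
from typing import Callable


class MinIntervalThrottle:
    """Space calls at least 60 / requests_per_minute seconds apart."""

    def __init__(
        self,
        requests_per_minute: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = 60.0 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Schedule the following slot one interval after this one.
        self._next_allowed = max(now, self._next_allowed) + self.interval
        return delay


if __name__ == "__main__":
    throttle = MinIntervalThrottle(requests_per_minute=15)
    for i in range(2):
        slept = throttle.wait()
        print(f"call {i}: waited {slept:.1f}s")
```

Call throttle.wait() immediately before each generate_content call. This does not replace handling 429 responses, since the server-side limit is authoritative.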
Mistral
Install
pip install mistralai
# or
npm install @mistralai/mistralai
Basic call (Python)
from mistralai import Mistral
client = Mistral(api_key="...")
response = client.chat.complete(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
Common models
- mistral-small-latest: fast; cheap
- mistral-medium-latest: balanced
- mistral-large-latest: flagship
- codestral-latest: coding specialised
- mistral-saba-latest: Arabic specialised
Pricing (USD per million tokens)
- Small: 0.30 out
- Medium: ~1.20 out
- Large: 6 out
Gotchas
- EU-hosted (data residency advantage)
- Open-weights models available for self-hosting
- Codestral has commercial licence requirement for some uses
Groq (fastest inference)
Install
pip install groq
# or
npm install groq-sdk
Basic call (Python)
from groq import Groq
client = Groq(api_key="gsk_...")
response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
Common models
- llama-3.3-70b-versatile: Llama 3.3 large
- llama-3.1-8b-instant: Llama small
- mixtral-8x7b-32768: Mixtral
- gemma2-9b-it: Gemma
Pricing
- Generally significantly cheaper than direct OpenAI/Anthropic
- Llama 70B: ~0.79 out per million
Gotchas
- Speed advantage: ~500-800 tokens/sec
- Rate limits apply
- Models are open-weights; quality matches direct hosting
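To put the throughput figure in concrete terms, generation time is roughly output tokens divided by tokens per second. The sketch below is plain arithmetic on the 500-800 tokens/sec range quoted above. It ignores network latency and time to first token, and the function name is hypothetical.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Approximate time to generate a reply, ignoring latency and queueing."""
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    fastest = generation_seconds(1000, 800)
    slowest = generation_seconds(1000, 500)
    print(f"1000 tokens: about {fastest:.2f}s to {slowest:.2f}s")
```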
Common patterns across providers
Streaming (Python pattern)
# OpenAI
for chunk in client.chat.completions.create(model="gpt-4o", messages=[...], stream=True):
    print(chunk.choices[0].delta.content or "", end="")
# Anthropic
with client.messages.stream(model="claude-sonnet-4-6", max_tokens=1024, messages=[...]) as stream:
    for text in stream.text_stream:
        print(text, end="")
# Google
for chunk in model.generate_content("...", stream=True):
    print(chunk.text, end="")
Tool use / function calling
# Define tools
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            },
            "required": ["location"]
        }
    }
}]
# Pass to API
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[...],
    tools=tools
)
JSON mode
# OpenAI - native JSON mode
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[...],
    response_format={"type": "json_object"}
)
# Anthropic - via system prompt
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="Respond only in valid JSON. No markdown.",
    messages=[...]
)
Async (Python)
# OpenAI async
from openai import AsyncOpenAI
client = AsyncOpenAI()
response = await client.chat.completions.create(...)
# Anthropic async
from anthropic import AsyncAnthropic
client = AsyncAnthropic()
response = await client.messages.create(...)
Environment variables (recommended)
Never hardcode API keys. Use environment variables:
.env file
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=AIza...
Load in Python
import os
from dotenv import load_dotenv
from openai import OpenAI
load_dotenv()  # read .env into the process environment
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
Load in Node.js
import 'dotenv/config';
import OpenAI from 'openai';
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
Always
- Add .env to .gitignore
- Never commit keys to git
- Rotate keys if accidentally exposed
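A missing key is easier to diagnose if the program fails at startup with a clear message, instead of passing None to the client and getting a 401 later. The helper below is a hypothetical sketch using only the standard library.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing environment variable: {name}")
    return value


if __name__ == "__main__":
    # DEMO_API_KEY is a placeholder name set here only for illustration.
    os.environ["DEMO_API_KEY"] = "example-value"
    print(len(require_env("DEMO_API_KEY")))
```

In real code this would be called as require_env("OPENAI_API_KEY") after load_dotenv().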
Error handling
Common errors and meanings
- 401 Unauthorized: invalid API key
- 429 Too Many Requests: rate limited
- 500 / 503: provider issue; retry
- 400 Bad Request: your request is invalid
- 402 Payment Required (or insufficient credits): top up
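The list above splits into errors worth retrying (429, 500, 503) and errors that need a fix on your side (400, 401, 402). A minimal sketch of that decision, plus a capped exponential delay, might look like this. The names and the 30-second cap are assumptions, not part of any SDK.

```python
# Status codes the list above describes as transient.
RETRYABLE_STATUS = {429, 500, 503}


def is_retryable(status_code: int) -> bool:
    """True for rate limits and provider-side errors; False for client errors."""
    return status_code in RETRYABLE_STATUS


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based), capped at `cap`."""
    return min(cap, base * (2 ** attempt))


if __name__ == "__main__":
    for code in (401, 429, 503):
        print(code, is_retryable(code))
    print([backoff_delay(n) for n in range(6)])
```

Capping the delay keeps a long retry loop from sleeping for minutes at a time.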
Retry with backoff (Python pattern)
import time
from openai import OpenAI, RateLimitError
def call_with_retry(client, **kwargs):
    # Retry rate-limited calls with exponential backoff (1, 2, 4, 8, 16 seconds)
    for attempt in range(5):
        try:
            return client.chat.completions.create(**kwargs)
        except RateLimitError:
            time.sleep(2 ** attempt)
    raise Exception("Max retries exceeded")
Token counting
Approximate before sending:
tiktoken (OpenAI)
import tiktoken
encoding = tiktoken.encoding_for_model("gpt-4o")
tokens = encoding.encode("Your text here")
print(len(tokens))  # token count
Anthropic counts via API
client.messages.count_tokens(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Your text"}]
)
Rough rule: 1 token ≈ 0.75 words of English text.
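When no tokenizer is available, the rough rule can be turned into a quick estimate by dividing the word count by 0.75. This is only a sketch of that heuristic: the function name is hypothetical, and real counts vary by model and language.

```python
import math

# From the rough rule above: one token is about 0.75 English words.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(text: str) -> int:
    """Estimate the token count of English text from its word count."""
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)


if __name__ == "__main__":
    print(estimate_tokens("The quick brown fox jumps over the lazy dog"))
```

Use tiktoken or the provider's counting endpoint when the number matters, for example near a context limit.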
See also
- openai-api
- claude-api-overview
- get-an-openai-api-key
- get-an-anthropic-api-key
- get-a-google-ai-api-key
- pricing-snapshot
Sources
- OpenAI API docs: platform.openai.com/docs
- Anthropic API docs: docs.anthropic.com
- Google AI Studio docs: ai.google.dev/gemini-api/docs
- Mistral docs: docs.mistral.ai
- Groq docs: console.groq.com/docs