πŸ‡ΊπŸ‡Έ USA Β· Cloudflare AI Gateway

Status: 🟩 COMPLETE 🟦 LIVING Last updated: 2026-06-26 Plain-English tagline: A proxy that sits between your app and AI providers (OpenAI, Anthropic, Google, etc.) β€” giving you central observability, caching, rate-limiting, and cost-tracking across all your AI calls. Free.


Front-matter facts

FieldValue
VendorCloudflare Inc (San Francisco, USA)
Country / originπŸ‡ΊπŸ‡Έ USA
Recommended for Australian users?βœ… Yes β€” Cloudflare global edge incl AUS
Privacy summaryCloudflare proxies but doesn’t train on data; underlying provider’s privacy posture applies for the actual AI work
Free tierYes β€” completely free (Cloudflare’s positioning)
Paid tiersNone separately β€” bundled with Cloudflare account
First released2024
Last reviewed2026-06-26
Official sitehttps://developers.cloudflare.com/ai-gateway

What it is

Cloudflare AI Gateway is a proxy / observability layer that sits between your app and AI providers (OpenAI, Anthropic, Google, Mistral, Cohere, Hugging Face, Replicate, Groq, etc.). Instead of calling provider APIs directly, you call AI Gateway, which forwards to the provider.

Benefits:

  • Central observability β€” see all your AI calls across providers in one dashboard
  • Cost tracking β€” actual spend across multiple AI vendors
  • Caching β€” cache identical requests (saves money on repeated queries)
  • Rate limiting β€” per-app / per-user rate limits
  • Fallback β€” if one provider is down, route to another
  • Logging / replay β€” capture requests for debugging
  • No vendor lock-in β€” same Gateway works with any provider

Why Cloudflare offers this free: it nudges developers into the Cloudflare ecosystem; Cloudflare upsells Workers AI, R2, D1, Vectorize, etc. once you’re already using their account.


What you’d use it for

  • Multi-provider AI app with central monitoring
  • Cost tracking across multiple AI APIs
  • Caching identical / similar requests to save money
  • Rate limiting per-user / per-app
  • A/B testing between providers (route X% to Claude, Y% to GPT)
  • Fallback when a provider has an outage
  • Centralised logging for AI work
  • Personal projects wanting cost visibility

How to use from Australia

  1. Cloudflare account (free)
  2. Dashboard β†’ AI β†’ AI Gateway β†’ Create Gateway
  3. Get the Gateway URL β€” something like https://gateway.ai.cloudflare.com/v1/{account-id}/{gateway-name}/openai
  4. Replace provider URL in your code with Gateway URL
  5. Calls now flow through Cloudflare; visible in dashboard

Example (drop-in OpenAI replacement):

from openai import OpenAI
client = OpenAI(
    api_key="...",
    base_url="https://gateway.ai.cloudflare.com/v1/{account-id}/{gateway-name}/openai"
)
# Now all calls show up in Cloudflare AI Gateway dashboard

What it costs

Free

  • AI Gateway itself is free
  • You still pay your underlying AI provider (OpenAI, Anthropic, etc.)
  • No Cloudflare-side charge for using Gateway

Optional Cloudflare Workers

  • Pair with Workers Paid (US$5/mo) for deeper integration

How it compares to alternatives

AspectCloudflare AI GatewayVercel AI GatewayPortkeyHeliconeLiteLLM
Free tierFree (no cost)Limited freeLimited freeLimited freeFree (self-hosted)
Multi-provider supportYesYesYesYesYes
ObservabilityYesYesStrongStrongSelf-hosted
CachingYesYesYesYesYes
Best forCloudflare users + free / cheapVercel usersEnterprise observabilityPure observabilitySelf-hosted control

For free observability + cost-control across AI providers, Cloudflare AI Gateway is hard to beat.


Privacy / data handling

  • Cloudflare proxies your requests to providers; standard Cloudflare privacy
  • Underlying AI provider’s privacy posture is what determines whether your data trains models
  • Cloudflare can cache responses (configurable); cached data is encrypted at rest
  • For sensitive data, prefer no-caching configuration + ensure provider has no-train tier

Recent changes

  • 2026: Expanded provider catalog; deeper analytics
  • 2024: Initial launch

Gotchas

  • Caching has implications β€” for personalised responses, caching wrong things can leak user A’s response to user B; configure cache keys carefully
  • Cloudflare-as-intermediary is one more hop; latency adds (~50ms typically)
  • For high-volume production with strict latency SLAs, direct provider calls or Vercel AI Gateway / Portkey may suit better
  • Provider authentication still happens (your API keys to providers still required); Gateway is in addition, not a replacement for provider accounts

See also


Sources