🇺🇸 United States · Deepgram — Real-Time Speech AI
Status: 🟩 COMPLETE 🟦 LIVING Section: 10 — AI and LLMs
| Vendor | Deepgram |
| Country/origin | 🇺🇸 United States (San Francisco) |
| Recommended for AUS? | ✅ Yes — US-based; SOC 2 Type II; HIPAA capable; strong speed advantage |
| Privacy summary | AWS hosting; SOC 2 Type II; HIPAA capable; GDPR compliant; audio not used for training; standard enterprise DPA |
| Free tier | ✅ $200 USD free credits on signup |
| Paid tiers | Pay-per-minute API pricing; volume discounts; enterprise |
| First released | 2015 (founded by ex-CERN physicists) |
| Last reviewed | June 2026 |
| Official site | https://deepgram.com |
What it is
Deepgram is a speech AI company known particularly for speed — its API is among the fastest commercial speech-to-text services available, making it ideal for real-time applications like voice agents, live captioning, and conversational AI.
Founded by physicists from CERN (the European particle physics laboratory) who applied their data processing expertise to building a speech recognition system from scratch — not just fine-tuning existing models.
Core capabilities:
- Nova-3: Deepgram’s flagship speech recognition model (fast and accurate)
- Real-time streaming transcription — sub-300ms latency in many cases
- Aura-2 (TTS): Deepgram’s text-to-speech for voice agents
- Voice Agent API: Complete platform for building real-time voice AI agents
- Diarization, sentiment, intent detection
- Custom models for domain-specific vocabularies
What you’d use it for (as a developer)
- Voice AI agents: Customer service bots, phone IVR systems with natural conversation
- Live captioning for events, video streaming, accessibility
- Real-time meeting transcription in collaboration apps
- Voice-driven applications where latency matters
- Customer service call analytics at scale
- Conversational AI products competing with Vapi, Bland AI
How to access from Australia
- Go to https://deepgram.com → Sign up free
- Sign up with email
- Get $200 USD in free credits on signup
- API keys in dashboard
- Use SDKs (Python, Node.js, Go, .NET, Rust)
Basic Python example:
from deepgram import DeepgramClient, PrerecordedOptions
deepgram = DeepgramClient(api_key="your-key")
options = PrerecordedOptions(model="nova-3", smart_format=True)
response = deepgram.listen.rest.v("1").transcribe_url(
{"url": "audio-url"}, options
)
print(response.results.channels[0].alternatives[0].transcript)What it costs
| Service | Price | Notes |
|---|---|---|
| Nova-3 transcription | ~$0.0043/minute | ~$0.26/hour |
| Real-time streaming | ~$0.0059/minute | Live audio |
| Aura-2 TTS | ~$0.030/1,000 characters | Voice synthesis |
| Voice Agent | Combined pricing | Full voice agent |
For comparison: a 1-hour podcast transcription on Nova-3 ≈ 0.40 AUD. Among the cheapest cloud STT options.
How it compares to AssemblyAI
The two leading specialised speech AI platforms compete directly:
| Aspect | Deepgram | AssemblyAI |
|---|---|---|
| Speed | ✅ Fastest | Fast |
| Real-time use | ✅ Specialised | Good |
| Voice Agents | ✅ Native platform | Via LeMUR + integration |
| Audio Intelligence | 🟡 Has features | ✅ Stronger Suite |
| TTS | ✅ Aura-2 | Partner integrations |
| Pricing | ✅ Often cheaper | Slightly higher |
| Accuracy on conversational audio | Excellent | Excellent (slight edge) |
| Free tier | $200 credit | $50 credit |
Decision rule:
- Real-time voice agents → Deepgram (their specialised platform)
- Audio analysis, summarisation, content intelligence → AssemblyAI (Audio Intelligence + LeMUR)
- Cost-sensitive large volume → Deepgram (typically cheaper)
- Accuracy-critical on tough audio → either; benchmark both
Voice Agent API — the key differentiator
Deepgram’s Voice Agent API (2024) is a complete platform for building voice AI:
- STT (Nova-3): Hears the user
- LLM integration: OpenAI, Anthropic, etc. (you choose)
- TTS (Aura-2): Speaks the response
- Conversation orchestration: Turn-taking, interruptions, barge-in
This is everything you need to build a voice agent like the ones at Vapi, Bland AI, Retell — but as raw infrastructure for builders.
For comparison, see real-time-voice-ai for the consumer-facing voice products.
Privacy considerations
- HIPAA capable with BAA
- Audio not used for training Deepgram’s models
- Configurable data retention — enterprise customers can have zero retention
- AWS US hosting primarily; some EU options
For Australian deployments:
- Standard enterprise DPA addresses APP 8 cross-border disclosure
- Disclose AI processing in privacy policies
- Recording consent under Australian state laws
- For healthcare: HIPAA + Australian Privacy Act sensitive information requirements
Australian considerations
- Strong Australian accent handling — Nova-3 performs well on Australian English
- US data hosting — Australian latency for real-time use adds ~150ms vs local, but Deepgram’s processing speed compensates substantially
- Voice agents for Australian businesses: Deepgram is the infrastructure choice for many Australian voice AI deployments
Gotchas
- Real-time pricing is per minute of audio. A 24/7 voice agent listening continuously gets expensive — model your costs carefully.
- Voice Agent platform requires development. It’s not a no-code tool — you write the code that uses Deepgram as infrastructure.
- Custom models cost more. Domain-specific custom models have higher per-minute pricing.
- Speed advantages depend on usage pattern. For batch transcription of recordings, Whisper API may be cheaper. Deepgram’s edge is real-time.
- Free $200 credits expire. Use them within reasonable time after signup.
See also
- assemblyai — main competitor with different strengths
- whisper — OpenAI’s alternative
- speech-to-text — STT overview
- real-time-voice-ai — consumer voice AI products
- elevenlabs — competitor for TTS specifically
- voice-synthesis — TTS overview
Sources
- Deepgram documentation: deepgram.com/docs
- Nova-3 announcement and benchmarks (2024)
- Deepgram Voice Agent API launch (2024)
- Independent benchmarks: ArtificialAnalysis.ai (2024-2026)
- TechCrunch coverage of Deepgram funding and growth (2022-2024)