modal

Status: 🟩 COMPLETE 🟦 LIVING
Last updated: 2026-06-26
Plain-English tagline: Serverless GPU / CPU compute for AI. Write Python locally, deploy to cloud GPUs with zero infrastructure setup. Pay per second of execution. The “Vercel of GPU workloads.”

Front-matter facts

Field	Value
Vendor	Modal Labs (San Francisco, USA)
Country / origin	🇺🇸 USA
Recommended for Australian users?	✅ Yes — fully accessible from AUS
Privacy summary	No training on customer data; your code runs in isolated containers
Free tier	US$30/month free compute
Paid tiers	Pay-per-second on top of free tier; Team / Enterprise quoted
First released	2022
Last reviewed	2026-06-26
Official site	https://modal.com

What it is

Modal is serverless GPU / CPU compute for AI workloads. You write Python functions locally, decorate them, and Modal runs them on cloud GPUs (Nvidia A100, H100, L40S, etc.) without you managing servers, containers, or infrastructure.

Example workflow:

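import modal

app = modal.App("my-app")


@app.function(gpu="A100")
def generate_image(prompt: str) -> str:
    # Runs in a cloud container with an A100 GPU when called via .remote().
    # Modal builds the container and installs the dependencies you declare.
    # Placeholder body: a real function would load a model and return image data.
    return f"generated image for: {prompt}"


@app.local_entrypoint()
def main():
    # `modal run my_script.py` runs main() locally and generate_image remotely.
    result = generate_image.remote("a kookaburra")
    print(result)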

Modal handles:

Container building automatically
Dependency installation (pip / apt / etc.)
Cold starts (typically 1-10 seconds)
Auto-scaling (zero to thousands of concurrent executions)
Per-second billing

Use cases:

Run open-source AI models on demand without infrastructure
Batch processing at scale
Fine-tuning models with GPU access
Web apps / APIs with serverless GPU backends
Background jobs for AI processing

What you’d use it for

Self-hosted model inference on cloud GPUs (Llama, Mistral, Whisper, Stable Diffusion, etc.) without server management
Fine-tuning models on your data
Large-scale batch processing of AI workloads
Building AI products with a serverless backend
Running specific open-source models not available on Together / Fireworks
Data science / ML research with cloud GPU access

How to use from Australia

Sign up at modal.com — US$30/month free compute
Install: pip install modal
Authenticate: modal token new
Write Python functions with @app.function() decorators
Deploy: modal deploy my_script.py
Call from local Python or expose as an HTTPS endpoint
AUS card accepted

What it costs

Free tier

US$30/month free compute (substantial — covers many small projects)

Per-second pricing

CPU: very cheap (~US$0.0001/sec for small CPUs)
GPU T4: ~US$0.00018/sec (US$0.65/hour)
GPU A100 80GB: ~US$0.001/sec (US$3.60/hour)
GPU H100 80GB: ~US$0.0024/sec (US$8.60/hour)
Per-second billing means you only pay when functions actually execute
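
The per-second and hourly figures above differ by a factor of 3,600. Here is a minimal sketch of that arithmetic, using the approximate rates listed on this page (estimates, not live prices). The function names and the `RATES_PER_SEC` table are illustrative only.

```python
# Approximate per-second rates in US$, copied from the list above (not live prices).
RATES_PER_SEC = {
    "T4": 0.00018,
    "A100-80GB": 0.001,
    "H100-80GB": 0.0024,
}

FREE_CREDIT_USD = 30.0  # monthly free compute stated on this page


def hourly_rate(gpu: str) -> float:
    """Convert the listed per-second rate to a per-hour rate."""
    return RATES_PER_SEC[gpu] * 3600


def job_cost(gpu: str, seconds: float) -> float:
    """Cost of a job that executes for `seconds` on the given GPU."""
    return RATES_PER_SEC[gpu] * seconds


def free_hours(gpu: str) -> float:
    """GPU-hours the monthly free credit would cover at the listed rate."""
    return FREE_CREDIT_USD / hourly_rate(gpu)


if __name__ == "__main__":
    for gpu in RATES_PER_SEC:
        print(f"{gpu}: ~US${hourly_rate(gpu):.2f}/hour, "
              f"~{free_hours(gpu):.1f} free hours/month")
```

At these rates the free credit covers roughly 46 hours on a T4, about 8 hours on an A100, or about 3.5 hours on an H100.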

Storage

Some persistent volume storage included free
Additional storage charged per GB-month

Hidden costs

GPU containers left running idle for long periods can add up if scaling is misconfigured
Cold starts are fast but real (1-10s); design for them
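
To see how idle time adds up, the sketch below compares paying only for execution with keeping one GPU container warm all day. It assumes idle warm time is billed at the same per-second rate as execution, which this page does not confirm. The rate is the approximate A100 figure from the pricing list, and all names are illustrative.

```python
A100_RATE_PER_SEC = 0.001        # approximate US$/sec from this page
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400


def execution_only_cost(calls: int, seconds_per_call: float,
                        rate: float = A100_RATE_PER_SEC) -> float:
    """Daily cost when the GPU is billed only while functions execute."""
    return calls * seconds_per_call * rate


def always_warm_cost(rate: float = A100_RATE_PER_SEC,
                     seconds: int = SECONDS_PER_DAY) -> float:
    """Daily cost of one container kept warm around the clock (assumed billing)."""
    return rate * seconds


if __name__ == "__main__":
    on_demand = execution_only_cost(calls=500, seconds_per_call=4)
    warm = always_warm_cost()
    print(f"500 calls x 4 s: ~US${on_demand:.2f}/day")
    print(f"one warm A100:   ~US${warm:.2f}/day")
```

Under these assumptions, 500 four-second calls cost about US$2.00 a day, while a single always-warm A100 costs about US$86.40 a day.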

How it compares to alternatives

Aspect	Modal	Lambda Labs	RunPod	CoreWeave	AWS GPU instances
Serverless (no provisioning)	Yes (best)	Limited	Limited	No (rent GPUs)	Limited
Per-second billing	Yes	Per-hour	Per-hour or per-second	Per-hour	Per-second
Auto-scaling	Best	Limited	Limited	Manual	Manual / auto-scaling
GPU types	A100 / H100 / T4 / L40S	Broad selection	Broad selection	Broad enterprise	Broad
Best for	Serverless AI workloads	Cheap rented GPUs	Cheap rented GPUs	Enterprise GPU clusters	AWS-stack

For developers wanting serverless GPU access without managing infrastructure, Modal is the cleanest option. For renting raw GPUs cheaply, RunPod or Lambda Labs is the better fit.
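
The billing-granularity row matters most for short jobs. A minimal sketch, assuming per-hour billing rounds usage up to the next full hour (providers differ on this) and using a hypothetical US$3.60/hour rate for both models so that only the granularity changes:

```python
import math


def billed_per_second(seconds: float, hourly_rate: float) -> float:
    """Charge for exactly the seconds used."""
    return seconds * hourly_rate / 3600


def billed_per_hour(seconds: float, hourly_rate: float) -> float:
    """Charge for whole hours, rounding up (an assumption for illustration)."""
    return math.ceil(seconds / 3600) * hourly_rate


if __name__ == "__main__":
    rate = 3.60  # hypothetical US$/hour, the same for both billing models
    job_seconds = 90
    print(f"per-second: US${billed_per_second(job_seconds, rate):.2f}")
    print(f"per-hour:   US${billed_per_hour(job_seconds, rate):.2f}")
```

For a 90-second job the per-second bill is US$0.09, against US$3.60 under round-up hourly billing. The gap narrows as jobs approach a full hour.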

Privacy / data handling

Code runs in isolated containers per request
No training on customer code
US data centres
For AUS data residency, use AWS / Azure / GCP with AUS regions instead

Recent changes

2026: H200 / Blackwell GPU support
2025: Persistent volumes + sandbox improvements
2024: Major adoption among AI developers

Gotchas

Cold starts are 1-10s typically; design async workflows accordingly
Python-first — for non-Python AI work, less natural
Per-second billing means watch your idle time — don’t leave functions running idle
For inference of common models (Llama, Mistral), Together / Fireworks / Groq are usually simpler than self-hosting on Modal
For Bible-Quest-scale projects, Modal is overkill — Vercel / Supabase covers most needs
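
One way to design around cold starts is to issue requests concurrently so their start-up delays overlap instead of adding up. The sketch below simulates this with plain `asyncio`. `fake_remote_call` is a hypothetical stand-in for a remote function call, not part of Modal's API, and the delays are shortened for the demo. It also assumes each concurrent call gets its own container.

```python
import asyncio
import time

COLD_START_S = 0.2   # stand-in for a 1-10 s cold start, shortened for the demo
WORK_S = 0.05        # stand-in for the actual GPU work


async def fake_remote_call(prompt: str) -> str:
    """Hypothetical remote call: waits for the cold start plus work, then returns."""
    await asyncio.sleep(COLD_START_S + WORK_S)
    return f"result for {prompt!r}"


async def run_sequential(prompts: list[str]) -> list[str]:
    """Await each call in turn, so every cold start is paid back to back."""
    return [await fake_remote_call(p) for p in prompts]


async def run_concurrent(prompts: list[str]) -> list[str]:
    """Start all calls at once, so the cold starts overlap."""
    return list(await asyncio.gather(*(fake_remote_call(p) for p in prompts)))


def timed(coro):
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    prompts = ["kookaburra", "wombat", "quokka", "echidna"]
    _, seq_s = timed(run_sequential(prompts))
    _, con_s = timed(run_concurrent(prompts))
    print(f"sequential: {seq_s:.2f} s, concurrent: {con_s:.2f} s")
```

In the sequential version the total wait grows with the number of calls. In the concurrent version it stays close to the duration of a single call.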
