🇺🇸 USA · Anthropic Computer Use

Status: 🟩 COMPLETE 🟦 LIVING Last updated: 2026-06-26 Plain-English tagline: Anthropic’s API capability that lets Claude see your screen and operate your computer — take screenshots, click buttons, type, scroll. The foundation for AI desktop agents.

Front-matter facts

Field	Value
Vendor	Anthropic (San Francisco, USA)
Country / origin	🇺🇸 USA
Recommended for Australian users?	✅ Yes — via Claude API, accessible globally
Privacy summary	Screen captures sent to Anthropic API for processing; no-training-by-default on API. Use sandboxed environments for sensitive workflows.
Free tier	API has free trial credit; production use is metered
Paid tiers	OpenAI API-style billing — pay per token; vision tokens cost extra
First released	October 2024 (initial public release)
Last reviewed	2026-06-26
Official site	https://docs.claude.com/en/docs/build-with-claude/computer-use

What it is

Computer Use is an Anthropic API capability that gives Claude the ability to:

Take screenshots of a computer screen (mouse + keyboard via the OS)
Click at specific coordinates
Type text
Scroll
Press keyboard shortcuts

In other words: Claude can operate a computer like a human can. The model receives the screenshot, decides what to do (click here, type there), the response is executed in your environment, the new screenshot is sent back, the loop continues.

Important framing: Computer Use is an API capability, not a finished product. Anthropic provides:

The model (claude-sonnet-4-X and others have Computer Use vision tuning)
API access to a special set of tools (computer_20250108, screenshot, etc.)
Reference container images for running it safely

You build the actual product that uses Computer Use. Many third-party products and Anthropic’s own offerings sit on top:

Claude in Chrome (Anthropic’s browser extension) — uses Computer-Use-style operation in the browser
Claude Code Computer Use mode — Claude Code can drive your desktop via Computer Use
Third-party agents — anyone can build computer-operating agents on the API

Analogy: Computer Use is the underlying “see and click” muscle for AI agents. ChatGPT’s Operator and Google’s Project Mariner are competing implementations of the same broad capability — driven by their own models.

What you’d use it for

Automating GUI workflows — applications that have no API but do have UIs
Web browsing as an agent — opening pages, filling forms, downloading files
Cross-application workflows — copy data from app A, paste into app B
Repetitive desktop tasks — anywhere “click here, type there” repeats
Testing UI — visual regression testing, accessibility audits
Building custom agents for proprietary internal applications

What you’d NOT use it for:

Tasks better done via API (always prefer API over UI automation when one exists)
High-security workflows on your main machine (use sandbox)
Real-time critical work (latency of screenshot loops adds up)

How to use it

Via the Anthropic API

Sign up at console.anthropic.com
Use a Computer-Use-capable model (Claude Sonnet 4.x, Opus 4.x, etc.)
Configure the computer_use tool in your API request
Run your agent in a sandboxed environment (Docker container with virtual display, or a dedicated VM, or a remote machine)
The loop: screenshot → model decides action → execute → screenshot → …

Via Claude Code’s Computer Use mode

In Claude Code, enable Computer Use via configuration
Claude Code can drive your local desktop within configured permissions
Recommended: use Anthropic’s reference Docker container for safety

Via third-party products

Multiple agent products are built on Computer Use under the hood — check the product, not the underlying capability

Reference implementation

Anthropic publishes a Docker container at anthropic-quickstarts/computer-use-demo on GitHub with a safe sandboxed Linux environment

What it costs — what you actually get

Anthropic API pricing

Standard token rates apply (input/output tokens)
Vision tokens are real — every screenshot adds image-input tokens; the cost scales with image resolution and frequency
Typical Computer Use session can run US$0.50-5.00 in API costs depending on complexity and image resolution

Cost-control techniques

Lower screenshot resolution when full-res isn’t needed
Limit loop iterations — set max steps
Cache static content — Anthropic supports prompt caching
Use cheaper Computer-Use-capable models when the task allows (Haiku 4.5 can do many Computer Use tasks)

Subscription bundling

Computer Use is NOT included in claude.ai Pro / Max — those are consumer subscriptions for chat
It’s an API-only capability for now

How it compares to alternatives

Capability	Anthropic Computer Use	OpenAI Operator / Agent	Google Project Mariner	Devin
Underlying capability	API + reference container	Product (cloud)	Product (browser)	Product (managed)
Open / self-hosted	Yes (reference Docker image)	No	No	No
Cost model	API token-based	ChatGPT Pro subscription	Google AI Ultra subscription	Per-Devin-instance
Primary surface	Anywhere you run it	chatgpt.com	Chrome browser	devin.ai
Maturity	Released Oct 2024	Generally available 2025	Preview 2024-25	Generally available 2024
For building your own agent	Best — full API + container	Limited	Limited	Limited

If you’re a developer building an AI agent product, Computer Use is the foundation you’d choose. If you’re a consumer wanting an out-of-the-box agent, OpenAI’s Operator (in ChatGPT Pro), Project Mariner (in Google AI Ultra), or Devin are more polished.

Privacy / data handling

Critical to understand: when Claude operates your computer, screenshots of whatever is on the screen are sent to Anthropic’s API. This includes:

Open browser tabs
Email contents visible
Other applications
Any sensitive data visible in any window

Best practice for safety:

Use a sandboxed environment — Anthropic’s reference Docker container, a dedicated VM, a remote machine. NOT your main daily-driver laptop with personal data visible.
Anthropic API does NOT train on Computer Use inputs by default — same no-training policy as the rest of the API
Permissions — limit what the sandbox has access to (no email login, no banking, no production credentials)
Inspect actions before execution for high-stakes operations — Anthropic’s reference implementation supports a “pause-and-confirm” mode

Where data lives: US data centres for standard API; AUS data residency via AWS Bedrock Sydney when running through Bedrock.

Recent changes

2026: Computer Use generally available across more Claude models; quality improvements
2025: Computer Use expanded; reference container and quickstart matured
October 2024: Initial public release of Computer Use capability

Gotchas

Run in a sandbox. This is the single most important safety practice. Don’t point Computer Use at your daily-driver desktop without careful permissions.
Vision tokens add up fast. Set a budget and monitor.
Latency is real — each screenshot-to-decision-to-execute loop takes seconds. Computer Use is for tasks where the alternative is “I’d do this manually,” not for real-time interaction.
Reliability varies by task — modern Computer Use is excellent for simple, well-defined GUI tasks; complex multi-app workflows can fail at unexpected steps.
Don’t combine with sensitive data — assume the model sees everything on screen; act accordingly.
Display resolution matters — Computer Use works best at standard resolutions (1280x800, 1366x768, 1920x1080). Very high-DPI or unusual aspect ratios can confuse coordinate translation.
OS-specific quirks — works on Linux (best supported), macOS, Windows; specifics differ.
Web-only workflows are often better with Claude in Chrome (which is purpose-built for browser) than with full desktop Computer Use.
Computer Use is API-only. No claude.ai-included consumer surface for it. Buy ChatGPT Pro for Operator or Google AI Ultra for Mariner if you want a consumer agent.

Tech & AI, Explained

Explorer

computer-use