🇺🇸 USA · Anthropic Computer Use
Status: 🟩 COMPLETE 🟦 LIVING Last updated: 2026-06-26 Plain-English tagline: Anthropic’s API capability that lets Claude see your screen and operate your computer — take screenshots, click buttons, type, scroll. The foundation for AI desktop agents.
Front-matter facts
| Field | Value |
|---|---|
| Vendor | Anthropic (San Francisco, USA) |
| Country / origin | 🇺🇸 USA |
| Recommended for Australian users? | ✅ Yes — via Claude API, accessible globally |
| Privacy summary | Screen captures sent to Anthropic API for processing; no-training-by-default on API. Use sandboxed environments for sensitive workflows. |
| Free tier | API has free trial credit; production use is metered |
| Paid tiers | OpenAI API-style billing — pay per token; vision tokens cost extra |
| First released | October 2024 (initial public release) |
| Last reviewed | 2026-06-26 |
| Official site | https://docs.claude.com/en/docs/build-with-claude/computer-use |
What it is
Computer Use is an Anthropic API capability that gives Claude the ability to:
- Take screenshots of a computer screen (mouse + keyboard via the OS)
- Click at specific coordinates
- Type text
- Scroll
- Press keyboard shortcuts
In other words: Claude can operate a computer like a human can. The model receives the screenshot, decides what to do (click here, type there), the response is executed in your environment, the new screenshot is sent back, the loop continues.
Important framing: Computer Use is an API capability, not a finished product. Anthropic provides:
- The model (
claude-sonnet-4-Xand others have Computer Use vision tuning) - API access to a special set of tools (
computer_20250108,screenshot, etc.) - Reference container images for running it safely
You build the actual product that uses Computer Use. Many third-party products and Anthropic’s own offerings sit on top:
- Claude in Chrome (Anthropic’s browser extension) — uses Computer-Use-style operation in the browser
- Claude Code Computer Use mode — Claude Code can drive your desktop via Computer Use
- Third-party agents — anyone can build computer-operating agents on the API
Analogy: Computer Use is the underlying “see and click” muscle for AI agents. ChatGPT’s Operator and Google’s Project Mariner are competing implementations of the same broad capability — driven by their own models.
What you’d use it for
- Automating GUI workflows — applications that have no API but do have UIs
- Web browsing as an agent — opening pages, filling forms, downloading files
- Cross-application workflows — copy data from app A, paste into app B
- Repetitive desktop tasks — anywhere “click here, type there” repeats
- Testing UI — visual regression testing, accessibility audits
- Building custom agents for proprietary internal applications
What you’d NOT use it for:
- Tasks better done via API (always prefer API over UI automation when one exists)
- High-security workflows on your main machine (use sandbox)
- Real-time critical work (latency of screenshot loops adds up)
How to use it
Via the Anthropic API
- Sign up at
console.anthropic.com - Use a Computer-Use-capable model (Claude Sonnet 4.x, Opus 4.x, etc.)
- Configure the computer_use tool in your API request
- Run your agent in a sandboxed environment (Docker container with virtual display, or a dedicated VM, or a remote machine)
- The loop: screenshot → model decides action → execute → screenshot → …
Via Claude Code’s Computer Use mode
- In Claude Code, enable Computer Use via configuration
- Claude Code can drive your local desktop within configured permissions
- Recommended: use Anthropic’s reference Docker container for safety
Via third-party products
- Multiple agent products are built on Computer Use under the hood — check the product, not the underlying capability
Reference implementation
- Anthropic publishes a Docker container at
anthropic-quickstarts/computer-use-demoon GitHub with a safe sandboxed Linux environment
What it costs — what you actually get
Anthropic API pricing
- Standard token rates apply (input/output tokens)
- Vision tokens are real — every screenshot adds image-input tokens; the cost scales with image resolution and frequency
- Typical Computer Use session can run US$0.50-5.00 in API costs depending on complexity and image resolution
Cost-control techniques
- Lower screenshot resolution when full-res isn’t needed
- Limit loop iterations — set max steps
- Cache static content — Anthropic supports prompt caching
- Use cheaper Computer-Use-capable models when the task allows (Haiku 4.5 can do many Computer Use tasks)
Subscription bundling
- Computer Use is NOT included in claude.ai Pro / Max — those are consumer subscriptions for chat
- It’s an API-only capability for now
How it compares to alternatives
| Capability | Anthropic Computer Use | OpenAI Operator / Agent | Google Project Mariner | Devin |
|---|---|---|---|---|
| Underlying capability | API + reference container | Product (cloud) | Product (browser) | Product (managed) |
| Open / self-hosted | Yes (reference Docker image) | No | No | No |
| Cost model | API token-based | ChatGPT Pro subscription | Google AI Ultra subscription | Per-Devin-instance |
| Primary surface | Anywhere you run it | chatgpt.com | Chrome browser | devin.ai |
| Maturity | Released Oct 2024 | Generally available 2025 | Preview 2024-25 | Generally available 2024 |
| For building your own agent | Best — full API + container | Limited | Limited | Limited |
If you’re a developer building an AI agent product, Computer Use is the foundation you’d choose. If you’re a consumer wanting an out-of-the-box agent, OpenAI’s Operator (in ChatGPT Pro), Project Mariner (in Google AI Ultra), or Devin are more polished.
Privacy / data handling
Critical to understand: when Claude operates your computer, screenshots of whatever is on the screen are sent to Anthropic’s API. This includes:
- Open browser tabs
- Email contents visible
- Other applications
- Any sensitive data visible in any window
Best practice for safety:
- Use a sandboxed environment — Anthropic’s reference Docker container, a dedicated VM, a remote machine. NOT your main daily-driver laptop with personal data visible.
- Anthropic API does NOT train on Computer Use inputs by default — same no-training policy as the rest of the API
- Permissions — limit what the sandbox has access to (no email login, no banking, no production credentials)
- Inspect actions before execution for high-stakes operations — Anthropic’s reference implementation supports a “pause-and-confirm” mode
Where data lives: US data centres for standard API; AUS data residency via AWS Bedrock Sydney when running through Bedrock.
Recent changes
- 2026: Computer Use generally available across more Claude models; quality improvements
- 2025: Computer Use expanded; reference container and quickstart matured
- October 2024: Initial public release of Computer Use capability
Gotchas
- Run in a sandbox. This is the single most important safety practice. Don’t point Computer Use at your daily-driver desktop without careful permissions.
- Vision tokens add up fast. Set a budget and monitor.
- Latency is real — each screenshot-to-decision-to-execute loop takes seconds. Computer Use is for tasks where the alternative is “I’d do this manually,” not for real-time interaction.
- Reliability varies by task — modern Computer Use is excellent for simple, well-defined GUI tasks; complex multi-app workflows can fail at unexpected steps.
- Don’t combine with sensitive data — assume the model sees everything on screen; act accordingly.
- Display resolution matters — Computer Use works best at standard resolutions (1280x800, 1366x768, 1920x1080). Very high-DPI or unusual aspect ratios can confuse coordinate translation.
- OS-specific quirks — works on Linux (best supported), macOS, Windows; specifics differ.
- Web-only workflows are often better with Claude in Chrome (which is purpose-built for browser) than with full desktop Computer Use.
- Computer Use is API-only. No claude.ai-included consumer surface for it. Buy ChatGPT Pro for Operator or Google AI Ultra for Mariner if you want a consumer agent.
See also
- Claude Code deep dive 🟩 🟦
- Claude Skills 🟩 🟦
- Claude Cowork 🟩 🟦
- Claude in Chrome 🟥 — browser-specific
- Claude Remote Desktop 🟥 — sandboxed cloud machine
- Claude Agent SDK 🟥
- Claude API overview 🟩 🟦
- Claude models 🟩 🟦
- Agent) 🟩 🟦 — competing implementation
- Astra) 🟩 🟦 — competing implementation
- Devin 🟥 — third-party autonomous agent
- Browser-use (open source) 🟥
- What “agents” really mean 🟥