Anthropic Bets Its Future on Running Your AI Agents
From model seller to landlord
On April 8, 2026, Anthropic stopped being just a model company. The San Francisco AI lab launched Claude Managed Agents in public beta, a fully managed infrastructure service that lets developers define an AI agent in a few lines of configuration and deploy it to Anthropic’s cloud — complete with sandboxed code execution, persistent file systems, and built-in tools for bash commands, file operations, and web search. Developers pay standard token pricing for the underlying Claude model plus $0.08 per session-hour of active compute, billed down to the millisecond. Web search adds $10 per thousand queries. That is the entire pricing sheet. No reserved instances, no infrastructure negotiations, no DevOps team required.
The move marks the most consequential strategic shift in Anthropic’s history. Until today, the company sold intelligence by the token through its Messages API, leaving customers to build their own agent loops, sandboxes, and orchestration layers. Now Anthropic is selling the full stack: the model, the runtime, the execution environment, and the operational guardrails. It is the difference between selling engines and selling flights. And it positions Anthropic squarely against not just OpenAI, but every cloud provider and enterprise platform vendor fighting for control of the agentic AI market that Gartner predicts will touch 40 percent of enterprise applications by the end of 2026, up from less than 5 percent in 2025.
The timing is not accidental. Anthropic closed a $30 billion Series G at a $380 billion valuation in February, the second-largest venture deal in history. Its annualized revenue hit $14 billion at time of close and has since accelerated, with run-rate revenue reaching $30 billion by March — a 1,400 percent year-over-year increase fueled overwhelmingly by enterprise adoption. Eight of the Fortune 10 now use Claude. Over 500 customers spend more than $1 million annually, up from a dozen two years ago. The enterprise share of Claude Code revenue crossed 50 percent, and business subscriptions quadrupled since January. Anthropic is not launching Managed Agents because it needs a new product. It is launching Managed Agents because its enterprise customers are begging for one.
The competitive pressure is equally legible. OpenAI launched Frontier in February, an enterprise agent platform that connects to CRM systems, data warehouses, and internal tools — with HP, Oracle, State Farm, and Uber among early customers. Google has its Vertex AI Agent Builder and the Agentspace enterprise product. Microsoft’s Copilot Studio has enabled over 160,000 organizations to create more than 400,000 custom agents. The four largest platform players — Microsoft, Google, Salesforce, and AWS — collectively captured approximately 48 percent of agentic AI revenue in 2025. Anthropic’s response is to bypass the cloud middlemen entirely and offer the runtime itself.
The pain point Anthropic is targeting is real and measurable. Today, an enterprise team deploying Claude as an autonomous agent must build a container orchestration layer, implement sandboxed code execution with appropriate security boundaries, write context management and compaction logic to handle long-running tasks, build error recovery and retry mechanisms, create session persistence so tasks survive disconnections, and instrument the entire pipeline for observability. A typical implementation involves three to five engineers working for two to four months before the first production agent handles its first real task. The operational burden compounds: every model update, every new tool integration, every security audit cycles through the same custom infrastructure. Most enterprise AI teams spend more time maintaining the scaffolding around the model than building the workflows the model actually performs.
Here is the fundamental proposition: Anthropic claims Managed Agents can compress the journey from prototype to production deployment from months to days. The company is betting that enterprise teams would rather configure an agent through a YAML file and an API call than maintain Kubernetes clusters, build sandbox orchestration, and write their own context management logic. If that bet is right, every consulting engagement, every internal platform team, and every DevOps hire that exists to bridge the gap between “Claude can do this” and “Claude is doing this in production” becomes partially or fully redundant. That is a massive addressable market hiding inside the infrastructure bill of every AI-forward enterprise.
Four API calls to your first autonomous agent
The architecture beneath Managed Agents is built on four composable primitives that Anthropic describes in its quickstart documentation. An Agent defines the model, system prompt, tools, MCP servers, and skills. An Environment configures the cloud container — installed packages, network access rules, mounted files. A Session is a running agent instance inside that environment, performing a specific task and generating outputs. Events are the messages flowing between your application and the agent: user instructions in, status updates and tool results out, streamed over server-sent events.
To set up a working agent, you need exactly four API calls. First, install the SDK (`pip install anthropic` for Python) and set your API key as an environment variable. All Managed Agents endpoints require the `managed-agents-2026-04-01` beta header, which the SDK handles automatically. Then create your agent:
```python
from anthropic import Anthropic

client = Anthropic()

agent = client.beta.agents.create(
    name="Coding Assistant",
    model="claude-sonnet-4-6",
    system="You are a helpful coding assistant.",
    tools=[{"type": "agent_toolset_20260401"}],
)
```

The `agent_toolset_20260401` tool type enables the full suite of built-in tools: bash command execution, file read/write/edit, glob pattern matching, grep search, web fetch, and web search. You can disable specific tools by passing a `configs` array, or flip the defaults and enable only what you need. Custom tools are also supported — you define the schema, Claude emits structured requests, your code executes them and sends results back into the session.
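For instance, a `configs` array that switches off the web-facing tools might be built like this. This is a sketch: the `name`/`enabled` field shape mirrors the YAML agent definition shown later in this article, and should be checked against the beta docs.

```python
# Illustrative helper: build an agent_toolset entry with named tools disabled.
# The configs shape (name/enabled) is inferred from the YAML example in this
# article, not independently confirmed against the API reference.
def toolset_config(disabled: list[str]) -> dict:
    return {
        "type": "agent_toolset_20260401",
        "configs": [{"name": name, "enabled": False} for name in disabled],
    }

# An agent that cannot reach the web but keeps bash, file, glob, and grep tools:
offline_tools = toolset_config(["web_search", "web_fetch"])
```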
Next, create an environment that defines the container:
```python
environment = client.beta.environments.create(
    name="production-env",
    config={
        "type": "cloud",
        "networking": {"type": "unrestricted"},
    },
)
```

The environment is a reusable container template. You can pre-install Python, Node.js, Go, or other packages, restrict network access to specific domains, and mount files the agent needs. Once defined, reference it across any number of sessions.
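The quickstart only shows the unrestricted case. A locked-down variant for the domain-restricted setup mentioned above could look something like the following, with the caveat that the `restricted` type and `allowed_domains` field are illustrative guesses, not documented schema.

```python
# Hypothetical sketch of a domain-restricted environment config. The article
# states network access can be limited to specific domains; the exact field
# names below are assumptions for illustration only.
def restricted_env_config(domains: list[str]) -> dict:
    return {
        "type": "cloud",
        "networking": {
            "type": "restricted",        # assumption: counterpart to "unrestricted"
            "allowed_domains": domains,  # assumption: illustrative field name
        },
    }

cfg = restricted_env_config(["pypi.org", "github.com"])
```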
Start a session linking your agent to its environment:
```python
session = client.beta.sessions.create(
    agent=agent.id,
    environment_id=environment.id,
    title="Data analysis task",
)
```

Finally, send a message and stream the agent’s response as it works:
```python
with client.beta.sessions.events.stream(session.id) as stream:
    client.beta.sessions.events.send(
        session.id,
        events=[{
            "type": "user.message",
            "content": [{"type": "text", "text": "Analyze sales.csv and generate a summary report"}],
        }],
    )
    for event in stream:
        if event.type == "agent.message":
            for block in event.content:
                print(block.text, end="")
        elif event.type == "agent.tool_use":
            print(f"\n[Using tool: {event.name}]")
        elif event.type == "session.status_idle":
            print("\nAgent finished.")
            break
```

When you send a user event, the managed infrastructure provisions a container based on your environment configuration, runs Claude’s autonomous agent loop (where the model decides which tools to invoke based on your message), executes those tools inside the sandbox, streams events back to your application in real time, and emits a `session.status_idle` event when there is nothing left to do. The harness handles prompt caching and compaction automatically — performance optimizations that most custom agent implementations either skip or implement poorly, resulting in degraded quality on long-running tasks. Sessions persist through disconnections, meaning a network interruption does not kill a multi-hour analysis job. You can send additional user events to steer the agent mid-execution, or interrupt it to change direction entirely. Rate limits are set at 60 create requests and 600 read requests per minute per organization, with standard tier-based token limits applying underneath.
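The dispatch logic in that loop can be exercised offline. Below is a minimal sketch that folds the same three event types over mock objects standing in for SDK event classes; the event type names come from the streaming example, while the mock dataclasses are purely illustrative.

```python
# Offline sketch of the event-handling loop: same dispatch on event.type,
# run against mock events instead of a live session stream.
from dataclasses import dataclass, field


@dataclass
class Block:
    text: str


@dataclass
class Event:
    type: str
    content: list = field(default_factory=list)  # blocks for agent.message
    name: str = ""                               # tool name for agent.tool_use


def render(events):
    """Fold a stream of session events into displayable lines."""
    lines = []
    for event in events:
        if event.type == "agent.message":
            lines.append("".join(b.text for b in event.content))
        elif event.type == "agent.tool_use":
            lines.append(f"[Using tool: {event.name}]")
        elif event.type == "session.status_idle":
            lines.append("Agent finished.")
            break  # nothing left to do; stop consuming the stream
    return lines


mock_stream = [
    Event("agent.tool_use", name="bash"),
    Event("agent.message", content=[Block("Report written to summary.md")]),
    Event("session.status_idle"),
]
```

Feeding `mock_stream` through `render` yields the tool notice, the message text, and the completion marker, in that order.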
Anthropic also shipped the ant CLI alongside Managed Agents, allowing developers to create agents, environments, and sessions from the command line or through YAML configuration files. A typical YAML agent definition looks like this:
```yaml
name: Code Review Agent
model: claude-sonnet-4-6
system: >
  You review pull requests for correctness, style, and security
  vulnerabilities. Be thorough but concise.
tools:
  - type: agent_toolset_20260401
    configs:
      - name: web_search
        enabled: false
      - name: web_fetch
        enabled: false
```

That file becomes your agent’s single source of truth. Run `ant beta:agents create < agent.yaml` to deploy it. The CLI integrates natively with Claude Code, which means teams already using Claude Code for development can version-control their agent definitions alongside application code, review changes through standard pull request workflows, and deploy agents as part of their existing CI/CD pipeline. This is a meaningful design choice: rather than building a separate management console with a graphical drag-and-drop interface, Anthropic is meeting developers where they already work — in the terminal, in Git, in code review.
The enterprise adoption data is early but concrete. Notion deployed Custom Agents into workspaces, letting engineers ship code while knowledge workers generate presentations and websites — currently in private alpha. Rakuten stood up enterprise agents across product, sales, marketing, finance, and HR within one week per department, plugging into Slack and Teams to accept task assignments and return deliverables like spreadsheets and slide decks. Asana built what it calls AI Teammates — agents that pick up tasks and draft deliverables inside project management workflows. Sentry paired its existing Seer debugging agent with Claude-powered patch writing and pull request automation, shipping the integration in weeks rather than the months its team had initially estimated. In internal testing, Anthropic found that Managed Agents improved structured file generation success rates by approximately 10 percentage points compared to standard prompting approaches, with larger gains on complex tasks requiring multi-step orchestration.
Stitching together the pricing and performance data reveals an original insight: at $0.08 per session-hour plus Sonnet 4.6 token costs ($3 per million input, $15 per million output), a Managed Agent running a 30-minute code review task that consumes roughly 100,000 input tokens and 50,000 output tokens costs approximately $1.09 per session — the agent runtime itself adds only $0.04 to the $1.05 in token spend. For enterprise teams that previously maintained dedicated infrastructure to run Claude agents at scale, the infrastructure cost alone routinely exceeded $50,000 per month in engineering time and cloud compute. If Managed Agents eliminates even half of that overhead for a team running a thousand agent sessions daily, the annual savings approach $300,000 before accounting for faster iteration cycles. The runtime premium is a rounding error. The infrastructure savings are the product.
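The arithmetic behind that per-session figure is easy to reproduce. The sketch below uses only the prices quoted above ($0.08 per session-hour of runtime, Sonnet 4.6 at $3 per million input tokens and $15 per million output tokens):

```python
# Reproducing the article's cost arithmetic for a single agent session,
# using the prices quoted in the text.
RUNTIME_PER_HOUR = 0.08   # managed runtime, per session-hour
INPUT_PER_MTOK = 3.00     # Sonnet 4.6, per million input tokens
OUTPUT_PER_MTOK = 15.00   # Sonnet 4.6, per million output tokens


def session_cost(hours: float, input_tokens: int, output_tokens: int) -> float:
    tokens = (input_tokens / 1e6) * INPUT_PER_MTOK \
           + (output_tokens / 1e6) * OUTPUT_PER_MTOK
    return round(tokens + hours * RUNTIME_PER_HOUR, 2)


# The 30-minute code-review example: 100k input and 50k output tokens.
print(session_cost(0.5, 100_000, 50_000))  # → 1.09 (runtime adds only $0.04)
```

The token spend ($0.30 input + $0.75 output) dominates; the runtime fee is the rounding error the article describes.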
The lock-in question nobody in the beta is asking
Every platform convenience comes with a dependency cost, and Managed Agents is no exception. The moment you define your agent’s system prompt, tool configuration, and environment as Anthropic API objects, you are building on a proprietary runtime that has no open standard equivalent. Your agent definition lives in Anthropic’s system. Your session state persists on Anthropic’s infrastructure. Your tool calls execute in Anthropic’s containers. If pricing changes, rate limits tighten, or the beta introduces breaking changes — all plausible given the managed-agents-2026-04-01 beta header that every request requires — your production workloads are directly exposed.
The contrast with the competitive landscape is instructive. Google’s Agent2Agent (A2A) protocol, announced alongside its Vertex AI Agent Builder updates, explicitly targets multi-vendor agent interoperability — agents built on different platforms communicating through a shared standard. OpenAI’s Frontier positioned itself as an open platform compatible with agents from Google, Microsoft, and Anthropic, making vendor neutrality a selling point. Anthropic’s Managed Agents, by contrast, is a vertically integrated offering. Your agents run Claude models on Anthropic infrastructure using Anthropic tools. There is no escape hatch to swap in a different model provider or migrate your session state to a competitor’s runtime without rebuilding from scratch.
The beta designation itself introduces operational risk that enterprise teams must weigh carefully. Anthropic’s release notes state explicitly that “behaviors may be refined between releases to improve outputs,” which means the tool-use patterns, error-recovery logic, and streaming behavior your application depends on today could change without a deprecation cycle. Multi-agent coordination — arguably the feature most enterprise teams are excited about, where agents spawn and direct other agents to parallelize workloads — is still in research preview with access gated behind a request form. So are the outcomes system (which lets you define success criteria that the agent evaluates against), the memory capability (persistent knowledge across sessions), and the self-evaluation feature that lets agents assess and iteratively refine their own output. The features most likely to differentiate Managed Agents from a hand-rolled agent loop are the ones least likely to be stable.
There is also a data residency dimension. Enterprise customers in regulated industries — healthcare, financial services, government contracting — need to know precisely where their agent sessions execute, what data touches which jurisdiction, and how long artifacts persist after a session ends. Anthropic introduced data residency controls for inference in February 2026, allowing US-only execution at a 1.1x premium. But it remains unclear how those controls apply to the managed container environments that Managed Agents provisions. A financial services compliance team evaluating the platform will have legitimate questions about whether sandboxed code execution in a shared container environment meets SOC 2 and HIPAA requirements — questions the beta documentation does not yet fully address.
The build-versus-buy calculation is also more nuanced than Anthropic’s marketing suggests. Enterprise teams that have already invested six months building custom agent infrastructure — and many have, given that production agent deployments accelerated through late 2025 and early 2026 — face a sunk cost dilemma. Their existing systems handle edge cases specific to their domains, integrate with internal authentication and authorization layers, and have been battle-tested against their particular failure modes. Migrating to Managed Agents means abandoning that investment while potentially losing domain-specific reliability improvements that a general-purpose managed runtime cannot replicate. For teams starting fresh, the value proposition is obvious. For teams with existing infrastructure, the calculus depends on how much of their maintenance burden the managed runtime actually absorbs versus how much domain-specific logic they would need to rebuild on top of it.
The pricing risk is asymmetric. At $0.08 per session-hour, the runtime cost is negligible today. But enterprise software pricing has a well-documented pattern: attract with accessible rates during adoption, then extract as switching costs compound. Anthropic’s current enterprise LLM API market share stands at 40 percent — up from 24 percent — while OpenAI’s share dropped to 27 percent from 50. That market dominance gives Anthropic pricing power it has not yet exercised. The question is when, not whether. Teams building on Managed Agents should assume pricing will change and architect accordingly: abstract the agent definition layer so that the system prompt, tool schemas, and business logic are portable even if the runtime is not.
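One way to follow that advice is to keep the agent definition as plain, provider-neutral data and confine Anthropic-specific shapes to a single adapter. The sketch below is one possible pattern, not a prescribed architecture; the spec fields and mapping are illustrative.

```python
# Portability sketch: a provider-neutral agent spec (which could live in a
# version-controlled YAML file) plus one thin adapter that maps it onto the
# Managed Agents create() arguments. Only the adapter changes if the runtime does.
AGENT_SPEC = {
    "name": "Code Review Agent",
    "model": "claude-sonnet-4-6",
    "system": "You review pull requests for correctness, style, and security.",
    "disabled_tools": ["web_search", "web_fetch"],
}


def to_anthropic_payload(spec: dict) -> dict:
    """Map the neutral spec onto Anthropic's Managed Agents shape."""
    return {
        "name": spec["name"],
        "model": spec["model"],
        "system": spec["system"],
        "tools": [{
            "type": "agent_toolset_20260401",
            "configs": [
                {"name": t, "enabled": False} for t in spec["disabled_tools"]
            ],
        }],
    }


payload = to_anthropic_payload(AGENT_SPEC)
# client.beta.agents.create(**payload)  # the only provider-specific call site
```

If pricing or the API shifts, the system prompt, tool policy, and business logic in `AGENT_SPEC` survive; only `to_anthropic_payload` needs rewriting.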
Your move: what operators should do before signing the session contract
The strategic picture is now clear. The next phase of AI competition is not about who has the best model — the frontier gap between Claude Opus 4.6, GPT-5, and Gemini Ultra has narrowed to single-digit percentage points on most benchmarks. The war is about who owns the orchestration layer: the runtime, the tools, the state management, and the developer patterns that teams build muscle memory around. Managed Agents is Anthropic’s bid to own that layer for the roughly $10.9 billion agentic AI market in 2026, projected to exceed $52 billion by 2030. Whoever controls how agents run controls where enterprise AI dollars flow for the next decade.
The early signals suggest Anthropic has the right product at the right moment. Rakuten deploying departmental agents in one-week sprints versus the months-long integration cycles typical of enterprise AI projects is exactly the compression ratio that accelerates adoption. Sentry shipping Claude-powered pull request automation in weeks instead of quarters demonstrates that the infrastructure abstraction is not merely theoretical. And Anthropic’s 70 percent win rate in head-to-head enterprise matchups against OpenAI means the underlying model quality is strong enough to withstand scrutiny from procurement teams who will inevitably benchmark alternatives.
But the platform’s success depends on variables outside Anthropic’s control. The pace of multi-agent standardization will determine whether enterprises can build cross-vendor agent workflows or remain locked into single-provider ecosystems. Regulatory frameworks for autonomous AI agents — particularly in the EU, where the AI Act’s high-risk classification could apply to agents making consequential business decisions — may impose compliance requirements that a beta platform cannot yet satisfy. And the open-source agent framework ecosystem, led by projects like LangGraph, CrewAI, and AutoGen, continues to mature rapidly. If those frameworks close the convenience gap with Managed Agents while preserving model portability, the value proposition of a proprietary managed runtime becomes harder to justify.
For operators evaluating Managed Agents today, here is what the data says you should do:
- **Start with contained, non-critical workflows.** Rakuten’s playbook — deploy to one department, validate over a week, then expand — is the right pattern. Internal code review, document generation, and data analysis are ideal first targets because failures are recoverable and the value is immediately measurable.
- **Benchmark the cost against your current infrastructure.** Calculate what you spend today on agent runtime infrastructure: cloud compute, container orchestration, engineering time maintaining sandboxes. If that number exceeds a few thousand dollars per month, Managed Agents likely pays for itself on infrastructure savings alone, even before you account for faster iteration speed.
- **Architect for portability from day one.** Keep your system prompts, tool schemas, and business logic in version-controlled configuration files separate from the Anthropic-specific API calls. The `ant` CLI’s YAML-based agent definitions encourage this pattern. If you need to migrate to a different runtime later, the business logic should be lift-and-shift ready.
- **Use the Messages API for latency-critical paths.** Managed Agents excels at long-running, asynchronous tasks that run for minutes or hours. For real-time, user-facing interactions where sub-second latency matters, the direct Messages API remains the better choice. The official documentation explicitly positions these as complementary, not competitive.
- **Monitor the beta closely for breaking changes.** Subscribe to Anthropic’s platform release notes and pin your agent definitions to specific tool versions. The `agent_toolset_20260401` version string is your stability contract. When the next version ships, test against it in a staging environment before promoting to production.
- **Evaluate multi-agent coordination cautiously.** The research preview for spawning and directing child agents is the most exciting feature on the roadmap, but it is also the least mature. Treat it as an R&D experiment, not a production dependency, until Anthropic promotes it to public beta with documented stability guarantees.
Anthropic’s platform play is the clearest signal yet that the AI industry’s value chain is shifting from models to infrastructure. The company that made its name building the most thoughtful language model in the industry is now asking enterprises to trust it with something far more consequential: the operational runtime where AI agents make decisions, execute code, and interact with production systems. The IPO that both Anthropic and OpenAI are racing toward will ultimately be valued not on model benchmarks, but on platform revenue, retention, and the depth of enterprise dependency each company can build before the window closes. Managed Agents is Anthropic’s answer to that dependency question. Whether it becomes the AWS of AI agents or a cautionary tale about premature vertical integration depends entirely on whether the convenience is worth the commitment. For now, at eight cents an hour, the price of finding out is remarkably low.