Amazon Aims Its AI Chips Straight at Nvidia • Stephen Van Tran

The arms dealer steps onto the battlefield

Amazon just signaled that it wants to sell weapons, not just rent them.

For a decade, Amazon Web Services built custom chips for one customer: itself. That posture broke this month. AWS confirmed it is in early talks to sell its Trainium AI accelerators directly to outside data-center operators, a move that would turn an internal cost center into a merchant-silicon business aimed squarely at Nvidia (TechCrunch). AWS AI chief Peter DeSantis confirmed the exploration in a June 18 interview, and an AWS spokesperson reiterated to TechCrunch that the company is now entertaining requests it spent years declining. The shift is small in words and enormous in implication.

The financial backdrop makes the pivot serious rather than speculative. Amazon’s custom-silicon division — Trainium accelerators, Graviton CPUs, and Nitro networking chips — crossed a $20 billion annual revenue run rate in the first quarter of 2026, growing at triple-digit rates year over year (StartupHub). CEO Andy Jassy went further in his April shareholder letter: if the chip unit were a standalone company selling to AWS and third parties the way merchant vendors do, its run rate would be roughly $50 billion, a figure that would rival Intel’s entire annual revenue. That is not a science project. That is a top-five chipmaker hiding inside a retailer.

The stakes are structural. Nvidia owns the AI accelerator market the way Standard Oil once owned refining — roughly 80% share and a data-center business that hit $193.7 billion in fiscal 2026 (Silicon Analysts). Every hyperscaler that builds its own silicon chips away at that dominance from inside its own walls. But selling chips externally is a different act of war. It means Amazon would compete with Nvidia for the same third-party customers, on the same showroom floor, with a part it claims delivers four times the performance of its prior generation at half the cost of a conventional GPU.

The timing also separates this story from the month’s louder headlines. June has been dominated by frontier-model launches, IPO whispers, and talent raids. This is a quieter, deeper signal about who controls the substrate beneath all of it. When I wrote about the data-center power crunch reshaping the AI grid, the bottleneck was electricity. The bottleneck underneath that is the accelerator itself — and Amazon just announced it wants to sell the bottleneck to everyone, not hoard it.

Here is the thesis in one line: the AI economy is shifting from a single arms dealer to a cartel of vertically integrated suppliers, and Amazon’s external-sales gambit is the moment the cartel stops being polite. The question is no longer whether custom silicon can dent Nvidia. It is whether the hyperscalers will turn their captive chip programs into open marketplaces — and what that does to pricing, margins, and the software lock-in that has protected Nvidia for fifteen years.

Follow the silicon, find the margin

The numbers behind Amazon’s confidence are blunt and stackable.

Start with the part itself. Trainium3, which began shipping in late 2025, delivers roughly four times the performance of Trainium2 at about half the cost of a comparable GPU, and it has run near full capacity since launch (CTOL Digital). Demand is not the constraint — supply is. Amazon has disclosed multi-gigawatt commitments from frontier labs, including roughly 2 gigawatts of capacity from OpenAI and up to 5 gigawatts from Anthropic, the latter a partner Amazon has backed with billions in investment (About Amazon). When your biggest AI customers are reserving power measured in gigawatts, the chip is no longer an experiment. It is infrastructure.

Now layer the market structure. Hyperscaler custom silicon — Google’s TPU, AWS Trainium, Microsoft’s Maia, Meta’s MTIA — collectively sits near 15% to 20% of the accelerator market and is climbing fast, with the category compounding at a 44.6% annual rate as inference workloads, now roughly two-thirds of all AI compute, reward purpose-built efficiency over general-purpose GPUs (Introl). Broadcom, the silent partner behind much of this ASIC boom, forecasts $56 billion in AI revenue as custom-chip demand surges (Tech Times). The takeaway: the fastest-growing slice of the most important hardware market is the slice Nvidia does not control.

Amazon’s specific edge is that Trainium is no longer a niche. AWS says Trainium now processes more than half of the token throughput on Bedrock, its managed model platform — meaning the chip already carries production inference at scale, not just training benchmarks (Silicon Analysts). That matters because the external buyer Amazon is courting cares about one thing: will this rack run my workload reliably and cheaply today? A chip that already serves a majority of a major cloud’s inference traffic answers that question with evidence rather than slideware.

Here is an original way to frame the gap. Nvidia operates at roughly a $326 billion revenue run rate. Amazon’s standalone-equivalent chip run rate is about $50 billion. So Amazon, selling almost exclusively to itself, has already built a silicon business about 15% the size of Nvidia’s, without a single external customer, sales channel, or merchant price list. The external-sales move is not Amazon trying to enter the market. It is Amazon discovering it is already a top-tier player and deciding to monetize the surplus. Capturing even a tenth of Nvidia’s $193.7 billion data-center business would add about $19 billion, roughly doubling the $20 billion run rate Amazon’s chip unit reports today.
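The arithmetic behind that comparison can be checked in a few lines. The sketch below uses only figures quoted in this piece. Treating Nvidia's $193.7 billion data-center business as the proxy for its third-party demand is an assumption, and the variable names are illustrative.

```python
# Figures quoted in the article, in billions of USD.
NVIDIA_RUN_RATE = 326.0            # Nvidia revenue run rate
NVIDIA_DATA_CENTER = 193.7         # Nvidia fiscal 2026 data-center revenue
AMAZON_STANDALONE_EQUIV = 50.0     # Jassy's standalone-equivalent chip run rate
AMAZON_REPORTED = 20.0             # chip run rate Amazon actually reports

# Amazon's standalone-equivalent chip business as a share of Nvidia's run rate.
size_vs_nvidia_pct = 100.0 * AMAZON_STANDALONE_EQUIV / NVIDIA_RUN_RATE

# Assumption: "a tenth of Nvidia's demand" means 10% of data-center revenue.
tenth_of_data_center = 0.10 * NVIDIA_DATA_CENTER

# How much that would add relative to each of Amazon's two run-rate figures.
uplift_vs_reported_pct = 100.0 * tenth_of_data_center / AMAZON_REPORTED
uplift_vs_standalone_pct = 100.0 * tenth_of_data_center / AMAZON_STANDALONE_EQUIV

print(f"Amazon vs Nvidia run rate: {size_vs_nvidia_pct:.1f}%")
print(f"A tenth of Nvidia data-center revenue: ${tenth_of_data_center:.1f}B")
print(f"Uplift on the $20B reported run rate: {uplift_vs_reported_pct:.0f}%")
print(f"Uplift on the $50B standalone figure: {uplift_vs_standalone_pct:.0f}%")
```

The "roughly double" framing holds against the reported $20 billion run rate. Measured against the $50 billion standalone-equivalent figure, the same dollars are an increase of a bit under 40%.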

The strategy also rhymes with Amazon’s oldest playbook. AWS itself was born by renting out internal infrastructure Amazon built for its own retail operations. Graviton, its Arm-based CPU line, followed the same arc and now underpins a large share of AWS compute. Selling Trainium racks externally is the third verse of the same song: build it for yourself, prove it at planetary scale, then open the doors and let the margin compound. The company that turned its warehouse logistics into a fulfillment business and its spare servers into a cloud empire is now eyeing its spare silicon capacity as the next platform.

The competitive geometry is what makes this dangerous for Nvidia. Google has kept its TPU largely captive, available through Google Cloud rather than as a merchant part. If Amazon breaks that convention and sells full Trainium racks to independent data centers — including the neoclouds and sovereign-compute projects multiplying worldwide — it normalizes the idea that you can buy frontier-grade accelerators from someone other than Nvidia. The discussions reportedly focus on selling complete racks rather than loose chips, which tells you Amazon understands the real product is an integrated system: silicon, networking, and the software to make them sing (Yahoo Finance). That is the same bundle Nvidia sells. The battle is now system versus system.

There is a demand-side reason the timing favors Amazon. The center of gravity in AI compute is shifting from training to inference, which now accounts for roughly two-thirds of all AI compute (Introl). Inference is where purpose-built ASICs shine: the workloads are more predictable, more cost-sensitive, and less dependent on the bleeding-edge flexibility that makes general-purpose GPUs expensive. A buyer running a steady production model does not need Nvidia’s full programmable surface area; it needs the lowest cost per token at acceptable latency. That is exactly the lane Trainium was designed for, and it is the lane growing fastest. The broader hyperscaler ASIC market reflects the same gravity, with custom chips from Google, Amazon, Microsoft, and Meta increasingly aimed at the inference tier rather than the training frontier (Hashrate Index).

Graviton is the proof of concept that should worry Nvidia most. Amazon’s Arm-based CPU line started as an internal cost play and grew into a mainstream option that now carries a large share of AWS compute, displacing Intel and AMD silicon inside the cloud without those incumbents ever getting a vote (About Amazon). The pattern is identical: build a credible alternative, deploy it at scale internally to drive down cost and prove reliability, then let economics pull customers toward it. Trainium is running that same play one layer up the value chain, in a market with far higher margins and far more strategic weight. If the Graviton arc repeats, Nvidia faces not a single challenger but a structural shift in how the largest buyers of compute think about silicon sourcing.

The ways this assault stalls out

The hardest truth about Amazon’s move is how easily it could amount to nothing.

Start with the word that defines the news: “talks.” DeSantis described early-stage exploration with no firm timeline and no named buyers, and Jassy’s own framing put external rack sales as much as two years away. Amazon has floated chip ambitions before and then retreated to the safety of its cloud model, where it captures the full margin and controls the customer relationship. A press cycle about exploratory discussions is not a product launch. Until a non-Amazon data center is running purchased Trainium racks in production, this remains a signal of intent, not a competitive fact.

Then there is the moat everyone underestimates until they trip over it: software. Nvidia’s true fortress is not its transistors — it is CUDA, the programming layer and ecosystem that fifteen years of developer muscle memory have made the default. Every framework, every kernel, every optimization tutorial assumes Nvidia. Trainium relies on AWS’s Neuron software stack, which is competent inside AWS but unproven as a portable, self-serve toolchain that an external customer can adopt without Amazon’s solutions architects holding the wiring. Selling a rack is easy. Selling the years of software maturity that make the rack productive is the hard part, and it is precisely where merchant challengers have died before.

Channel conflict is the third landmine. AWS’s entire pitch to AI labs is “rent our chips in our cloud.” Selling those same chips to data centers that compete with AWS — neoclouds, sovereign projects, even rivals — undercuts the cloud rental story. Why pay AWS’s cloud margin when you can buy the racks and run them yourself? Amazon would be cannibalizing its highest-margin business to chase a lower-margin hardware one. That tension may be exactly why Amazon spent years declining these requests, and it may be why the company keeps the timeline deliberately vague.

The macro counterpoint is the most sobering: Nvidia keeps winning even as its share erodes. Analysts expect Nvidia’s percentage share to drift from 80% toward 75% by the end of 2026, yet its absolute revenue still climbs because the total accelerator market is expanding faster than any single rival can capture — from roughly $160 billion in 2025 toward $200 billion-plus in 2026 (Silicon Analysts). In a market growing this fast, losing share and gaining dollars are not contradictions. Amazon can build a $50 billion silicon business and Nvidia can still grow. The pie is inflating faster than the slices are being redrawn.
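A quick worked example shows why falling share and rising dollars can coexist. It uses the market sizes and share figures quoted above, takes "$200 billion-plus" at its floor, and treats everything outside Nvidia's share as "rivals". That simplification is an assumption for illustration only.

```python
# Figures quoted in the article; market sizes in billions of USD.
MARKET_2025, SHARE_2025 = 160.0, 0.80
MARKET_2026, SHARE_2026 = 200.0, 0.75   # "$200 billion-plus" taken at its floor

# Nvidia's implied accelerator revenue in each year.
nvidia_2025 = MARKET_2025 * SHARE_2025
nvidia_2026 = MARKET_2026 * SHARE_2026

# Everything else in the market, lumped together as "rivals".
rivals_2025 = MARKET_2025 - nvidia_2025
rivals_2026 = MARKET_2026 - nvidia_2026

# Share Nvidia could fall to in 2026 while keeping its 2025 dollars flat.
flat_revenue_share = nvidia_2025 / MARKET_2026

print(f"Nvidia: ${nvidia_2025:.0f}B -> ${nvidia_2026:.0f}B")
print(f"Rivals: ${rivals_2025:.0f}B -> ${rivals_2026:.0f}B")
print(f"Share at which Nvidia revenue would be flat: {flat_revenue_share:.0%}")
```

On these inputs Nvidia gains about $22 billion while giving up five points of share, and its share would have to fall to roughly 64% before its dollar revenue stopped growing.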

There is also a concentration risk Amazon imports by selling externally. Today, Trainium’s customers are mostly Amazon and its closest partners. Open the doors and Amazon inherits the messy obligations of a merchant vendor: roadmap commitments to customers it does not control, support burdens, supply allocation fights, and the reputational cost of shipping a part that underperforms a buyer’s expectations. Nvidia’s $326 billion run rate buys an enormous support and ecosystem apparatus. Amazon would be building that muscle from a standing start while still serving its own insatiable internal demand — the same demand that has kept Trainium sold out and left little surplus to sell.

Finally, skeptics should note the gap between a benchmark and a balance sheet. “Four times the performance at half the cost” is a vendor claim, measured on workloads Amazon selects. Real buyers run heterogeneous fleets, and total cost of ownership includes migration, retraining, software porting, and the risk of betting a production stack on a second source. Even when the silicon is genuinely good, switching costs are sticky, and the rational enterprise move is often to keep Nvidia as the default and add Trainium at the margin. That hedging behavior caps how fast even a superior challenger can take share. The history of computing is littered with better chips that lost to good-enough incumbents with better ecosystems — Itanium, the Cell processor, and a graveyard of AI-accelerator startups all promised superior silicon and foundered on the software and tooling gap. Amazon’s advantage is that it does not need to win the market to win; it only needs a credible second source to exist, and it is already most of the way there.
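The gap between a benchmark and a balance sheet can be made concrete with a break-even calculation. The sketch below is a deliberately simple model, not a procurement tool: the function name is hypothetical, and every input is an invented placeholder rather than a figure from Amazon, Nvidia, or any buyer.

```python
from typing import Optional


def breakeven_months(
    switching_cost: float,
    incumbent_monthly: float,
    challenger_monthly: float,
) -> Optional[float]:
    """Months until cumulative savings repay the one-time switching cost.

    Returns None when the challenger is not cheaper per month, because the
    switching cost is then never recovered.
    """
    monthly_saving = incumbent_monthly - challenger_monthly
    if monthly_saving <= 0:
        return None
    return switching_cost / monthly_saving


# Hypothetical inputs in millions of USD, not figures from any vendor.
incumbent = 1.0                # current monthly GPU bill
claimed = 0.5 * incumbent      # vendor's "half the cost" claim taken at face value
realistic = 0.8 * incumbent    # same workload if the real saving is only 20%
switching = 3.0                # migration, porting, retraining, dual-running

print(breakeven_months(switching, incumbent, claimed))
print(breakeven_months(switching, incumbent, realistic))
```

With these placeholder numbers, the payback period is six months if the vendor claim holds and about fifteen months if the real saving is 20%. That sensitivity is why buyers tend to add a second source at the margin rather than switch wholesale.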

Where the silicon wars go next

The direction of travel is clear even if the timeline is not: the AI hardware market is fragmenting from a monopoly into an oligopoly, and the operators who plan for that now will pay less and move faster than those who wait. Amazon’s external-sales exploration is one data point in a broader inflection: Google, Microsoft, Meta, and OpenAI are all pushing custom silicon, and the merchant-versus-captive line is starting to blur. The economic prize is too large for it to stay a one-vendor market, and the strategic prize, not being hostage to a single supplier’s allocation and pricing, is too valuable for any hyperscaler to ignore.

The deeper story is vertical integration as the new default. The labs want their own chips, the chipmakers want their own clouds, and the clouds want their own models. I traced one version of this when Bezos backed a $41 billion physical-AI bet, and another when the SpaceX-xAI compute story collided with capital markets. Amazon selling Trainium externally is the same gravitational force expressed in silicon: everyone is trying to own more of the stack, because owning the stack is the only durable hedge against another player’s pricing power. The most original takeaway from stitching these threads together: in 2026, the moat is no longer the model or the chip in isolation — it is controlling the full vertical from electrons to tokens, and Amazon may be closer to that than any company except Nvidia and Google.

For operators, builders, and investors trying to act on this, here is the checklist:

Treat Trainium as a live procurement option, not a thought experiment. With Trainium carrying more than half of Bedrock’s token throughput, the part is production-ready inside AWS today. Benchmark your inference workloads on it now so you have real cost-per-token data before external racks ever ship.
Audit your CUDA dependency. The reason Nvidia’s lead persists is software lock-in. Inventory how much of your stack assumes CUDA, and pilot at least one workload on a non-Nvidia toolchain — Neuron, ROCm, or TPU — so you retain leverage when allocation gets tight.
Model the second-source discount. Use the credible threat of Trainium, TPU, and AMD to negotiate Nvidia pricing. Even if you never switch, a viable alternative is the cheapest leverage you will ever buy.
Separate the headline from the timeline. Amazon’s external sales are exploratory and possibly two years out. Plan capacity on what ships today; treat merchant Trainium as upside, not a baseline assumption.
Watch the rack, not the chip. The real product is the integrated system — silicon, networking, and software. Evaluate challengers on full-rack TCO and operational maturity, not peak FLOPS on a vendor’s favorite benchmark.
Track the channel-conflict tell. If Amazon commits to firm external pricing and named buyers, it signals confidence that hardware margin beats cloud-rental margin — a structural shift worth repositioning around. Continued vagueness means the cloud model still wins internally.
Map the concentration risk in your supply chain. A market moving from one dominant vendor to four is healthier, but each hyperscaler-supplier carries its own roadmap and allocation politics. Diversify deliberately rather than swapping one dependency for another.
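
The CUDA audit in the checklist can start as a simple scan of a code base. The sketch below is a minimal version under stated assumptions: the marker list and the function name are hypothetical, the patterns cover only a few common signs of Nvidia-specific code (PyTorch device calls, CuPy imports, CUDA C headers and kernels), and a real inventory would also need to cover build files, containers, and third-party dependencies.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical marker list: common signs that a file assumes an Nvidia GPU.
# Extend it for your own stack; it is a starting point, not an exhaustive audit.
CUDA_MARKERS = re.compile(
    r"torch\.cuda|\.cuda\(|[\"']cuda(:\d+)?[\"']|import cupy|cuda_runtime\.h|__global__"
)
SOURCE_SUFFIXES = {".py", ".cu", ".cuh", ".cpp", ".h"}


def audit_cuda_usage(root: Path) -> dict[str, int]:
    """Return {relative file path: number of lines matching a CUDA marker}."""
    hits: dict[str, int] = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        count = sum(1 for line in text.splitlines() if CUDA_MARKERS.search(line))
        if count:
            hits[str(path.relative_to(root))] = count
    return hits


# Demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "train.py").write_text(
        "import torch\nmodel = model.cuda()\nx = x.to('cuda:0')\n"
    )
    (root / "utils.py").write_text("def add(a, b):\n    return a + b\n")
    report = audit_cuda_usage(root)

for name, count in report.items():
    print(f"{name}: {count} CUDA-specific line(s)")
```

Pointing `audit_cuda_usage` at a real repository gives a rough per-file count to prioritize which workload to pilot on a non-Nvidia toolchain first. A low count does not prove portability, since frameworks can hide CUDA assumptions below the application code.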

The line to remember: Nvidia still wins the dollar count, but it no longer wins the argument that there is only one place to buy frontier compute. Amazon just made sure of that — and once a market believes there is a second source, the incumbent’s pricing power starts to leak, one rack at a time.

In other news

Anthropic closes funding at a $965B valuation and files confidentially for an IPO — The Claude maker’s new round vaulted it past OpenAI’s private valuation and was paired with a confidential S-1, targeting a fall Nasdaq listing that could raise more than $60 billion. The surge tracks the enterprise momentum I covered when Anthropic overtook OpenAI in U.S. business adoption (Fortune).

OpenAI files confidentially for an IPO, eyeing a September debut — OpenAI submitted a confidential S-1 on June 8 with Goldman Sachs and Morgan Stanley, against a roughly $850 billion private valuation, with analysts expecting an opening-day cap above $1 trillion. A near-simultaneous OpenAI–Anthropic listing race would be the largest tech IPO event in years (CNBC).

Alphabet moves to raise ~$80B for AI compute, including $10B from Berkshire Hathaway — Google’s parent paired $30 billion in public offerings and a $40 billion at-the-market program with a $10 billion Berkshire private placement to fund infrastructure. Warren Buffett’s vehicle backing an AI capex build signals that even value investors now treat compute as critical infrastructure (TechCrunch).

China unveils a $295B, five-year national AI infrastructure plan — The state-directed buildout mandates at least 80% domestic technology and would route the bulk of capacity through China Mobile and China Telecom, effectively locking Nvidia and AMD out of the sovereign compute fabric. The plan lands as Nvidia reported zero Hopper data-center shipments to China last quarter (Capacity).

Google ships Gemini 2.5 Pro with Deep Think reasoning — Google’s flagship adds a “Deep Think” mode that runs parallel reasoning streams and pairs it with a 2-million-token context window, putting it at or near the top of public benchmarks. The cadence keeps pressure on OpenAI and Anthropic across science, math, and reasoning (Google).

Amazon launches expanded Anthropic capacity backing its chip bet — Amazon’s multibillion-dollar Anthropic relationship now anchors up to 5 gigawatts of Trainium capacity, tying its silicon roadmap directly to a frontier lab’s growth. The arrangement gives Amazon both a marquee reference customer and a hedge across the model and hardware layers (About Amazon).