Apple has a Siri problem, and it has just committed a billion dollars a year to someone else’s brain to solve it. In January 2026, Apple and Google formalized a multiyear partnership reportedly worth $1 billion annually to replace Siri’s aging intelligence layer with Google’s 1.2-trillion-parameter Gemini model — an eightfold leap from Siri’s current 150-billion-parameter architecture. The deal was supposed to deliver a reimagined, context-aware Siri through an iOS 26.4 update targeted for March 2026. Instead, on February 17, Apple released the iOS 26.4 beta without a single new Siri feature. The most expensive AI upgrade in consumer technology history is running behind schedule, and the competitive implications are brutal. Samsung has already shipped its Galaxy S26 with Google’s Gemini fully operational. Google itself is pushing Gemini into every Android surface it controls. Apple, the company that once defined what a smartphone assistant could be, is now the last major platform holder to deliver one that actually works.
The stakes extend far beyond embarrassment. Apple’s install base of 2.35 billion active devices represents the largest captive audience in consumer technology, and the company’s services revenue — which topped $26 billion in Q1 2026 — depends increasingly on Siri functioning as the intelligent gateway between users and the subscription economy. Every day Siri remains a glorified timer-setter is a day Apple leaves billions in potential engagement on the table. The billion-dollar question is whether Apple’s legendary control over its ecosystem will survive the reality of outsourcing its most intimate user interface to its oldest rival, and whether the delay signals engineering caution or something structurally more concerning. As we explored in our earlier analysis of Apple’s plan to make Siri a full AI chatbot, the ambition has always been enormous. The execution is where the story gets complicated.
A billion-dollar brain transplant and the surgery that won’t end
The architecture of the Apple-Google deal reveals a transaction far more complex than a standard licensing agreement. Apple is not simply plugging Gemini into Siri the way you might swap a graphics card. The company negotiated a custom version of Google’s Gemini model specifically designed to run within Apple’s Private Cloud Compute environment — a secure server infrastructure built on Apple Silicon that acts as a privacy buffer between users and Google’s systems. Under this arrangement, Gemini handles Siri’s most computationally demanding tasks: the summarizer function that synthesizes information from across the user’s device, and the planner function that decides how to execute complex, multi-step requests. Simpler queries continue to run on Apple’s in-house models directly on the device.
This three-tier inference architecture — on-device for basic tasks, Apple’s Private Cloud Compute for intermediate processing, and Google’s Gemini for the heaviest lifting — is an engineering marvel on paper. Apple confirmed in late January that all Siri interactions routed to Gemini are anonymized, processed on sealed Apple Silicon server nodes with stateless runtime controls, and never stored or used to train Google’s models. The privacy guarantees are real and architecturally enforced rather than policy-based — a meaningful distinction that separates Apple’s approach from every other major AI assistant on the market. Data processed through Private Cloud Compute runs on hardware that uses attestation checks and cryptographic verification to ensure that no Apple employee, no government subpoena, and no rogue insider can access the content of a query in transit. The system is designed to minimize the amount of data shared with any cloud layer in the first place, processing the majority of interactions entirely on the device’s Neural Engine before escalating only the most complex requests up the stack.
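To make the escalation logic concrete, here is a minimal sketch of how a three-tier router along these lines could work. It is an illustration under stated assumptions, not Apple’s implementation: the tier names, the word-count heuristic, and the planner/summarizer flags are all invented for the example.

```python
from enum import Enum, auto

class Tier(Enum):
    ON_DEVICE = auto()      # Apple's in-house models on the Neural Engine
    PRIVATE_CLOUD = auto()  # Private Cloud Compute on Apple Silicon servers
    GEMINI = auto()         # custom Gemini behind the PCC privacy buffer

def route_query(query: str, needs_planner: bool, needs_summarizer: bool) -> Tier:
    """Escalate a request only as far up the stack as it demands."""
    if needs_planner or needs_summarizer:
        # The heaviest lifting -- multi-step planning and cross-device
        # summarization -- goes to the custom Gemini deployment.
        return Tier.GEMINI
    if len(query.split()) > 20:  # invented complexity heuristic
        return Tier.PRIVATE_CLOUD
    # Timers, simple lookups, and dictation stay entirely local.
    return Tier.ON_DEVICE

print(route_query("set a timer for ten minutes", False, False))  # Tier.ON_DEVICE
```

The ordering is the point of the design: resolve a request at the cheapest, most private tier first, and escalate only when it genuinely requires frontier-scale reasoning.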
But architectural elegance does not guarantee shipping dates, and the complexity of coordinating three inference layers across billions of devices appears to be exactly where the schedule fell apart.
Bloomberg reported on February 11 that Apple’s internal testing had uncovered significant reliability issues: Siri sometimes fails to process queries correctly and takes too long to respond to requests that involve cross-app actions. The on-screen awareness feature — which allows Siri to see and understand what is displayed on the user’s screen — works inconsistently across different app contexts. Image generation and web search capabilities, which were being tested internally for inclusion in iOS 26.4, may be pushed entirely to iOS 26.5 in May or iOS 27 in September. Apple quickly reaffirmed that the revamped Siri will still launch in 2026, but the company has never publicly committed to a date more specific than “this year” — a rhetorical safety valve that allows for a staggered rollout stretching into autumn.
The financial subtext is equally telling. During Google’s Q4 2025 earnings call in early February, analysts pressed management for details on the Apple deal’s revenue impact, signaling that Wall Street views the partnership as a material event for both companies. For Google, the arrangement is straightforward: it gets paid a billion dollars a year while simultaneously establishing Gemini as the default intelligence layer across the two largest smartphone ecosystems on Earth. For Apple, the calculus is more precarious. The company is simultaneously paying its most formidable competitor for the core technology powering its most visible product feature while betting that the white-label arrangement — no Google branding, no visible Gemini references — will preserve the illusion of Apple’s AI independence. The deal is Siri’s brain transplant, and the donor is the same company that spent two decades trying to replace Apple in the consumer technology hierarchy.
Here is the quantitative insight that no single press release captures: if you combine Samsung’s target of 800 million Gemini-equipped mobile devices with Apple’s 2.35 billion active devices running a Gemini-powered Siri, Google’s AI model will influence the assistant experience on more than 3 billion consumer endpoints by the end of 2026. That is roughly 37% of the global population, and it means Google has quietly achieved what no antitrust regulator explicitly approved: default AI distribution across both of the world’s dominant mobile ecosystems. Whether you hold an iPhone or a Galaxy, the intelligence answering your questions will be Gemini. The branding changes. The brain does not.
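The arithmetic behind that claim is easy to check; the only assumption added below is a global population of roughly 8.1 billion in 2026.

```python
# Back-of-envelope check on the distribution claim.
samsung_target = 800_000_000      # Samsung's Gemini-equipped device target
apple_active   = 2_350_000_000    # Apple's active device base
world_pop      = 8_100_000_000    # assumed global population, 2026

endpoints = samsung_target + apple_active
print(f"{endpoints / 1e9:.2f}B endpoints")                      # 3.15B
print(f"{3_000_000_000 / world_pop:.0%} of global population")  # 37%, using the conservative 3B figure
```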
The invisible strings attached to Google’s brain
The most immediate counterargument to the Apple-Google partnership is the one that nobody at Apple Park wants to discuss publicly: dependency. Apple has built its entire brand identity on vertical integration — the thesis that controlling hardware, software, and services end-to-end produces a superior product experience. The Gemini deal fundamentally breaks that chain. If Google decides to prioritize its own products over Apple’s custom Gemini deployment, Apple has no recourse beyond the contractual terms of the partnership agreement. If Google’s model quality stagnates or is surpassed by a competitor, Apple cannot simply retrain its way out of the problem because it does not own the underlying model architecture. The company that famously refused to depend on Samsung for displays and Intel for chips has voluntarily placed its most important software feature in the hands of a rival.
The regulatory landscape adds additional risk. Japan’s Mobile Software Competition Act, which took effect in late 2025, already requires Apple to allow users to set third-party voice assistants as the default on the iPhone’s side button. The European Union’s Digital Markets Act imposes similar interoperability requirements across the bloc. If Siri’s Gemini-powered upgrade arrives late while competitors like Amazon’s Alexa+ and Google’s own first-party assistant continue to improve, Apple faces the perverse scenario of spending a billion dollars a year to make Siri competitive only to have regulators ensure that users can easily replace it with something better. The window of opportunity for a transformative Siri relaunch is measured in months, not years, and every delay narrows it further.
Samsung’s strategy sharpens the competitive threat. The Galaxy S26, launched in late February with a triple-engine AI architecture, routes different types of queries to different AI backends: Google Gemini for agentic tasks like booking rides and acting across apps, Perplexity for web-based research queries, and Samsung’s upgraded Bixby for on-device processing. This modular approach means Samsung can swap or upgrade individual components without overhauling the entire system — a flexibility that Apple’s tightly integrated Gemini architecture does not currently offer. Samsung has become the single most important distribution channel for Google’s consumer AI, and it is shipping features today that Apple has not yet delivered even in beta form. The gap is not large in absolute terms, but it is symbolically devastating for a company that spent a decade positioning itself as the premium technology brand.
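A sketch of what that modularity looks like in code may help. The handler names mirror Samsung’s three engines, but the classifier and interfaces below are invented for illustration; in the shipping product each entry would be an API client, not a lambda.

```python
from typing import Callable, Dict

# Illustrative handlers standing in for real API clients.
BACKENDS: Dict[str, Callable[[str], str]] = {
    "agentic":  lambda q: f"[Gemini] acting on: {q}",      # cross-app actions
    "research": lambda q: f"[Perplexity] searching: {q}",  # web research
    "local":    lambda q: f"[Bixby] on-device: {q}",       # private, offline
}

def classify(query: str) -> str:
    """Toy intent classifier; a real system would use a learned model."""
    if any(verb in query.lower() for verb in ("book", "order", "send")):
        return "agentic"
    if query.rstrip().endswith("?"):
        return "research"
    return "local"

def handle(query: str) -> str:
    return BACKENDS[classify(query)](query)

print(handle("book me a ride to the airport"))  # routes to the agentic backend
```

The property that matters is the uniform interface: upgrading or replacing one engine means swapping a single table entry, which is exactly the flexibility the tightly coupled Siri-Gemini stack does not currently offer.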
The delay also raises questions about Apple’s internal AI capabilities. The company has invested heavily in on-device machine learning through its Neural Engine and Apple Silicon custom chips, a strategy we examined in detail in Apple’s AI endgame running on the desk rather than the cloud. But the Gemini partnership implicitly acknowledges that Apple’s in-house large language model research has not kept pace with the frontier labs. Apple’s existing 150-billion-parameter model, while competent for tasks like text summarization and basic question answering, lacks the reasoning depth, context window, and multimodal capabilities that users now expect from an AI assistant. The decision to license Gemini rather than build an equivalent model internally is a pragmatic concession to reality — but it is also an admission that the most valuable company on Earth could not build the most capable AI model on Earth, and had to buy access from a competitor instead.
The talent dynamics underscore the challenge. Google DeepMind employs thousands of researchers who have spent years building Gemini’s architecture, training pipelines, and reinforcement learning systems. Apple’s machine learning team, while strong in areas like on-device inference optimization and computational photography, has never demonstrated the capacity to train a frontier-scale foundation model from scratch. The company’s AI research publications, while respected, are a fraction of the volume produced by Google, OpenAI, or Anthropic. Apple chose to buy rather than build because the alternative — spending three to five years and billions of dollars assembling a competitive research organization from scratch — would have left Siri irrelevant by the time the model was ready. The Gemini deal is a calculated trade: accept dependency today to achieve competitiveness tomorrow, and hope that the terms of the partnership remain favorable long enough for Apple’s own AI capabilities to mature.
The privacy architecture, while genuinely impressive, introduces its own set of concerns. Private Cloud Compute ensures that user data is processed on sealed Apple Silicon servers and never stored, but it does not eliminate the fundamental reality that Apple’s most sensitive user interactions — queries about health, finances, relationships, and daily routines — are now being processed by a model built by Google. The anonymization layer strips personally identifiable information before queries reach Gemini, but the model still processes the semantic content of those queries to generate responses. Security researchers have already begun questioning whether the stateless runtime design can withstand sophisticated side-channel attacks, and any future breach — however unlikely — would carry existential reputational risk for a company whose brand is built on privacy as a fundamental human right.
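To see why anonymization does not equal opacity, consider a deliberately crude sketch of a PII-stripping pass. Apple’s actual anonymization is enforced inside Private Cloud Compute, not by regexes like these; the patterns are purely illustrative.

```python
import re

# Crude stand-in for an anonymization layer; real enforcement happens
# inside the Private Cloud Compute stack, not in a regex pass like this.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def scrub(query: str) -> str:
    """Replace identifiable tokens before a query leaves the privacy buffer."""
    for label, pattern in PII_PATTERNS.items():
        query = pattern.sub(f"<{label}>", query)
    return query

print(scrub("remind me to email jane.doe@example.com about my 555-014-2398 bill"))
# -> remind me to email <email> about my <phone> bill
```

Notice what survives the scrub: the intent, the topic, and the relationship between the entities. That semantic residue is precisely what Gemini must see to be useful, and precisely the exposure the paragraph above describes.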
What March’s stumble means for the rest of the year
The timing of Apple’s Siri delay coincides with one of the most aggressive hardware launch weeks in the company’s history. On March 4, Apple is expected to unveil a refreshed MacBook Air with the M5 chip, an iPad Air with M4, and the iPhone 17e — three products designed to expand Apple Intelligence to the broadest possible range of price points. The base iPad with its A18 chip will support Apple Intelligence for the first time, bringing AI features to Apple’s most affordable tablet. The M5 MacBook Air promises a 15% CPU and 30% GPU improvement over M4, with a 10-core Neural Engine optimized for local AI inference. Every one of these devices is being marketed on its AI capabilities, yet the flagship AI feature they are supposed to showcase — the reimagined Siri — will not be ready when customers open the box.
This creates a peculiar marketing paradox. Apple will sell millions of “AI-ready” devices in March and April whose most visible AI feature remains the old Siri: the same assistant that fumbles multi-step commands, cannot remember context between requests, and lacks the on-screen awareness that was supposed to be its defining upgrade. The iPhone 17 series, which drove $85.3 billion in revenue last quarter thanks to a strategic jump to 12GB of RAM across the Pro lineup specifically to handle Apple Intelligence workloads, was marketed as the first iPhone generation designed from the silicon up for AI. Yet buyers who purchased the hardware on that promise are still waiting for the software to justify it. Apple is essentially asking customers to buy hardware today on the promise of software tomorrow — a strategy that works only if “tomorrow” arrives before the novelty of the purchase wears off and before Samsung and Google capture the attention of cross-shopping buyers.
The two-phase rollout plan offers a rough timeline for what comes next. Phase 1, now likely targeting iOS 26.5 in May rather than 26.4 in March, is expected to deliver the core Siri upgrades: personal context awareness, on-screen understanding, and deeper per-app integration. Phase 2, arriving with iOS 27 in September, promises full conversational AI capabilities — the ability to sustain extended multi-turn dialogues, reason through complex requests, and coordinate actions across multiple applications simultaneously. This phased approach mirrors the strategy Apple employed when it launched Apple Intelligence itself in late 2024, dripping features across successive software updates rather than delivering a complete product on day one. The approach reduces technical risk but creates a competitive vacuum that rivals are eager to fill.
For developers, enterprise buyers, and investors watching the AI assistant space, the key metrics to track over the next six months are not benchmarks but adoption curves. When Siri’s Gemini-powered features finally launch, Apple will control whether they appear to 100% of eligible devices immediately or roll out gradually by region and device tier. The speed of that rollout will determine whether the Gemini partnership generates meaningful engagement data quickly enough to justify the billion-dollar annual investment, or whether the delayed launch has already allowed Samsung and Google to establish usage patterns that are difficult to reverse. The winner of the AI assistant race will not be the company with the best model — it will be the company that converts the most people into daily active assistant users, and right now Apple is not even at the starting line.
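For a mental model of what a gradual, tiered rollout gate looks like, here is a minimal sketch using the deterministic-hash pattern common to feature-flag systems; the region keys and percentages are invented, not Apple’s.

```python
import hashlib

# Hypothetical staged-rollout gate; cohort fractions are illustrative.
ROLLOUT = {"US": 0.10, "EU": 0.05, "JP": 0.05}  # fraction of devices enabled

def bucket(device_id: str) -> float:
    """Map a device to a stable value in [0, 1) so its cohort never flips."""
    digest = hashlib.sha256(device_id.encode()).hexdigest()
    return int(digest[:8], 16) / 0x1_0000_0000

def new_siri_enabled(device_id: str, region: str) -> bool:
    return bucket(device_id) < ROLLOUT.get(region, 0.0)

print(new_siri_enabled("A1B2-C3D4", "US"))
```

Watching how fast those enablement fractions climb, rather than model benchmark scores, is the adoption signal that will actually decide the race.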
The broader lesson of Apple’s Siri saga is that money cannot buy execution speed in AI. A billion dollars a year buys you access to the most capable model on the planet, but it does not buy you the systems engineering, quality assurance, and cross-platform integration work required to ship that model reliably to 2.35 billion devices. Apple’s delay is not a failure of ambition or investment — it is a failure of the implicit assumption that AI is a component you can bolt onto an existing product rather than a capability that must be woven into every layer of the software stack from the ground up. The companies winning the AI assistant race — Google with Gemini natively integrated into Android, Samsung with its triple-engine architecture shipping on day one — built their AI strategies around the model from the start rather than retrofitting a legacy system with someone else’s intelligence after the fact. Apple chose the latter path, and the schedule is telling the truth about how much harder that path is.
- Watch the May iOS 26.5 update: The first Siri features to ship will reveal how much of the original vision survived the testing gauntlet and whether Apple has prioritized reliability over feature completeness.
- Track Samsung’s Gemini engagement numbers: If Galaxy S26 users adopt AI assistant features at scale before Apple ships, the competitive window narrows dramatically regardless of Siri’s eventual quality.
- Monitor Apple’s WWDC 2026 keynote in June: The event will reveal how much of Phase 2 (iOS 27, full conversational Siri) is ready for developer preview, and whether Apple’s in-house model investments are closing the gap with Gemini.
- Follow the regulatory calendar: Japan and EU rules allowing third-party default assistants will shape whether Siri’s relaunch faces a competitive market or a protected one, with March-to-September policy enforcement timelines running in parallel with Apple’s software schedule.
- Evaluate Google’s leverage: Every earnings call from Alphabet will now include questions about the Apple deal’s financial contribution. If Google begins extracting more favorable terms at renewal, the power balance in the partnership will shift visibly.
In other news
AMD expands AI PC lineup at MWC 2026 — AMD announced the Ryzen AI 400 Series desktop processors at Mobile World Congress, combining Zen 5 cores with a dedicated XDNA 2 NPU for local AI acceleration. The mobile variants deliver up to 30% faster multithreaded performance than competing processors, with OEM partners Dell, HP, and Lenovo expected to ship workstations by Q2 2026.
Zhipu AI ships GLM-5, China’s first frontier model on domestic chips — The Beijing-based lab released a 744-billion-parameter mixture-of-experts model trained entirely on Huawei Ascend chips using the MindSpore framework, achieving 50.4% on Humanity’s Last Exam and 77.8% on SWE-bench Verified. GLM-5 is expected to ship under the MIT license, marking a significant milestone in China’s effort to build frontier AI without Nvidia silicon.
Nvidia Vera Rubin enters full production — Nvidia’s next-generation AI system combines 72 Rubin GPUs with 36 Vera CPUs in a single rack delivering 10 times the performance per watt of Grace Blackwell, with estimated rack pricing between $3.5 million and $4 million. AWS, Google Cloud, Microsoft, and OCI will be among the first to deploy Vera Rubin instances in H2 2026.
Snowflake and OpenAI forge $200 million enterprise AI partnership — Snowflake committed up to $200 million to purchase access to OpenAI’s frontier models and ChatGPT Enterprise, enabling Snowflake’s 12,600 customers to build and deploy context-aware AI agents grounded in their enterprise data without requiring coding experience.