Stephen Van Tran

Hidden strings in an Android beta are not supposed to provoke existential questions about the web—yet that is exactly what happened when researchers found references to an “ads feature,” “bazaar content,” and a “search ads carousel” inside the ChatGPT app, as documented by BleepingComputer’s report on OpenAI’s internal testing of ChatGPT ads (https://www.bleepingcomputer.com/news/artificial-intelligence/leak-confirms-openai-is-preparing-ads-on-chatgpt-for-public-roll-out/). For the first time, the most widely used AI assistant on the planet is explicitly experimenting with a business model that looks suspiciously like Google Search: sponsored placements woven into an interface that already feels like a neutral oracle. The leak is not just a product detail; it is a signal flare for a different attention economy.

The raw numbers are staggering. BleepingComputer notes that ChatGPT now sees roughly 800 million weekly users and handles about 2.5 billion prompts per day, with India already surpassing the United States as the single biggest user base for GPT-powered assistants (https://www.bleepingcomputer.com/news/artificial-intelligence/leak-confirms-openai-is-preparing-ads-on-chatgpt-for-public-roll-out/). Even if OpenAI limited ads to the search experience at first—as the Android strings suggest—those volumes create an advertising surface that rivals early Google Search in both scale and commercial intent. The incentives are obvious: when you own an interface with that much behavioral data, someone will ask when it starts paying rent.

The uncomfortable question is not whether there will be ads in AI chat products; the leak all but answers that. The deeper question is whether ads are inevitable in a technology that is soaked in personal context: multi‑year chat histories, documents, codebases, voice, and soon video. My own view is that some form of monetization driven by this data richness is unavoidable, even with subscription tiers in place. The physics of compute costs, investor expectations, and competitive pressure do not leave enough room for purely subscription-funded frontier models at this scale. The more interesting debate is what form those ads take, how transparent they are, and how the rest of the web rearranges itself once AI agents—not blue links—become the layer we are optimizing for.

This piece tries to hold two ideas at once. First, that ads in AI assistants are coming faster than most of us would like to admit. Second, that we still have agency in shaping how those ads behave, how they intersect with privacy, and how the next generation of “SEO”—really, agent optimization—will work. The leak is a warning shot, but it is also an invitation to design something less extractive than the last two decades of search.

The leak that rewires expectations

The BleepingComputer investigation reconstructs a simple story from unglamorous evidence: inside version 1.2025.329 of the ChatGPT Android beta, researcher Tibor spotted new references to an “ads feature” tied to “bazaar content,” “search ad,” and a “search ads carousel” (https://www.bleepingcomputer.com/news/artificial-intelligence/leak-confirms-openai-is-preparing-ads-on-chatgpt-for-public-roll-out/). Nestled among ordinary resource strings, these phrases point to a planned UI where search-style results coexist with promotional slots—likely above or alongside organic answers. For now, the code hints that ads will be limited to the search surface rather than standard conversational replies, but that distinction is thinner than it looks when your search bar and your chat window are the same box.

What makes this leak unsettling is not that OpenAI needs revenue. It is that ChatGPT has been positioned as the antithesis of ad‑ridden search. Up to this point, the core experience has been free at the point of use, with optional paid tiers for better models and higher limits. You do not see GPT pitching you sneakers or SaaS tools mid‑conversation the way search pages are laced with shopping modules and ad units. The implicit contract has been: your prompts fund model improvement, not a bid auction. The Android strings show that contract starting to bend.

The economics behind that bend are unforgiving. Training and serving state-of-the-art models cost hundreds of millions of dollars per year in compute and infrastructure, and that bill grows with usage. If you accept BleepingComputer’s figure of 2.5 billion prompts per day and assume even a modest cost per thousand requests, the daily burn adds up quickly (https://www.bleepingcomputer.com/news/artificial-intelligence/leak-confirms-openai-is-preparing-ads-on-chatgpt-for-public-roll-out/). Subscription revenue from a fraction of power users can offset some of that, but it does not touch the free-tier traffic at global scale. When a system behaves like a public utility but is financed like a growth-stage startup, advertising becomes a nearly irresistible pressure valve.

In that light, the existence of an “ads feature” in a beta build feels less like a betrayal and more like an overdue alignment between usage and business model. If ChatGPT is handling 2.5 billion prompts per day, and OpenAI showed ads on just 5% of those interactions with a conservative $10 CPM, that would translate to roughly 125 million impressions daily and about $1.25 million in revenue per day—on the order of $450 million per year. That back-of-the-envelope math is not a forecast, but it illustrates a simple point: even a timid ad load on a conversational interface produces the kind of numbers that make board decks glow.
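
For anyone who wants to poke at those assumptions, the same back-of-the-envelope calculation fits in a few lines of Python. Only the prompt volume comes from BleepingComputer’s reporting; the ad-load share and CPM are purely illustrative assumptions, not anything OpenAI has announced.

    # Back-of-the-envelope ad revenue estimate. Only the prompt volume is
    # reported; the ad load and CPM are illustrative assumptions.
    daily_prompts = 2_500_000_000  # ~2.5 billion prompts/day (BleepingComputer)
    ad_load = 0.05                 # assume ads on 5% of interactions
    cpm_usd = 10.0                 # assume a conservative $10 CPM

    daily_impressions = daily_prompts * ad_load               # 125,000,000
    daily_revenue_usd = daily_impressions / 1_000 * cpm_usd   # 1,250,000
    annual_revenue_usd = daily_revenue_usd * 365              # ~456,250,000

    print(f"{daily_impressions:,.0f} impressions/day")
    print(f"${daily_revenue_usd:,.0f}/day, about ${annual_revenue_usd:,.0f}/year")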

There is a second, more delicate axis: data. The BleepingComputer piece argues that GPT “likely knows more about users than Google,” because it sees not just search queries but long-form confessions, draft emails, code snippets, and decision trees (https://www.bleepingcomputer.com/news/artificial-intelligence/leak-confirms-openai-is-preparing-ads-on-chatgpt-for-public-roll-out/). A search engine knows what you want to know; an assistant knows what you are trying to do. That distinction matters when you begin placing ads. A search ad might respond to a single intent (“best CRM for startups”); a ChatGPT ad can, in theory, respond to the entire context of your day, your project history, and your writing style. The targeting surface is not just richer; it is more intimate.

Against that backdrop, the question “Is this inevitable even with subscriptions?” feels almost rhetorical. We already see hybrid models everywhere else: streaming services offer ad-free and ad-supported tiers, productivity suites tuck promotions into free accounts, and even paid products cross‑promote within their own ecosystems. When a platform handles billions of daily interactions and sits on a mountain of behavioral data, ads are not a bug; they are the path of least resistance. The task ahead is not to pretend they will not appear, but to shape the constraints under which they do.

Why ads feel inevitable in an AI-first world

To understand why ads in AI assistants feel inevitable, you have to follow the gravitational pull of three forces: compute costs, data richness, and interface consolidation. Each one would be manageable on its own; together they push platforms toward monetization models that resemble search, even if the surface looks like a friendly chat.

First, compute. Large language models are not static products; they are rolling capital expenditures. Training new generations, running inference at low latency, maintaining retrieval pipelines, and shipping multi-modal features all draw from a pool of GPUs that is still scarce and expensive. As models like GPT-5 and beyond become default assistants embedded in browsers, operating systems, and IDEs, their usage spikes faster than subscription pricing can realistically track. Subscription fatigue is already a fixture of consumer tech; asking every user to pay $20–$30 per month for a baseline AI assistant is not a viable path to ubiquitous access. Ads, by contrast, scale effortlessly with volume.

Second, data richness. Google Search has always been a proxy for intent: users compress their needs into keywords, and advertisers bid on those signals. ChatGPT and its peers see something deeper. When a founder walks through a fundraising deck with an assistant, or a developer pastes in an error log from production, or a patient drafts a letter to an insurance company, the model sees multi-step context, emotional tone, and implicit urgency. Per BleepingComputer’s analysis, GPT’s vantage point likely already exceeds what traditional search engines can infer about a typical user’s goals (https://www.bleepingcomputer.com/news/artificial-intelligence/leak-confirms-openai-is-preparing-ads-on-chatgpt-for-public-roll-out/). If you are designing an ad system, that context is gold—dangerous gold, but gold nonetheless.

Third, interface consolidation. The trend across the industry is away from discrete web pages and toward unified conversational surfaces. Microsoft is threading Copilot through Windows and Office; Google is merging search and generative answers; OpenAI is pushing ChatGPT into mobile, desktop, and third-party apps. As that consolidation accelerates, the “entry point” to the web for many tasks becomes a single prompt box. In the search era, an advertiser competed for a slot on a page of ten blue links. In the agent era, the competition is for influence over a synthesized answer spoken in a single authoritative voice. Without explicit ad rails and disclosure standards, that influence becomes opaque.

This is why the leak’s detail about ads being initially limited to the search experience matters. It suggests that OpenAI is attempting to preserve a conceptual boundary between “assistant conversation” and “search-like results,” even though both live in the ChatGPT app. That boundary may hold for a while. Over time, however, pressure will build to blur it: an enterprise customer will want onboarding flows sponsored, a commerce partner will push for product recommendations inside general-purpose queries, or an app developer will want promoted slots in an app store-like “bazaar” of agents. The strings referencing “bazaar content” feel like a preview of exactly that dynamic.

Once you accept that ads are coming, the real design challenge shifts from “whether” to “how.” How do we ensure that sponsored content is clearly labeled, grounded in relevant context, and fenced off from sensitive personal inferences? How do we prevent the assistant from steering users toward high-bid outcomes that are misaligned with their goals? And, crucially, how do we give users and content owners leverage over how their data is used in training and targeting? That last question takes us straight into the emerging world of AI-native SEO.

From SEO to agent optimization

For roughly twenty-five years, SEO was shaped by one central fact: human beings typed queries into search engines, and search engines returned ranked lists of web pages. The job of an SEO professional was to reverse‑engineer ranking factors, structure content to align with them, and build authority signals (links, engagement, technical hygiene) that caused their pages to surface more often. Ads were a layer on top of that system: pay for a shortcut, or earn your way up the organic ladder.

AI assistants fracture that pattern in two ways. First, they remix content from across the web into synthesized answers, often without exposing the underlying links. Second, they increasingly act on behalf of the user: booking appointments, drafting emails, calling APIs, or orchestrating tools. In that environment, the concept of “ranking a web page” gives way to something more subtle: being the source an agent trusts enough to summarize, cite, or call programmatically. Optimizing for that environment is less about title tags and more about machine readability, licensing signals, and explicit interfaces.

This is where llms.txt enters the picture. As Semrush explains, llms.txt is a proposed standard—a simple text file, similar in spirit to robots.txt—that lives at the root of a domain and tells large language models how they should crawl, use, and attribute a site’s content (https://www.semrush.com/blog/llms-txt/). Site owners can declare which sections are fair game for training, which should be used only for retrieval-style answers, and which are entirely off limits. Some implementations also let you specify preferred attribution formats or links to licensing terms. In other words, llms.txt is the beginning of a contract layer between content creators and AI systems.
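
To make that concrete, here is a minimal sketch of what such a file can look like, following the markdown-flavored layout the proposal uses: a plain text file served at /llms.txt that points models at the content you most want them to read and cite. The section names and URLs below are hypothetical.

    # Example Site
    > Practical guides on analytics and AI tooling for operators.

    ## Docs
    - [Getting started](https://example.com/docs/getting-started): setup and core concepts
    - [API reference](https://example.com/docs/api): endpoints and authentication

    ## Optional
    - [Blog archive](https://example.com/blog): long-form essays and announcements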

In an ad-supported ChatGPT, that contract gains new dimensions. If your content is being used to answer commercial queries (“best AI analytics tool for midmarket SaaS”), you now care not only about whether the assistant understands and cites your work, but also about whether your brand appears in or around sponsored recommendations. A future “AI SEO” stack will have to manage at least three axes at once:

  • Training visibility. Do you allow models to learn from your content at all, and if so, under what terms? llms.txt and future successors are the knobs you can turn (https://www.semrush.com/blog/llms-txt/).
  • Retrieval performance. When a model is answering a question live, can it reliably find and use your content as a source? That hinges on structured data, clean information architecture, and stable URLs.
  • Agent interfaces. If an assistant can act, do you expose APIs or “skills” the model can call on your behalf—bookings, purchases, workflows—so that you are not just a paragraph in its answer but a tool in its arsenal? (A sketch of one such tool definition follows this list.)
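
As a sketch of that third axis, this is roughly what exposing a capability to an assistant looks like in the JSON-schema style of tool definitions that many model APIs accept for function calling. The booking tool, its parameters, and the product it refers to are hypothetical.

    # Hypothetical tool definition in the JSON-schema style used by
    # function-calling APIs; the endpoint and its fields are illustrative.
    book_demo_tool = {
        "type": "function",
        "function": {
            "name": "book_product_demo",
            "description": "Schedule a live demo of the analytics product.",
            "parameters": {
                "type": "object",
                "properties": {
                    "email": {"type": "string", "description": "Work email for the calendar invite"},
                    "timeslot": {"type": "string", "description": "Preferred start time, ISO 8601"},
                    "team_size": {"type": "integer", "description": "Number of attendees"},
                },
                "required": ["email", "timeslot"],
            },
        },
    }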

Seen through that lens, the new frontier of SEO is less about persuading a ranking algorithm and more about becoming an attractive substrate for agents. You want your site to be easy for models to parse, clear about licensing, rich in structured entities, and equipped with endpoints that turn interest into action. That does not make classic SEO irrelevant—you still need crawlable pages, performant serving, and compelling content—but it shifts the center of gravity from human-first layout tricks to model-first clarity.

There is also a measurement challenge. Today, marketers have decades of tooling around search impressions, click‑through rates, and conversion funnels. In an AI-first world, the key questions become: How often do assistants mention my brand? In what contexts? With what sentiment? And when those mentions are accompanied by ads, how often am I the sponsor versus the organic recommendation? Answering those questions will require new kinds of analytics, new publisher partnerships with model providers, and probably new regulatory standards for transparency in AI‑mediated recommendations.

Internal linking will evolve as well. In a previous deep dive on GPT‑5 Codex Max, I argued that structured outputs and tool calls are turning models into orchestration layers that decide when to think, when to search, and when to act (/posts/2025-11-20-openai-gpt-5-codex-max/). That same orchestration logic will increasingly govern which commercial entities an assistant consults when fulfilling user requests. If your brand is invisible at that layer—no machine-readable documentation, no llms.txt, no APIs—you are competing for attention in a game whose rules you do not understand.

The hopeful twist is that agent optimization rewards depth over clickbait. Models are more likely to rely on sources that are technically precise, internally consistent, and comprehensive. They are less swayed by headline theatrics and more by clear explanations and well-structured data. If we design the incentives carefully, AI SEO could tilt the web back toward quality—provided we resist the temptation to let ad auctions quietly override that logic behind opaque ranking systems.

A hopeful operator checklist for the agentic web

If ads in ChatGPT and other assistants are inevitable, the pressing question for operators, founders, and technologists is how to steer the transition toward something healthier than the last ad boom. We will not get everything right, but we do have levers. Here is a pragmatic checklist for the next few years.

First, treat llms.txt as a baseline, not an experiment. Even if current model providers only partially honor it, publishing a clear llms.txt file signals your preferences about training, retrieval, and attribution (https://www.semrush.com/blog/llms-txt/). Start simple: designate sensitive sections of your site as off‑limits, explicitly allow high‑level educational content to be used, and include links to your licensing terms. As standards mature, you can refine the granularity, but the important step is to claim a seat at the table instead of passively accepting whatever defaults models infer.

Second, design for agent readability. That means leaning into structured data (schema.org markup, JSON‑LD), consistent heading hierarchies, descriptive link text, and stable identifiers for key entities. Write with the expectation that a model will quote you, paraphrase you, or aggregate you. When you explain a concept, imagine a future assistant needing to lift a paragraph verbatim to ground an answer; clarity and self‑containment become assets. This is tedious work, but it pays off twice: humans benefit from structure, and models do too.
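
As one concrete slice of that structured-data work, here is a small sketch that emits schema.org JSON-LD for a product page; the output would sit in the page head. The product name, URL, and fields are placeholders, and real pages usually carry a richer set of properties.

    import json

    # Minimal schema.org JSON-LD for a hypothetical product page; the output
    # belongs in a <script type="application/ld+json"> tag in the page head.
    product = {
        "@context": "https://schema.org",
        "@type": "SoftwareApplication",
        "name": "Example Analytics",
        "applicationCategory": "BusinessApplication",
        "description": "Self-serve analytics for midmarket SaaS teams.",
        "url": "https://example.com/product",
        "offers": {"@type": "Offer", "price": "49.00", "priceCurrency": "USD"},
    }

    print(json.dumps(product, indent=2))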

Third, instrument brand mentions inside AI ecosystems wherever you can. Today, that may mean partnering with platforms that provide visibility into how their assistants surface your content or APIs. Tomorrow, it will likely involve third‑party tools that sample AI outputs at scale and annotate when and how your brand appears. The point is not surveillance for its own sake; it is to build a feedback loop. If sponsored slots begin to crowd out organic references in ways that hurt users, you want to know early.
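
There is no standard tooling for this yet, but a crude version of that feedback loop is already possible: sample an assistant with the questions your buyers actually ask and record whether your brand (or a competitor) appears in the answers. The sketch below uses the OpenAI Python SDK as one example client; the prompts, brand names, and model choice are placeholders.

    from openai import OpenAI

    # Crude brand-mention sampler: ask buyer-style questions and check which
    # tracked brands show up in the assistant's answers.
    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    PROMPTS = [
        "What is the best analytics tool for a midmarket SaaS company?",
        "Recommend software for tracking product usage metrics.",
    ]
    BRANDS = ["Example Analytics", "CompetitorCo"]

    for prompt in PROMPTS:
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
        )
        answer = response.choices[0].message.content or ""
        mentioned = [b for b in BRANDS if b.lower() in answer.lower()]
        print(prompt, "->", mentioned or "no tracked brands mentioned")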

Fourth, participate in the norms around ad disclosure. Right now, the Android strings speak of “search ads” and a “carousel,” but they do not tell us how clearly those ads will be labeled or how their influence will be explained (https://www.bleepingcomputer.com/news/artificial-intelligence/leak-confirms-openai-is-preparing-ads-on-chatgpt-for-public-roll-out/). As a user, insist on visible “sponsored” markers and accessible explanations of why a given suggestion is being promoted. As a builder integrating assistant APIs into your products, bake those disclosures into your own UI. Small decisions—font size, color balance, phrasing—compound into cultural expectations.

Fifth, align your own use of AI with the standards you want from platforms. If your product uses a model to generate recommendations, be explicit about which parts are sponsored, which are organic, and which rely on user data. Do not quietly smuggle in affiliate priorities under the guise of neutral advice. The fastest way to normalize manipulative AI advertising is for smaller products to cut the same corners they decry in larger platforms.

Sixth, explore business models that complement rather than purely replace advertising. Subscriptions, usage-based pricing for high‑value tools, and enterprise tiers can all coexist with a light ad load. The more diversified the revenue stack behind AI assistants, the less pressure there is to crank up ad density or mine ever more personal data for marginally better targeting. Operators can vote with their integrations: prefer assistants and platforms whose economic incentives are legible and not wholly ad‑driven.

Finally, cultivate a hopeful stance rooted in realism. The BleepingComputer leak is a reminder that even groundbreaking technologies eventually collide with the gravity of business models (https://www.bleepingcomputer.com/news/artificial-intelligence/leak-confirms-openai-is-preparing-ads-on-chatgpt-for-public-roll-out/). But it is also proof that we are early enough in the AI assistant era to shape defaults. A world where ChatGPT shows clearly labeled, contextual ads in a search pane—while respecting llms.txt, honoring privacy boundaries, and rewarding high‑quality sources—would still be a dramatic improvement over the banner‑choked web of the 2000s.

The stakes are not abstract. If we get this wrong, the agentic interfaces that promised to reduce noise could end up amplifying it, steering people toward the highest bidder rather than the best answer. If we get it right, assistants can become honest brokers: tools that make our intent legible, surface genuinely helpful options (some sponsored, many not), and channel economic value back to the people who create and maintain the knowledge they rely on. The leak about “bazaar content” is a warning—but it is also an invitation to build an economy around AI that is more aligned with users than the last one. The work starts now, in the quiet choices we make about standards like llms.txt, about how we structure our content, and about which platforms we trust with our prompts.