This is Part 2 in a five-part series on optimizing websites for the agentic web. Part 1 covered the evolution from SEO to AAIO and why the shift matters. This article gets practical: how AI systems actually select content, and what you can do about it.
AI Doesn’t Rank Pages. It Selects Fragments.
Traditional search ranks whole pages. AI search does something fundamentally different.
Microsoft’s Krishna Madhavan, principal product manager on the Bing team, described the shift in October 2025: AI assistants “break content down, a process called parsing, into smaller, structured pieces that can be evaluated for authority and relevance. These pieces are then assembled into answers, often drawing from multiple sources to create a single, coherent response.”
This is the core insight. AI doesn’t pick the best page and show it. It picks the best fragments from many pages and weaves them together. Your page might rank No. 1 on Google and still not get cited in an AI response if its content isn’t structured in fragments that AI can extract.
The numbers show the shift is real. According to the Conductor AEO/GEO Benchmarks Report (January 2026; 13,770 domains, 17 million AI responses), AI traffic now accounts for 1.08% of all website sessions, growing roughly 1% month over month. Microsoft reported that AI referrals to top websites spiked 357% year-over-year in June 2025, reaching 1.13 billion visits. Small numbers today, compounding fast.
One in four Google searches now triggers an AI Overview. In healthcare, it’s nearly one in two. The surface area is growing, and the content that fills those answers has to come from somewhere. The question is whether it comes from you.
The Research: What Actually Gets Cited
The academic research on what makes content citable in AI responses has matured rapidly. The foundational paper, “GEO: Generative Engine Optimization” (Princeton, IIT Delhi, Georgia Tech, published at KDD 2024), tested nine optimization methods and found that GEO methods could boost visibility by up to 40% in AI responses. The most effective single method was citing credible sources, which produced a 115.1% visibility increase for websites that weren’t already ranking in the top positions.
A counterintuitive finding: Writing in an authoritative or persuasive tone did not improve AI visibility. AI systems don’t respond to rhetorical style. They respond to verifiable information.
Since then, 2025 brought a wave of follow-up research that tested these ideas on real production AI engines rather than simulated ones.
The University of Toronto study (September 2025) was the first large-scale analysis across ChatGPT, Perplexity, Gemini, and Claude. Their most striking finding: AI search overwhelmingly favors earned media. In consumer electronics, AI cited third-party authoritative sources 92.1% of the time, compared to Google’s 54.1%. Automotive showed a similar pattern at 81.9% versus 45.1%. In other words, it’s not just how you write content, but whose domain it appears on. Press coverage, product reviews on independent websites, and mentions on industry publications carry far more weight in AI responses than your own website.
Carnegie Mellon’s AutoGEO study (October 2025) used automated methods to discover what generative engines actually prefer. The results showed up to 50.99% improvement over the best baseline, with common preferences emerging across engines: comprehensive topic coverage, factual accuracy with citations, clear logical structure with headings and lists, and direct answers to queries.
The GEO-16 framework (September 2025) analyzed 1,702 real citations from Brave, Google AI Overviews, and Perplexity. It identified 16 on-page quality factors that predict citation likelihood. The top three: metadata and freshness, semantic HTML, and structured data. Technical on-page factors matter as much as the quality of the writing itself.
And a reality check from Columbia and MIT’s ecommerce study (November 2025): of 15 common content rewriting heuristics, 10 produced negligible or negative results. The optimization strategies that did work converged toward truthfulness, user intent alignment, and competitive differentiation. Not tricks. Substance.
The overall pattern across all of this research: AI systems reward clarity, factual accuracy, and structure. They don’t reward marketing language, persuasion tactics, or keyword density.
Content Structure That Earns Citations
Based on the research and official guidance from Microsoft and Google, here’s what structurally makes content citable.
Heading hierarchy matters more than ever. Use descriptive H2 and H3 headings that each cover one specific idea. Microsoft lists strong headings as “signals that help AI know where a complete idea starts and ends.” Vague headings like “Learn More” or “Overview” give AI nothing to work with. A heading like “How AI parses content differently than search engines” tells the system exactly what the section contains.
Q&A format is native to AI. Write questions as headings with direct answers below them. Microsoft notes that “assistants can often lift these pairs word for word into AI-generated responses.” If your content answers the question someone asks an AI, and it’s structured as a clear question-and-answer pair, you’ve made the AI’s job easy.
Make content snippable. Bulleted and numbered lists, comparison tables, step-by-step instructions. These formats give AI clear, extractable fragments. A paragraph buried in a wall of text is harder for AI to isolate than the same information presented as a three-item list.
Front-load the answer. Start sections with the key information, then provide context. If someone asks, “What temperature should I bake bread at?” and your content opens with a two-paragraph history of bread making before mentioning 375°F, you’ll lose the citation to a competitor who leads with the answer.
Keep sections self-contained. Each section should make sense on its own, without requiring the reader to have read the previous section. AI extracts fragments. If your fragment only makes sense in the context of the whole page, it won’t be selected.
An important technical note from Microsoft: “Don’t hide important answers in tabs or expandable menus: AI systems may not render hidden content, so key details may be skipped.” FAQ answers collapsed inside an expandable menu, product specs hidden behind tabs, content that requires interaction to reveal: it may all be invisible to AI. If information is important, it needs to be in the visible HTML.
Authority Signals For AI
E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) isn’t just a Google concept anymore. It’s what AI systems look for across the board, even if they don’t use the term.
Microsoft’s October 2025 guidance describes the baseline: success starts with content that’s “fresh, authoritative, structured, and semantically clear.” On the clarity side, they’re specific: “avoid vague language. Terms like innovative or eco mean little without specifics. Instead, anchor claims in measurable facts.” Saying something is “next-gen” or “cutting-edge” without context leaves AI unsure how to classify it.
The research backs this up. The original GEO paper found that writing in a persuasive or authoritative tone did not improve AI visibility. Facts and cited sources did. Marketing language doesn’t impress algorithms.
This connects to the University of Toronto’s finding about earned media dominance. AI systems trust third-party validation more than self-promotion. In consumer electronics, AI cited third-party authoritative sources 92.1% of the time compared to Google’s 54.1%. The implication: getting your expertise published on industry websites, earning press coverage, and building a presence on authoritative platforms matters more for AI visibility than perfecting the copy on your own website.
Freshness is a signal, not a bonus. Stale content rarely gets cited. Krishna Madhavan said at Pubcon Cyber Week: “Stale or missing content will constrain the amount of retrieval we can do and push agents toward other sources.”
Schema Markup: From Text To Data
Microsoft’s October 2025 post devotes an entire section to schema. They describe it as code that “turns plain text into structured data that machines can interpret with confidence.” Schema can label your content as a product, review, FAQ, or event, giving AI systems explicit context instead of forcing them to guess. Krishna Madhavan reinforced this at Pubcon: “Schemas are super helpful. They help the system discern exactly what your information is without us having to guess.”
The GEO-16 framework confirms this from the academic side. Structured data was one of the top three factors predicting AI citation likelihood, alongside metadata/freshness and semantic HTML.
The schema types that matter most for AI visibility:
- FAQPage for question-and-answer content (directly maps to how AI formats responses).
- HowTo for step-by-step instructions.
- Product with Offer, AggregateRating, and Review for ecommerce.
- Article/BlogPosting for content with clear authorship and dates.
- Organization for business identity.
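As an illustration of the first schema type in the list, here is a minimal Python sketch that builds FAQPage markup as JSON-LD. The property names (`mainEntity`, `acceptedAnswer`) follow the schema.org vocabulary; the helper name `build_faq_jsonld` and the sample question are hypothetical, reusing the bread example from earlier in the article.

```python
import json


def build_faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Hypothetical content for illustration only.
faq = build_faq_jsonld(
    [("What temperature should I bake bread at?", "Bake bread at 375°F.")]
)

# Wrap the JSON in the script tag that carries JSON-LD inside a page's HTML.
snippet = (
    '<script type="application/ld+json">\n'
    + json.dumps(faq, indent=2, ensure_ascii=False)
    + "\n</script>"
)
print(snippet)
```

The printed snippet would go in the page's HTML, and the question and answer should also appear in the visible content so the markup describes what readers actually see.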
Pair structured data with IndexNow for freshness. As the Bing Webmaster Blog put it: “IndexNow tells search engines that something has changed, while structured data tells them what has changed. Together, they improve both speed and accuracy in indexing.”
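To make the IndexNow half of that pairing concrete, here is a rough Python sketch that prepares a bulk submission without sending it. It assumes the shared `api.indexnow.org` endpoint and the JSON fields described in the IndexNow protocol documentation (`host`, `key`, `keyLocation`, `urlList`). The host, key, and URLs are placeholders, and the key file is assumed to sit at the site root.

```python
import json
import urllib.request

# Shared IndexNow endpoint; participating search engines also expose their own.
INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host, key, urls):
    """Prepare (but do not send) an IndexNow bulk URL submission."""
    payload = {
        "host": host,
        "key": key,
        # Assumption: the key file is hosted at the site root as <key>.txt.
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Placeholder values; substitute your own host, key, and changed URLs.
request = build_indexnow_request(
    host="www.example.com",
    key="replace-with-your-indexnow-key",
    urls=["https://www.example.com/updated-page"],
)
# To actually submit: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the sketch safe to run as-is and makes the payload easy to inspect before anything goes over the network.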
Crawler Permissions: Who Gets In
AI search engines use distinct crawlers, and most let you control training and search access separately. Here’s who to allow.
| Bot | Platform | Purpose | Robots.txt Token |
|---|---|---|---|
| OAI-SearchBot | ChatGPT | Search index | OAI-SearchBot |
| GPTBot | OpenAI | Model training | GPTBot |
| ChatGPT-User | ChatGPT | On-demand browsing | ChatGPT-User |
| Bingbot | Microsoft Copilot | Search + AI | Bingbot |
| Googlebot | Google AI Overviews | Search + AI | Googlebot |
| Google-Extended | Google | Gemini training | Google-Extended |
| PerplexityBot | Perplexity | Search + index | PerplexityBot |
| Perplexity-User | Perplexity | On-demand browsing | Perplexity-User |
| ClaudeBot | Anthropic | Training + retrieval | ClaudeBot |
A sensible robots.txt configuration might allow search crawlers while blocking training:
User-agent: OAI-SearchBot
Allow: /
User-agent: ChatGPT-User
Allow: /
User-agent: GPTBot
Disallow: /
User-agent: Google-Extended
Disallow: /
OpenAI offers the cleanest bot separation. You can allow OAI-SearchBot (so your content appears in ChatGPT search) while blocking GPTBot (so it’s not used for model training). Google’s controls are less granular: blocking Google-Extended prevents Gemini training but has no effect on AI Overviews, which use Googlebot.
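One way to sanity-check a configuration like the one above before deploying it is Python's standard-library `urllib.robotparser`. The sketch below is only an approximation: each crawler applies its own robots.txt interpretation, so this shows how Python's parser reads the rules, not a guarantee of bot behavior. The page URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# The search-vs-training split described in the article.
ROBOTS_TXT = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Placeholder URL; any path on the site behaves the same under these rules.
page = "https://www.example.com/article"
access = {
    bot: parser.can_fetch(bot, page)
    for bot in ("OAI-SearchBot", "ChatGPT-User", "GPTBot", "Google-Extended")
}
for bot, allowed in access.items():
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

Under these rules the two search and browsing tokens come back allowed and the two training tokens come back blocked. A token with no matching group is treated as allowed, which is worth remembering when adding new crawlers to the list.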
OpenAI also offers the most specific technical recommendation of any AI search provider. For their Atlas browser (which uses a standard Chrome user agent, not a bot identifier), they recommend following WAI-ARIA best practices: “Add descriptive roles, labels, and states to interactive elements like buttons, menus, and forms. This helps ChatGPT recognize what each element does and interact with your site more accurately.” Accessibility and AI agent compatibility are the same work.
A caveat on Perplexity: while their documentation states they respect robots.txt, Cloudflare documented in August 2025 that Perplexity uses undeclared crawlers with rotating IPs and spoofed browser user agents to bypass no-crawl directives. This is a contested claim, but it’s worth knowing.
For revenue, Perplexity is the only platform currently offering publisher compensation. Their Comet Plus program offers an 80/20 revenue split (publishers keep 80%) across direct visits, search citations, and agent actions.
Google Vs. Microsoft: Two Philosophies
The contrast between Google and Microsoft on AEO is striking enough to be its own story.
Google says: just do good SEO. Their official documentation is deliberately minimalist: “There are no additional requirements to appear in AI Overviews or AI Mode, nor other special optimizations necessary.” They add that you “don’t need to create new machine readable files, AI text files, or markup to appear in these features.”
Google recommends helpful, reliable, people-first content demonstrating E-E-A-T. Standard structured data. Good page experience. Technical fundamentals. Nothing AI-specific.
Microsoft says: here’s the playbook. Their October 2025 blog post and January 2026 guide provide detailed, actionable guidance. Specific heading structures. Schema recommendations. Content formatting rules. Concrete examples (an AEO product description vs. a GEO product description). Warnings about content hidden in tabs and expandable menus. A framework for thinking about crawled data, product feeds, and live website data as three distinct layers.
What explains the difference? Partly market position. Google dominates search and has less incentive to help publishers optimize for AI features that may reduce clicks to their websites. Microsoft, with Bing’s roughly 8% market share, benefits from providing publishers with reasons to optimize specifically for their ecosystem.
But there’s a practical takeaway: Microsoft’s guidance isn’t Bing-specific. The principles of structured content, clear headings, snippable formats, schema markup, and expert authority are universal. Following Microsoft’s playbook improves your content for every AI system, including Google’s. Google just won’t tell you that.
Measuring AI Visibility
This is the hard part. Traditional SEO has Google Search Console. AI visibility is still fragmented.
Ahrefs analyzed 1.9 million citations from 1 million AI Overviews and found that 76% of citations come from pages already ranking in Google’s top 10. The median ranking for the most-cited URLs was position 2. Traditional ranking still matters for AI citation, but being No. 1 is “a coin flip at best” for getting cited.
The traffic impact is significant. Ahrefs found that AI Overviews correlate with 58% lower click-through rates for the No. 1 position. Seer Interactive reported a 61% organic CTR drop for queries with AI Overviews. But being cited within the AI Overview gives 35% more organic clicks compared to not being cited. Citation is the new ranking.
For monitoring, the tool landscape is emerging:
| Tool | What It Tracks | Starting Price |
|---|---|---|
| Profound | Citations across ChatGPT, Perplexity, Copilot, Google AIOs | From $99/mo |
| Peec.ai | Brand mentions across ChatGPT, Gemini, Claude, Perplexity | From ~$95/mo |
| Advanced Web Ranking | AIO presence tracking in Google | Included in plans |
| Bing Webmaster Tools | AI Performance Report for Copilot | Free |
Bing Webmaster Tools is the best place to start. It’s free, and the new AI Performance Report shows how your content performs in Copilot citations. For ChatGPT specifically, track utm_source=chatgpt.com in your analytics. OpenAI automatically appends this to referral URLs.
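If your analytics tool exports raw landing-page URLs, the ChatGPT check can be sketched in a few lines of Python. This is a minimal illustration, assuming the `utm_source=chatgpt.com` parameter arrives intact on the landing URL; the function name and sample URLs are hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def is_chatgpt_referral(landing_url):
    """Return True if the landing URL carries utm_source=chatgpt.com."""
    query = parse_qs(urlparse(landing_url).query)
    return "chatgpt.com" in query.get("utm_source", [])


# Hypothetical landing URLs, e.g. exported from an analytics tool or server log.
landing_urls = [
    "https://www.example.com/guide?utm_source=chatgpt.com",
    "https://www.example.com/guide?utm_source=newsletter",
    "https://www.example.com/guide",
]
chatgpt_visits = [url for url in landing_urls if is_chatgpt_referral(url)]
print(f"ChatGPT referrals: {len(chatgpt_visits)} of {len(landing_urls)}")
```

Parsing the query string rather than searching for the substring avoids false matches when `chatgpt.com` appears elsewhere in a URL, such as in a path or another parameter.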
Conductor’s January 2026 report found that 87.4% of AI referral traffic comes from ChatGPT. That’s one platform dominating the space, which makes tracking it particularly important.
Key Takeaways
- AI selects fragments, not pages. Structure your content in self-contained, extractable sections with descriptive headings that signal where each idea starts and ends.
- Clarity beats persuasion. Factual accuracy, cited sources, and direct answers outperform authoritative tone and marketing language. The research consistently shows this.
- Earned media dominates brand content in AI citations. Press coverage, third-party reviews, and authoritative mentions on other websites carry more weight than your own pages. Build presence beyond your domain.
- Schema markup is a force multiplier. FAQPage, HowTo, Product, and Article schemas make your content machine-readable. Pair with IndexNow for freshness.
- Follow Microsoft’s playbook, even for Google. Google says “just do good SEO.” Microsoft offers specific, actionable guidance that improves content for every AI system, Google’s included.
- Separate training from search in your robots.txt. Allow search crawlers (OAI-SearchBot, Bingbot, PerplexityBot) while blocking training crawlers (GPTBot, Google-Extended) if that’s your preference. You have more control than you might think.
- Track AI visibility now. Use Bing Webmaster Tools (free), track utm_source=chatgpt.com in analytics, and consider dedicated tools as the measurement space matures.
Traditional SEO asked: “How do I rank?” AEO asks: “How do I become the fragment that gets selected?” The answer isn’t a single trick. It’s clear structure, verifiable expertise, and content that AI can confidently extract and cite.
Up next in Part 3: the protocols powering the agentic web, including MCP, A2A, NLWeb, and AGENTS.md, and how they fit together.
This was originally published on No Hacks.
Featured Image: Meepian Graphic/Shutterstock
