Schema.org structured data is machine-readable markup, usually written as JSON-LD, that tells search engines and AI systems what your content actually is. Getting it right is the single most underrated GEO (generative engine optimization) lever in 2026, with citation rates running 2-3x higher on properly marked content than on unmarked equivalents. When Google AI Overviews cites 500k.io on a query, the citation almost always points to a page where the FAQPage schema and the visible FAQ section match exactly. When they don’t match, AIO ignores the page entirely. This is the difference between getting cited and being invisible.

This article covers the exact schema architecture I run on 500k.io, the 6 schema types that pay back the time investment, JSON-LD examples you can copy, and the 3 mistakes that kill AI discoverability. If you’ve read how I track AI citations, this is the structured-data layer that makes citations possible in the first place.

Why schema matters more in 2026 than it did in 2023

In the old SEO world, schema was a nice-to-have. It might earn you rich snippets (star ratings, FAQ accordions in search results), but rankings depended primarily on links and content.

In 2026, the AI engines that summarize the web for users need to understand what your content IS — not just what it says. Schema.org is the standard machine-readable description of “what this page is.”

Era | Schema’s role
2018-2022 | Optional, mostly for rich snippets
2023-2024 | Useful, started correlating with AIO citations
2025-2026 | Critical, especially for AI engines making citation decisions
2027+ | Likely required for any AI search visibility

According to a March 2026 study by AdAmigo on AI search structured data, pages with comprehensive schema markup were cited in AI Overviews at roughly 2.3x the rate of equivalent unmarked content. Perplexity citations correlated similarly at ~2.1x. The pattern is consistent across engines.

Schema is no longer optional for serious GEO work.

The 6 schema types that matter

The schema.org vocabulary contains 800+ types. Ignore 794 of them. The 6 that pay back the time investment:

1 — Article (or BlogPosting)

The base type for every editorial page. Fields that matter:

{
  "@type": "Article",
  "headline": "Article title",
  "description": "Meta description",
  "datePublished": "2026-05-19T08:42:00Z",
  "dateModified": "2026-05-19T10:00:00Z",
  "author": {
    "@type": "Person",
    "name": "Maxime Le Morillon",
    "url": "https://500k.io/about"
  },
  "publisher": {
    "@type": "Organization",
    "name": "500k.io",
    "logo": {
      "@type": "ImageObject",
      "url": "https://500k.io/logo.png"
    }
  },
  "image": "https://500k.io/img/articles/example.webp",
  "mainEntityOfPage": "https://500k.io/journal/example"
}

The 3 fields that matter most for AI engines: headline, datePublished + dateModified (freshness signal), author (E-E-A-T signal).
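
One way to keep these fields honest is to generate the Article object from the same post record your template renders. A minimal Python sketch, assuming a hypothetical `post` dictionary whose keys (`title`, `published`, `modified`, `author_name`, `author_url`, `url`) are illustrative and not taken from the 500k.io codebase:

```python
import json
from datetime import datetime, timezone


def build_article_schema(post: dict) -> dict:
    """Build an Article JSON-LD object from one post record.

    The keys read from `post` are hypothetical; rename them to match
    whatever your CMS or content collection exposes.
    """
    published = post["published"]
    modified = post.get("modified") or published
    if modified < published:
        raise ValueError("dateModified is earlier than datePublished")
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": post["title"],
        # isoformat() on a timezone-aware datetime yields ISO 8601 with an offset
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
        "author": {
            "@type": "Person",
            "name": post["author_name"],
            "url": post["author_url"],
        },
        "mainEntityOfPage": post["url"],
    }


example_post = {
    "title": "Article title",
    "published": datetime(2026, 5, 19, 8, 42, tzinfo=timezone.utc),
    "modified": datetime(2026, 5, 19, 10, 0, tzinfo=timezone.utc),
    "author_name": "Maxime Le Morillon",
    "author_url": "https://500k.io/about",
    "url": "https://500k.io/journal/example",
}
article = build_article_schema(example_post)
print(json.dumps(article, indent=2))
```

The guard on the two dates is a design choice of this sketch: a dateModified earlier than datePublished is almost certainly a data bug, so it fails loudly instead of shipping.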

2 — FAQPage

The single highest-ROI schema type for AI citations in 2026. Use it on any page with 3+ Q&A pairs.

{
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is Claude Code?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Claude Code is Anthropic's CLI coding agent that runs on your local machine, reads your filesystem, edits files, and commits to git."
      }
    },
    {
      "@type": "Question",
      "name": "How much does Claude Code cost?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Claude Code's Pro tier is $20/month. The Max 5x tier is $100/month and covers most solo founder use cases."
      }
    }
  ]
}

The critical rule: the questions and answers in your schema MUST match the visible content. If they don’t, you’ve created a schema/content mismatch (mistake #1 below).

AI Overviews especially loves FAQPage. Pages with 4+ FAQ pairs in schema correlate with citation rates 3x higher than pages without.
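
To keep the schema and the visible FAQ in lockstep, the FAQPage object can be built from the same list of Q&A pairs the page renders. A small Python sketch under that assumption; `build_faq_schema` and `faq_pairs` are names made up for this example:

```python
import json


def build_faq_schema(pairs: list[tuple[str, str]]) -> dict:
    """Build a FAQPage object from (question, answer) pairs.

    `pairs` is assumed to be the same list the page template uses to
    render the visible FAQ section.
    """
    if not pairs:
        raise ValueError("FAQPage needs at least one question")
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question.strip(),
                "acceptedAnswer": {"@type": "Answer", "text": answer.strip()},
            }
            for question, answer in pairs
        ],
    }


faq_pairs = [
    ("What is Claude Code?", "Claude Code is Anthropic's CLI coding agent."),
    ("How much does Claude Code cost?", "Claude Code's Pro tier is $20/month."),
]
faq = build_faq_schema(faq_pairs)
print(json.dumps(faq, indent=2))
```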

3 — HowTo

The right schema for tutorial content. Use on any page that lays out an ordered process.

{
  "@type": "HowTo",
  "name": "How to set up Claude Code",
  "description": "A step-by-step guide to setting up Claude Code on your local machine.",
  "totalTime": "PT30M",
  "estimatedCost": {
    "@type": "MonetaryAmount",
    "currency": "USD",
    "value": "100"
  },
  "step": [
    {
      "@type": "HowToStep",
      "name": "Install Claude Code",
      "text": "Run `npm install -g @anthropic-ai/claude-code` in your terminal.",
      "url": "https://500k.io/journal/claude-code-first-30-days#step-1"
    },
    {
      "@type": "HowToStep",
      "name": "Authenticate",
      "text": "Run `claude /login` and follow the OAuth prompt.",
      "url": "https://500k.io/journal/claude-code-first-30-days#step-2"
    }
  ]
}

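HowTo schema works particularly well for AIO citations on “how to X” queries. Keep each step’s name short and make its text a clear instruction. Note that Google retired HowTo rich results in 2023, so the Rich Results Test no longer reports on this type; check it with the Schema Markup Validator instead.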
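
The totalTime value uses ISO 8601 duration syntax, which is easy to get wrong by hand. A hedged Python sketch that derives it from a minute count and numbers the step anchors the same way as the example above; the function names and the `#step-N` anchor convention are assumptions of this sketch:

```python
def iso_duration(minutes: int) -> str:
    """Convert whole minutes to an ISO 8601 duration such as PT1H30M."""
    if minutes <= 0:
        raise ValueError("duration must be positive")
    hours, mins = divmod(minutes, 60)
    out = "PT"
    if hours:
        out += f"{hours}H"
    if mins:
        out += f"{mins}M"
    return out


def build_howto(name: str, page_url: str, total_minutes: int,
                steps: list[tuple[str, str]]) -> dict:
    """Build a HowTo object; each step is a (name, text) pair."""
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "totalTime": iso_duration(total_minutes),
        "step": [
            {
                "@type": "HowToStep",
                "name": step_name,
                "text": text,
                # Assumes the page has matching id="step-N" anchors
                "url": f"{page_url}#step-{index}",
            }
            for index, (step_name, text) in enumerate(steps, start=1)
        ],
    }


howto = build_howto(
    "How to set up Claude Code",
    "https://500k.io/journal/claude-code-first-30-days",
    30,
    [
        ("Install Claude Code", "Run `npm install -g @anthropic-ai/claude-code` in your terminal."),
        ("Authenticate", "Run `claude /login` and follow the OAuth prompt."),
    ],
)
```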

4 — BreadcrumbList

The structural type that tells search engines where this page sits in your hierarchy.

{
  "@type": "BreadcrumbList",
  "itemListElement": [
    {
      "@type": "ListItem",
      "position": 1,
      "name": "Home",
      "item": "https://500k.io"
    },
    {
      "@type": "ListItem",
      "position": 2,
      "name": "Journal",
      "item": "https://500k.io/journal"
    },
    {
      "@type": "ListItem",
      "position": 3,
      "name": "Schema for AI search",
      "item": "https://500k.io/journal/schema-org-for-ai-search-deep-dive"
    }
  ]
}

Less critical for AI citations but helps Google understand site structure. Ship it on every non-homepage page.
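
Since positions must be consecutive and start at 1, it is safer to generate the list than to type it. A minimal Python sketch, assuming the template already knows the trail as (name, URL) pairs; `build_breadcrumbs` is a hypothetical helper:

```python
def build_breadcrumbs(trail: list[tuple[str, str]]) -> dict:
    """Build a BreadcrumbList from an ordered list of (name, url) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


breadcrumbs = build_breadcrumbs([
    ("Home", "https://500k.io"),
    ("Journal", "https://500k.io/journal"),
    ("Schema for AI search", "https://500k.io/journal/schema-org-for-ai-search-deep-dive"),
])
```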

5 — Organization

The “who runs this site” type. Lives on the homepage and as a reference from every Article schema.

{
  "@type": "Organization",
  "name": "500k.io",
  "url": "https://500k.io",
  "logo": "https://500k.io/logo.png",
  "description": "The live journal of a solo founder going from $0 to $500K ARR with AI as the only team.",
  "sameAs": [
    "https://twitter.com/theKreators_ai",
    "https://linkedin.com/in/maximelemorillon",
    "https://github.com/maximelemorillon"
  ],
  "parentOrganization": {
    "@type": "Organization",
    "name": "The Kreators AI",
    "url": "https://thekreators.ai"
  }
}

The sameAs array is one of the most powerful AI signals. It tells engines that the entity behind this site is the same entity described at these other URLs. List your social profiles and your GitHub in sameAs, and declare your parent organization (if any) with parentOrganization.

6 — Person

The author entity. Used in Article schemas and as a standalone on author pages.

{
  "@type": "Person",
  "name": "Maxime Le Morillon",
  "url": "https://500k.io/about",
  "image": "https://500k.io/img/maxime.webp",
  "sameAs": [
    "https://twitter.com/theKreators_ai",
    "https://linkedin.com/in/maximelemorillon"
  ],
  "jobTitle": "Founder",
  "worksFor": {
    "@type": "Organization",
    "name": "The Kreators AI"
  },
  "description": "10-year entrepreneur, 4 years deep in AI. Co-founder of The Kreators AI ($45M agency with Jack)."
}

Person schema is the E-E-A-T signal AI engines now weight heavily. If you’re building a personal-brand site, this matters as much as the Article schema.

The @graph structure I run on 500k.io

The professional move is to wire all the schemas on a page into a single @graph block. Each entity gets a unique @id and references the others by that @id. This is the structure that earns the cleanest validation and the highest citation lift.

Here’s what runs on every article page on 500k.io (simplified for brevity):

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Article",
      "@id": "https://500k.io/journal/[slug]#article",
      "headline": "...",
      "datePublished": "...",
      "dateModified": "...",
      "author": { "@id": "https://500k.io/about#maxime" },
      "publisher": { "@id": "https://500k.io#org" },
      "mainEntityOfPage": "https://500k.io/journal/[slug]"
    },
    {
      "@type": "FAQPage",
      "@id": "https://500k.io/journal/[slug]#faq",
      "mainEntity": [ ... ]
    },
    {
      "@type": "BreadcrumbList",
      "@id": "https://500k.io/journal/[slug]#breadcrumbs",
      "itemListElement": [ ... ]
    },
    {
      "@type": "Person",
      "@id": "https://500k.io/about#maxime",
      "name": "Maxime Le Morillon",
      ...
    },
    {
      "@type": "Organization",
      "@id": "https://500k.io#org",
      "name": "500k.io",
      ...
    }
  ]
}
</script>

The @id wiring is what differentiates a connected @graph from a list of loose schemas. AI engines can traverse the relationships: Article → Author (Person) → Organization. The structure tells them “this article is by this person who works for this organization.”

If you take only one thing from this article: ship the @graph structure. Don’t ship 4 separate <script> tags with disconnected schemas. The linked version is significantly more powerful.
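
A rough Python sketch of how a template could assemble the single block, assuming each entity is already a dictionary with its own @id. `render_graph` is a name invented here, and the escaping of `</` is a precaution of this sketch so that a string value cannot close the script tag early:

```python
import json


def render_graph(nodes: list[dict]) -> str:
    """Wrap entity nodes in one @graph and return the <script> tag."""
    ids = [node["@id"] for node in nodes]  # KeyError if a node has no @id
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate @id in graph")
    document = {"@context": "https://schema.org", "@graph": nodes}
    payload = json.dumps(document, indent=2, ensure_ascii=False)
    # "\/" is a valid JSON escape for "/", so the data is unchanged
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


graph_tag = render_graph([
    {
        "@type": "Article",
        "@id": "https://500k.io/journal/example#article",
        "headline": "Article title",
        "author": {"@id": "https://500k.io/about#maxime"},
        "publisher": {"@id": "https://500k.io#org"},
    },
    {"@type": "Person", "@id": "https://500k.io/about#maxime", "name": "Maxime Le Morillon"},
    {"@type": "Organization", "@id": "https://500k.io#org", "name": "500k.io"},
])
print(graph_tag)
```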

The 3 mistakes that kill discoverability

Mistake 1 — Schema/content mismatch

The schema says one thing, the visible page says another. Common examples:

  • FAQPage schema includes a question not visible on the page
  • HowTo schema lists steps not present in the article body
  • Article headline differs from the visible <h1>
  • datePublished doesn’t match the visible publish date

This is a critical failure. Google treats it as misleading structured data and either ignores the schema or, worse, removes the page from rich-results candidacy. AI engines may also de-prioritize the page entirely.

The fix: every schema field that has a visible analog on the page must match exactly. Build your templates so the schema reads from the same data source as the visible content. Don’t hand-write schema separately.
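
As a safety net on top of shared templates, a pre-deploy check can confirm that every FAQ question and answer in the schema appears in the page’s visible text. A minimal Python sketch using only the standard library; the function names are hypothetical, and the plain substring comparison is an assumption that will need loosening if your rendered text differs from the schema in punctuation or entities:

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, skipping <script> and <style> contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_text(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse all whitespace runs so line breaks do not cause false alarms
    return " ".join("".join(parser.chunks).split())


def find_faq_mismatches(faq_schema: dict, html: str) -> list[str]:
    """Return schema questions/answers that are absent from the visible page."""
    page = visible_text(html)
    missing = []
    for question in faq_schema.get("mainEntity", []):
        fields = (
            ("question", question["name"]),
            ("answer", question["acceptedAnswer"]["text"]),
        )
        for label, value in fields:
            if " ".join(value.split()) not in page:
                missing.append(f"{label}: {value}")
    return missing


page_html = """
<h1>Claude Code FAQ</h1>
<script type="application/ld+json">{"name": "How much does Claude Code cost?"}</script>
<h3>What is Claude Code?</h3>
<p>Claude Code is Anthropic's <strong>CLI coding agent</strong>.</p>
"""
schema = {
    "@type": "FAQPage",
    "mainEntity": [
        {"@type": "Question", "name": "What is Claude Code?",
         "acceptedAnswer": {"@type": "Answer", "text": "Claude Code is Anthropic's CLI coding agent."}},
        {"@type": "Question", "name": "How much does Claude Code cost?",
         "acceptedAnswer": {"@type": "Answer", "text": "Claude Code's Pro tier is $20/month."}},
    ],
}
mismatches = find_faq_mismatches(schema, page_html)
print(mismatches)
```

The second question exists only inside the JSON-LD script, not in the visible markup, so the check reports both its question and its answer.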

Mistake 2 — Broken nested types

Common when copy-pasting schema from blog posts. You declare author inline as { "@type": "Person", "name": "X" } while another node references a Person @id that no object in the graph actually defines.

The validators catch this. Run Google’s Rich Results Test on every new page template before shipping. If it shows errors, fix before deploy.
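
A local check can complement the validators by listing every @id that is referenced but never defined in the graph. A small Python sketch; it treats any nested object whose only key is @id as a reference, which is an assumption of this sketch and not a rule taken from a validator:

```python
def find_dangling_refs(graph_doc: dict) -> set[str]:
    """Return @id values that are referenced but not defined in @graph."""
    nodes = graph_doc.get("@graph", [])
    defined = {node["@id"] for node in nodes if "@id" in node}
    referenced = set()

    def walk(value, top_level):
        if isinstance(value, dict):
            # A nested {"@id": "..."} with no other keys is a pure reference
            if not top_level and set(value) == {"@id"}:
                referenced.add(value["@id"])
            for child in value.values():
                walk(child, False)
        elif isinstance(value, list):
            for child in value:
                walk(child, False)

    for node in nodes:
        walk(node, True)
    return referenced - defined


graph_doc = {
    "@context": "https://schema.org",
    "@graph": [
        {
            "@type": "Article",
            "@id": "https://500k.io/journal/example#article",
            "author": {"@id": "https://500k.io/about#maxime"},
            "publisher": {"@id": "https://500k.io#org"},
        },
        {"@type": "Person", "@id": "https://500k.io/about#maxime", "name": "Maxime Le Morillon"},
    ],
}
dangling = find_dangling_refs(graph_doc)
print(dangling)
```

Here the Organization node is missing, so the publisher reference is reported.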

Mistake 3 — Missing or wrong @id wiring

If you’re using @graph, every entity needs a unique @id and references should resolve. Common errors:

  • Article references publisher but doesn’t @id-link to the Organization
  • Organization referenced multiple times but with slight URL variations (with vs without trailing slash, http vs https)
  • Person @id differs between pages (one says /about#maxime, another says /author/maxime#person)

The fix: pick canonical @id URLs for your Person, Organization, and Article types and use them consistently across every page.
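
The slash and scheme variations are mechanical enough to detect automatically. A hedged Python sketch that normalizes @id URLs and groups spellings that collapse to the same canonical form; the normalization rules (force https, lowercase host, drop trailing slash and query string) are choices made for this example. It will not catch genuinely different paths such as /about#maxime versus /author/maxime#person, which still need a human decision.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def canonical_id(url: str) -> str:
    """Normalize an @id: https scheme, lowercase host, no trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    return urlunsplit(("https", parts.netloc.lower(), path, "", parts.fragment))


def find_id_variants(ids: list[str]) -> dict[str, list[str]]:
    """Group raw @id strings that differ only by scheme, case, or slash."""
    groups = defaultdict(set)
    for raw in ids:
        groups[canonical_id(raw)].add(raw)
    return {canon: sorted(raws) for canon, raws in groups.items() if len(raws) > 1}


collected_ids = [
    "https://500k.io#org",
    "http://500k.io/#org",
    "https://500k.io/about#maxime",
]
variants = find_id_variants(collected_ids)
print(variants)
```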

Testing schema before you ship

Three tools I use, in order:

Google Rich Results Test

The official tool. Tests whether your schema qualifies for rich snippets and which types are detected.

Schema Markup Validator

The Schema.org validator. Tests pure schema.org compliance (vs just Google’s subset).

  • URL: https://validator.schema.org/
  • Use case: pages where you use schema types Google doesn’t support but AI engines may
  • What it catches: vocabulary errors, deprecated fields, type relationship issues

Bing URL Inspection (in Webmaster Tools)

Same idea as Google’s Search Console, for Bing.

  • URL: Bing Webmaster Tools
  • Use case: verifying Bing’s index is parsing your schema
  • What it catches: Bing-specific edge cases

The discipline: every new page template gets validated in the Rich Results Test and the Schema Markup Validator before deploy. The Bing check is a monthly sanity check.
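
Before pasting a page into the online tools, a quick local pass can confirm that every JSON-LD block at least parses. A minimal Python sketch with the standard library; `extract_jsonld` is a hypothetical helper and it only checks JSON syntax, not schema.org vocabulary:

```python
import json
from html.parser import HTMLParser


class _JsonLdExtractor(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buffer = None

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.blocks.append("".join(self._buffer))
            self._buffer = None


def extract_jsonld(html: str) -> tuple[list, list[str]]:
    """Return (parsed blocks, error messages for blocks that fail to parse)."""
    parser = _JsonLdExtractor()
    parser.feed(html)
    parser.close()
    parsed, errors = [], []
    for raw in parser.blocks:
        try:
            parsed.append(json.loads(raw))
        except json.JSONDecodeError as exc:
            errors.append(str(exc))
    return parsed, errors


sample_html = """
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "Article"}</script>
<script type="application/ld+json">{"@type": "FAQPage",}</script>
<script>console.log("not JSON-LD")</script>
"""
parsed_blocks, parse_errors = extract_jsonld(sample_html)
print(len(parsed_blocks), "valid block(s);", len(parse_errors), "error(s)")
```

The trailing comma in the second block is the kind of copy-paste slip this catches before a validator round trip.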

Schema for AI engines (the 2026 nuances)

Some specific things AI engines look for that aren’t well-documented:

Speakable schema (read-aloud)

For content meant to be read by voice assistants:

{
  "@type": "WebPage",
  "speakable": {
    "@type": "SpeakableSpecification",
    "cssSelector": ["h1", ".tldr-box", ".summary-paragraph"]
  }
}

Useful if your audience is likely to use Google Assistant, Alexa, or similar voice search. Marginal for most founder use cases but ship if your content has a strong audio dimension.

MainEntity wiring

For articles that are explicitly ABOUT a specific thing (a tool, a person, a concept), wire the page to a dedicated entity: mainEntity in the general case, or itemReviewed on a Review. Example for a tool review:

{
  "@type": "Review",
  "itemReviewed": {
    "@type": "SoftwareApplication",
    "name": "Claude Code",
    "applicationCategory": "DeveloperApplication"
  },
  "author": { "@id": "https://500k.io/about#maxime" },
  "reviewRating": {
    "@type": "Rating",
    "ratingValue": "4.5",
    "bestRating": "5"
  }
}

This helps AI engines understand “this article is reviewing X” rather than “this article mentions X.” Cleaner attribution.

Self-referencing mainEntityOfPage

A small detail with outsized impact: include mainEntityOfPage pointing to the canonical URL of the article. It sounds redundant, but it’s the field Google uses to confirm “this is the main content of this URL.”

"mainEntityOfPage": "https://500k.io/journal/schema-org-for-ai-search-deep-dive"

Always include it. Always exact match to canonical URL.
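
The exact-match rule is easy to automate. A sketch in Python that reads the page’s `<link rel="canonical">` and compares it with the schema value; the helper names are invented for this example, and it accepts both the plain-string form used above and the object form with an @id:

```python
from html.parser import HTMLParser


class _CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical">."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        if tag == "link" and rel == "canonical" and self.href is None:
            self.href = attributes.get("href")


def canonical_url(html: str):
    finder = _CanonicalFinder()
    finder.feed(html)
    finder.close()
    return finder.href


def main_entity_url(node: dict):
    """Read mainEntityOfPage as a string or as an object with an @id."""
    value = node.get("mainEntityOfPage")
    if isinstance(value, dict):
        return value.get("@id")
    return value


def matches_canonical(node: dict, html: str) -> bool:
    url = canonical_url(html)
    return url is not None and main_entity_url(node) == url


head_html = '<head><link rel="canonical" href="https://500k.io/journal/schema-org-for-ai-search-deep-dive"></head>'
article_node = {
    "@type": "Article",
    "mainEntityOfPage": "https://500k.io/journal/schema-org-for-ai-search-deep-dive",
}
print(matches_canonical(article_node, head_html))
```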

How to start: the 30-minute schema MVP

If your site has zero schema today, the 30-minute starter:

  1. Ship Article schema on every blog post (15 min — implement once in your template)
  2. Ship Organization + Person schema on the homepage and about page (10 min)
  3. Validate everything with Rich Results Test (5 min)

Don’t try to ship all 6 types on day 1. Get Article + Organization + Person right first. Add FAQPage and HowTo when you have content that warrants them. Add BreadcrumbList when you’re cleaning up site structure.

The full @graph upgrade comes when you’ve shipped 30+ articles and want the citation lift. Don’t optimize for citations on day 1 of a new site; ship the basic types and earn the right to optimize.

The honest single-paragraph summary

Schema.org markup is the structured-data layer AI engines use to understand what your content is. In 2026, properly-marked content gets cited at 2-3x the rate of unmarked equivalents. The 6 types that matter for solopreneurs: Article, FAQPage, HowTo, BreadcrumbList, Organization, Person. Use JSON-LD, wire everything in a single @graph block with @id references, validate with Google’s Rich Results Test, ship. Avoid the 3 mistakes (schema/content mismatch, broken nested types, inconsistent @id). Schema is no longer optional for serious GEO work — it’s the difference between citation eligibility and invisibility.

For the wider GEO ecosystem, see how I track AI citations and the broader marketing automation playbook. For the operational layer, n8n + AI workflows covers the automation that ships and maintains schema at scale.

FAQ

Do AI engines actually use schema.org markup?

Yes. Google AI Overviews uses schema heavily — articles with FAQPage and HowTo schema get cited at roughly 2-3x the rate of unstructured equivalents. Perplexity uses it indirectly (via Google's index). ChatGPT's web search uses it less directly but the underlying training data is shaped by schema-aware crawlers. Don't skip it.

Which schema types matter most for solopreneurs?

Six that pay back the time investment: Article, FAQPage, HowTo, BreadcrumbList, Organization, Person. The remaining 790-plus types are mostly noise for solo founder use cases. Get those 6 right and you're at 90% of the achievable schema lift.

JSON-LD or microdata?

JSON-LD. Always. Microdata is dying and Google's own docs prefer JSON-LD. It's also easier to maintain in a content collection system because the schema lives in your template, not interleaved with your content.

How do I test my schema?

Two tools: Google's Rich Results Test (free) for validation, and Schema.org's Schema Markup Validator for compliance. Run both on every new page template. If either errors, fix before shipping. Errors don't just lose ranking — they can hide your page from AI engines entirely.

What's the most common schema mistake?

Inconsistency between schema and visible content. If your FAQPage schema says you answer a question but the visible page doesn't include that exact answer, you risk being flagged as misleading. The schema must reflect what's actually on the page. AI engines cross-check.