April 20, 2026

68 Million AI Crawler Visits Show What Drives AI Search Visibility


A new analysis of 858,457 sites hosted on the Duda platform shows how AI crawlers are interacting with websites at scale. The data offers a clearer view of how crawling activity is growing and what SEOs and businesses should do to increase traffic from AI search.

AI Crawling Has Already Reached Scale

AI crawling is growing quickly, with more requests tied to real-time answers and most of that activity coming from a single provider. The data reveals a pattern that shows which sites are being crawled and, more importantly, why.

Year-Over-Year Growth In LLM Referrals

LLM referral traffic has increased sharply over the past year, with multiple platforms showing meaningful gains from very different starting points.

AI Referral Traffic Patterns

  • Total LLM referrals: 93,484 to 161,469 (+72.7%)
  • ChatGPT: 81,652 to 136,095 (+66.7%)
  • Claude: 106 to 2,488 (23x growth)
  • Copilot: 22 to 9,560 (from near-zero)
  • Perplexity: 11,533 to 13,157 (+14.1%)

Growth is not happening evenly, but across the board, referral traffic from AI systems is increasing. That makes AI-generated discovery a growing source of traffic, not a marginal one.

Crawlers Are Increasingly Fetching Content To Ground Answers

AI crawlers are no longer used primarily for indexing, with most activity now tied to retrieving content in real time to generate answers for users.

Most crawling is now happening in response to user queries rather than for building an index, which changes how content is accessed and used.

  • User Fetch (real-time answers): 56.9% of all crawler activity, driven almost entirely by ChatGPT
  • Training (model learning): 28.8%, split across GPTBot and other model crawlers
  • Discovery (content indexing): 14.3%, distributed across multiple systems
  • ChatGPT User Fetch volume: ~39.8 million visits

The trends are largely driven by ChatGPT, which is responsible for nearly all real-time retrieval activity. That means the move toward answer-based crawling is not evenly distributed, but concentrated in one platform shaping how content is accessed. This trend may change with Google’s new Google-Agent crawler.
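The three purposes above map to distinct crawler user agents. As a sketch, OpenAI documents separate tokens for each purpose (GPTBot for training, ChatGPT-User for real-time fetches on behalf of users, and OAI-SearchBot for search indexing), so a robots.txt file can allow or restrict each independently:

```text
# Illustrative robots.txt: each OpenAI crawler purpose controlled separately.
# Check each provider's current documentation for up-to-date token names.

# Model training
User-agent: GPTBot
Allow: /

# Real-time fetches triggered by user questions
User-agent: ChatGPT-User
Allow: /

# Search indexing / discovery
User-agent: OAI-SearchBot
Allow: /
```

Blocking GPTBot alone, for example, opts a site out of training data collection without cutting off the real-time fetches that drive referral traffic.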

Market Concentration In AI Crawling

AI crawler activity is heavily concentrated, with OpenAI responsible for the vast majority of requests, reflecting its position as the primary tool users rely on to find and retrieve information.

  • OpenAI: 55.8 million visits (81.0%)
  • Anthropic (Claude): 11.5 million (16.6%)
  • Perplexity: 1.3 million (1.8%)
  • Google (Gemini): 380,000 (0.6%)

Most AI crawling activity comes from OpenAI, which aligns with ChatGPT’s role as a primary tool for finding and retrieving information. Claude follows at a much smaller share, suggesting a different usage pattern, while the rest of the market accounts for a minimal portion of crawler activity.

Scale And What That Actually Means

AI crawling is already operating across a large portion of the web, reaching hundreds of thousands of sites and generating tens of millions of requests in a single month.

More than half of all sites in the dataset received at least one AI crawler visit, showing that this activity is not limited to a small subset of websites.

  • Total sites analyzed: 858,457
  • Sites with at least one AI crawler visit: 506,910 (59%)
  • Total AI crawler visits (Feb 2026): 68.9 million

AI crawling is not isolated to high-profile or heavily trafficked sites. It is already widespread, with consistent activity across a majority of the web.

The Relationship Between Crawling and Real Traffic

Sites that allow AI systems to crawl them consistently show stronger engagement across multiple metrics.

What the data actually shows is:

  1. Sites that allow AI crawling receive significantly more human traffic
  2. Higher-traffic sites are more likely to be crawled

Sites that allow crawling by AI systems receive significantly more human traffic, averaging 527.7 sessions compared to 164.9 for sites that are not crawled. This does not establish causation, but it shows a clear alignment between sites that attract human visitors and how often AI systems revisit them.

  • Average human traffic (AI-crawled vs not): 527.7 vs 164.9 (3.2x higher)
  • Average form completions: 4.17 vs 1.57 (2.7x higher)
  • Average click-to-call: 8.62 vs 3.46 (2.5x higher)
  • Sites with 10K+ sessions: 90.5% crawl rate

AI systems are not discovering weak or inactive sites and lifting them up. They are returning to sites that already attract human visitors. For marketers, that shifts the focus away from trying to “get crawled” and toward building real audience demand, since visibility in AI systems appears to follow it.

What Correlates With More Crawling

The research compared sites that include specific third-party integrations, structured features, and content depth against those that do not, identifying which factors mattered most for AI crawler activity and referrals.

Across the dataset, 59% of sites received at least one AI crawler visit in February 2026. Sites that are crawled more often tend to combine three types of signals: external integrations, structured business data, and content depth.

1. External Integrations

These integrations connect the site to external systems that validate and distribute business information.

  • Yext integration: 97.1% crawl rate vs ~58% without (+38.9pp)
  • Reviews integrations: 89.8% crawl rate vs 58.8% without, 376.9 average crawler visits

Sites connected to external data and review systems are crawled at higher rates and revisited more frequently, indicating that AI systems rely on these integrations as signals that a business is real, verifiable, and worth revisiting.

2. Structured Site Features And Business Data

These are built into the site and help AI systems understand and verify business identity.

  • Google Business Profile sync: 92.8% crawl rate vs 58.9% without, 415.6 average crawler visits
  • Local schema: 72.3% vs 55.2% (+17.1pp), 22.3% adoption
  • Dynamic pages: 69.4% vs 58.2% (+11.2pp)
  • Ecommerce: 54.2% vs 59.2% (-5.0pp)

Sites that clearly define their business identity and structure their information in a machine-readable way are crawled more often, showing that AI systems favor sites they can easily interpret, verify, and extract information from.

3. Content Depth (Volume Of Usable Data)

Sites with more content provide more opportunities for AI systems to retrieve, reference, and reuse information in responses.

  • Sites with 50+ blog posts: 1,373.7 average crawler visits vs 41.6 with no blog (~33x higher)

Sites with more content are crawled far more often, indicating that AI systems may return to sources that offer a larger supply of usable information to draw from when generating answers.

Local Business Schema Completeness = More Crawling

This part of the research focuses specifically on local business schema, comparing how the completeness of schema implementation for communicating business details relates to AI crawler activity. The fields measured include business name, phone number, address, hours, and social profiles.

  • No local schema fields: 55.2% crawl rate
  • 10–11 completed schema fields: 82% crawl rate
  • Sites with more complete local schema show a 26.8 percentage point higher crawl rate (82% vs 55.2%)

Sites that provide more complete local business information in structured form are crawled more often and receive more crawler visits. As more of these fields are filled in, both crawl rate and crawl frequency increase.

The data shows that clearly defined local business data makes a site easier for AI systems to identify, verify, and subsequently revisit, which are the prerequisites for receiving traffic from AI search.
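The fields measured above map directly onto schema.org's LocalBusiness vocabulary expressed as JSON-LD. A minimal sketch covering name, phone, address, hours, and social profiles; all values are placeholders:

```json
{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Example Bakery",
  "telephone": "+1-555-0100",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "123 Main St",
    "addressLocality": "Springfield",
    "addressRegion": "IL",
    "postalCode": "62701"
  },
  "openingHoursSpecification": [{
    "@type": "OpeningHoursSpecification",
    "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
    "opens": "08:00",
    "closes": "17:00"
  }],
  "sameAs": [
    "https://www.facebook.com/example-bakery",
    "https://www.instagram.com/example-bakery"
  ]
}
```

Each additional field filled in moves a site further along the 0-to-11 completeness scale the study measured, and the data suggests crawl rate rises with it.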

Takeaways

AI crawling is a parallel method for content discovery, and the research shows clear patterns among the sites that crawlers visit most often.

  • AI crawling operates alongside traditional search, changing how content is accessed and reused
  • Sites with structured local signals, deeper content, and more complete schema are crawled more often
  • Multiple reinforcing signals appear together on the same sites, not in isolation
  • The data shows direction, not causation, but the patterns are consistent

The data shows that sites that make it easy for AI crawlers to index and revisit them tend to perform better. Sites that present clear, structured, and verifiable information, while continuing to build real audience demand, are more likely to be revisited by AI systems and to benefit from traffic generated through AI search.

Read the research: Duda study finds AI-optimized websites drive 320% more traffic to local businesses

Featured Image by Shutterstock/Preaapluem


