Chad Hetherington

YouTube recently became the second-most-cited social platform in Google’s AI-powered search experiences, according to an OtterlyAI study. YouTube and Reddit, which is No. 1 overall, together dominate AI search citations from social media platforms, accounting for 78.2% of the category. The remaining six platforms (LinkedIn, TikTok, Instagram, Facebook, X and Quora) combine for only 21.8% of citations.

It’s clear that YouTube plays an increasingly important role in AI visibility — particularly within Google’s AI products and Perplexity, where YouTube is cited the most in generated results by far. But why are AI crawlers so hungry for video content? And how can you take advantage to improve your brand’s SEO, generative engine optimization (GEO) and online visibility strategies?

I spoke with Jodie Bedeker, Manager of SEO at Brafton, to learn what’s driving this huge uptick in video citations, and the tactics and GEO best practices to help get your brand in front of more faces within AI search using YouTube content.

What Kinds of YouTube Videos Does Google Cite?

Google’s crawlers aren’t just plucking any YouTube video to serve as a supplement to its generative text-based answers. Much like written content, the algorithm favors direct, educational and structured videos that provide value.

According to Jodie, AI systems — especially Google AI Overviews and LLM-driven search experiences — appear to favor videos that are:

  • Educational.
  • Concise.
  • Highly engaging.
  • Structured for easy interpretation by crawlers.
  • Directly answering specific queries.

These characteristics are common in specific types of video content, which means video style and format play a significant role in crawlability.

“How-to content is probably your biggest category that’s getting cited,” said Jodie. “Anything that’s addressing longtail queries, like ‘How does this work?’, ‘How do I do that?’ or ‘What does this mean?’ That’s the type of content getting picked up by LLMs.”

Does Video Length Matter?

Research suggests that long-form video content outperforms YouTube Shorts when it comes to AI citation in general. However, shorter videos still offer value within Google’s AI ecosystem in particular, where AI Overviews (AIOs) and AI Mode cite YouTube Shorts significantly more often than other AI engines do.

While short clips still account for a relatively small share of overall citations compared to longer video content, their value extends beyond citations to engagement.

“Higher engagement metrics are often associated with channels that have shorts that are typically 30 to 90 seconds long at the most,” said Jodie.

A straightforward approach to video creation for AI visibility, then, would be to gain citations with long-form content and keep captured users engaged with shorter videos. It’s a virtuous cycle from there, as shorter videos that boost engagement tend to drive users toward full-length videos, notes Jodie. An effective strategy needs both types of video content.

Do Thumbnails Matter?

AI crawlers are indeed capable of scanning thumbnails thanks to AI-powered optical character recognition (OCR) technology. They mostly read text, but they can also piece together context from images.

“We assume thumbnail design is very user-centric (which it naturally is),” said Jodie, “but AI crawlers have advanced enough to read image text overlays and identify objects on thumbnails — and then use that insight to determine whether the video matches the query.”

Despite the clickbait-style thumbnails you’ve probably seen on YouTube, best practice for getting picked up by AI search tools is to use text strategically and have clear, compelling visuals that users want to click — and crawlers can scan easily.

How To Optimize Videos for AI Visibility

The throughline here is clear: Video is becoming more than just a supplementary content format. Platforms like Google are increasingly surfacing YouTube videos in AI search experiences, creating new opportunities for brands to improve visibility through video content.

It’s not enough to just press ‘Publish’ on a brand video, though. Marketers need to think carefully about how videos are structured, labeled and presented so both users and AI systems can quickly understand and navigate them.

Include Informative Metadata

Video optimization for AI visibility extends beyond just shooting the video.

“The structured text and metadata are particularly important for Google Search results as a first step to assessing video content for citations. If the metadata isn’t properly formatted, Google’s crawlers may never even analyze or fetch the video at all,” said Jodie.

AI systems use this metadata as an initial filtering layer — scanning structured text first before deciding whether to analyze the video further, according to Jodie.

More than just an in-ecosystem SEO consideration, YouTube video metadata is becoming part of the infrastructure that helps AI systems interpret, categorize and retrieve video content in search experiences. To that end, it needs to be clear, structured and optimized — just like traditional website-based SEO content.
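
To make the idea of "clear, structured" metadata a little more concrete, here is a minimal sketch of a pre-publish check written in Python. Everything in it is an assumption made for illustration: the VideoMetadata class, the metadata_issues function and the 100-character description threshold are hypothetical, and none of them reflect a documented YouTube or Google requirement.

```python
from dataclasses import dataclass, field


@dataclass
class VideoMetadata:
    """Hypothetical container for the text fields an uploader controls."""
    title: str
    description: str
    tags: list[str] = field(default_factory=list)


def metadata_issues(meta: VideoMetadata, target_query: str) -> list[str]:
    """Return human-readable warnings; an empty list means nothing was flagged."""
    issues = []
    query_words = set(target_query.lower().split())

    if not meta.title.strip():
        issues.append("title is empty")
    elif not query_words & set(meta.title.lower().split()):
        issues.append("title shares no words with the target query")

    # Illustrative threshold only, not a YouTube or Google requirement.
    if len(meta.description.strip()) < 100:
        issues.append("description is shorter than 100 characters")

    if not meta.tags:
        issues.append("no tags set")

    return issues
```

A check like this only catches obvious gaps, such as an empty title or a missing description. Whether the metadata actually reads well is still a job for a human editor.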

Video Structure and Accessibility

AI systems attempt to navigate and extract information directly from videos as efficiently as possible, which means structure and accessibility are paramount.

“The crawler scans through the text data [titles, descriptions, and tags] first to create a shortlist. Then, it ‘listens’ to the videos on that shortlist to verify context before deciding what gets pulled into search results,” said Jodie.

When a crawler “listens” to a video, it isn’t really listening, of course. It is interpreting the video’s content based on a few core levers that uploaders have complete control over.

Timestamps

Once crawlers have that shortlist, they dig deeper into each video to choose the most suitable option. This is where elements like timestamps and captions come into play.

“LLM crawlers will know where to go immediately,” said Jodie.

Timestamps help AI systems, mostly Google’s crawlers, identify where specific topics are discussed within a video. They act like chapters, each of which can be independently cited within AIOs. That means one longer, appropriately timestamped video can receive multiple unique AI citations. In fact, timestamped videos commonly receive 2-5 citations, according to the OtterlyAI study.

This navigational tool, originally designed for human viewers, has become an important element in optimizing for AI search, too. It effectively reduces the need for AI systems to “deep dive” through entire videos to locate relevant information when judging a video’s suitability for an AIO.
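
In practice, timestamps are added as a list of lines in the video description. The Python sketch below shows one way to generate those lines from a list of chapter start times. The function names are hypothetical, and the validation rules (a first chapter at 0:00, at least three chapters, ascending order, each at least 10 seconds long) follow YouTube's published chapter guidelines as best understood at the time of writing, so check the current help pages before relying on them.

```python
def format_timestamp(seconds: int) -> str:
    """Render seconds as M:SS, or H:MM:SS once the video passes an hour."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def build_chapter_lines(chapters: list[tuple[int, str]]) -> list[str]:
    """Validate (start_second, title) pairs and return description lines."""
    if len(chapters) < 3:
        raise ValueError("need at least three chapters")
    if chapters[0][0] != 0:
        raise ValueError("first chapter must start at 0:00")
    # Compare each chapter with the next one to check order and length.
    for (start, _), (next_start, _) in zip(chapters, chapters[1:]):
        if next_start - start < 10:
            raise ValueError("chapters must ascend and last at least 10 seconds")
    return [f"{format_timestamp(start)} {title}" for start, title in chapters]
```

For example, passing the chapters (0, "Intro"), (45, "How it works") and (3725, "Recap") yields the lines "0:00 Intro", "0:45 How it works" and "1:02:05 Recap", ready to paste into a description.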

Subtitles & Transcripts

For similar reasons, video subtitles and transcripts are becoming more vital for AI visibility. Jodie cautions against relying entirely on auto-generated captions, however, as they’re often messy, too literal or entirely incorrect and therefore difficult for AI systems to interpret effectively.

“Having a cleaned-up, clearly structured transcript is going to be very valuable for LLM crawlers in particular,” she said. That means generating transcripts outside of YouTube or writing them manually, and always having a human check them over for accuracy.
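
As one possible workflow, a transcript prepared outside of YouTube can be saved as a SubRip (.srt) caption file and uploaded alongside the video. The Python sketch below is an illustration under stated assumptions: SRT is simply a convenient, widely supported format chosen for this example, the function names are hypothetical, and the tidy step only normalizes whitespace and capitalization. It is a starting point for human review, not a replacement for it.

```python
import re


def srt_time(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def tidy(text: str) -> str:
    """Collapse stray whitespace and capitalize the first letter."""
    text = re.sub(r"\s+", " ", text).strip()
    return text[:1].upper() + text[1:]


def to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Turn (start, end, text) segments into numbered SRT caption blocks."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(f"{index}\n{srt_time(start)} --> {srt_time(end)}\n{tidy(text)}\n")
    # A blank line separates consecutive caption blocks.
    return "\n".join(blocks)
```

Each segment becomes a numbered block with a start and end time, which keeps the transcript aligned with the audio once a person has checked the wording.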

Addressing the Target Query

It’s no surprise that AI systems value clear and direct answers to user questions. That’s true of written content, and is becoming increasingly important for video content too, if you hope to get cited in an AIO. Typically, videos should aim to get to the point quickly — within the first 30 seconds, advised Jodie. Videos that directly address queries (and quickly) align with how AI-driven search systems surface content.

Videos optimized for AI visibility are not just visually engaging; they’re structured, navigable and easy for machines to interpret. That means using accurate and descriptive metadata — from descriptions to transcripts — and being mindful of the query the video is designed to target, addressing it quickly or otherwise having organized and accurate timestamps to guide crawlers.

The Future of GEO Is Multimodal

Marketers, writers, SEOs and strategists are no strangers to multimodal content. But much of the history of the content industry has seen us use things like graphics and videos as supporting content for largely website- and therefore text-based optimization strategies. That’s changing.

AI systems are beginning to pull information from multiple content formats — like organic discussions on Reddit and YouTube videos — to generate answers and provide users with more context for their long-tail, more conversational queries.

That means YouTube is no longer just a branding or engagement channel for B2B and B2C businesses. It’s effectively becoming a searchable knowledge base for AI systems. Brands that invest in educational, well-structured and easily navigable video content now will be better positioned to improve visibility across AI-powered discovery experiences.