“Perplexity’s grand theft AI” was the headline of The Verge's article accusing Perplexity.ai, an AI search engine startup, of being a rent-seeking middleman on high-quality sources.
The value proposition of traditional search engines, such as Google's, was that in exchange for crawling the work done by journalists and publishers, their results pointed users to the source pages, which brought publishers the bulk of their web traffic. Answer engines like Perplexity, however, provide a direct answer, starving publishers of that traffic.
Traditional search works by crawling the web and pointing users to pages that contain the queried information.
An answer engine provides the queried information within the search page, thus eliminating the need to visit the source in most cases.
Perplexity is among a group of vampires, including Arc Search and Google itself (currently to a limited extent), that deprive publishers of ad revenue and keep it for themselves instead.
While these products compete directly with search, chatbots such as OpenAI’s ChatGPT and Microsoft’s Bing can also take away a portion of publishers’ search-driven traffic by answering prompts without requiring the user to click through to an article.
Think about the implications: if ChatGPT, Google Gemini, or other AI tools could eventually answer any question, most of your audience would never read your article.
Search is an important revenue driver. In a recent Digiday survey of publisher pros, most respondents said that search had a “moderately significant” to “very significant” impact on their annual outcomes. They also agreed that the impact of AI on search and referral traffic was their primary concern.
Source: Digiday, The State Of Publisher Traffic, Framing The Evolution And Impact Of Search And Referral In 2024
For publishers and marketers alike, a decline in search can upset practically every business metric:
Source: Digiday, The State Of Publisher Traffic, Framing The Evolution And Impact Of Search And Referral In 2024
Answer engines and AI chatbots are not the only threats. Digital media publishers today face various other challenges related to search traffic:
- Google is thrusting the Privacy Sandbox upon publishers, forcing them to rethink how they earn advertising revenue from passing web users via third-party cookies.
- Made-for-advertising (MFA) sites, including Forbes, masquerade as prime real estate for online advertising, luring gullible advertisers into their web of trickery.
- AI-generated spam is flooding the Internet, stealing valuable search rankings and ad revenue.
- Apple is developing a web eraser/ad blocker that will be built into Safari, the most popular mobile web browser in the United States (the feature, though ready, has not been included in the public release yet).
- Walled gardens dominate time and attention, leaving little left over for the open web.
The ad-supported open web, where a rich diversity of publishers could flourish, is facing extreme contraction. But what can digital media publishers do? What can they hope for in the future of an AI-driven web?
This blog explores strategies publishers could implement and major developments to anticipate in the coming years as AI becomes a vital component of the online ecosystem.
What Can Publishers Do?
Answer Engine Optimization (AEO)
AEO is the practice of optimizing and structuring content to increase its likelihood of being used as source material by answer engines. AEO is to answer engines what SEO is to search engines.
Though these tools aim to provide direct answers, they often cite the sources from which they pulled information. The opportunity to gain referral traffic is much smaller than with traditional search engines, but it is an opportunity nonetheless.
Publishers can use a host of AEO strategies, including:
- Identify common questions: think in terms of questions, not keywords, and understand your audience’s intent.
- Optimize for featured snippets: Google’s featured snippets are often the source of content for AI answers. They provide quick, authoritative answers to queries, which is exactly what answer engines seek. Make your content clear, direct, and structured so it can be featured in these snippets easily.
- Optimize for voice search: the latest AI models, such as OpenAI’s GPT-4o, can reason across audio, vision, and text in real time. This may eventually shift reader preferences toward voice. Optimizing for voice search involves targeting long-tail keywords and structuring content in a question-answer format that matches how voice queries are phrased.
- Implement structured data: structured data, also known as schema markup, is code publishers can add to their websites to help search engines and AI systems better understand content. It annotates content elements to clarify their purpose and the type of information they hold.
- Monitor and analyze reader behavior: this helps determine how the audience perceives the content, whether it is performing well, and whether the strategy needs to be adjusted.
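To make the structured data point concrete, here is a minimal sketch of generating schema.org FAQPage markup as JSON-LD. The helper names (`build_faq_jsonld`, `to_script_tag`) and the sample question are hypothetical, chosen only for illustration; which schema types a given search or answer engine actually uses is something to confirm in that engine's own documentation.

```python
import json


def build_faq_jsonld(qa_pairs):
    """Return a schema.org FAQPage object for (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }


def to_script_tag(data):
    """Wrap JSON-LD in the script tag that is placed in the page's HTML."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so text inside the JSON cannot close the script tag early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical sample content, not taken from a real page.
faq = build_faq_jsonld([
    (
        "What is answer engine optimization?",
        "Structuring content so answer engines can use it as source material.",
    ),
])
print(to_script_tag(faq))
```

The markup mirrors the question-answer structure recommended above, so the same editorial work serves both readers and machines.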
Protect your “Protected” Content from AI Bots
The decision to block or allow AI bots is complex and requires careful consideration. Publishers must balance content protection, data security, and web visibility.
According to a Press Gazette analysis, just over 40% of the top 100 English news websites allow AI crawlers, while over half of the top 106 websites have blocked OpenAI. Either way, there are pros and cons.
Preventing AI bot crawling can reduce unauthorized content scraping, safeguard intellectual property, and help ensure your work is not used without permission. Blocking also eases server load, because every crawl consumes resources. It gives publishers more control over their content and its use: they can decide who can access it and for what purpose. It also protects against unwanted associations, preserving the brand's integrity.
Conversely, blocking AI bots, particularly those associated with search engines, can reduce visibility and indexing, resulting in missed opportunities. It also limits collaborative opportunities with researchers or readers interested in using data for legitimate purposes. Improperly configured settings can, at times, inadvertently exclude legitimate crawlers like data tracking and analysis bots, which can lead to missed opportunities for optimization and improvement.
Publishers must weigh the following before they decide whether to allow bot crawling:
- Assess specific needs and objectives: evaluate your site and content’s objectives, needs, and concerns. Consider factors such as the type of content, its value, and the potential risks or benefits associated with allowing or blocking AI bots.
- Explore alternative solutions: consider measures that balance content protection and data availability. For example, rate limiting, user-agent restrictions, terms of use, or API access limits can help manage AI bot access while still allowing valuable data to be used.
- Regularly review and update robots.txt: ensure it aligns with your current strategy and circumstances. Assess the effectiveness of implemented measures and adjust as needed to accommodate changing threats, goals, or partnerships.
- Stay informed: keep updated with industry guidelines, best practices, and legal regulations regarding AI bots and web scraping. Familiarize yourself with relevant policies and ensure compliance with applicable laws or regulations.
- Consider collaboration opportunities: explore potential collaborations with AI researchers, organizations, or developers. Look for partnerships with mutually beneficial outcomes, such as exchanging knowledge, research insights, or other advancements in the AI field.
- Seek professional advice: consult SEO, legal, and AI specialists based on your needs and goals. Cloudflare, for example, offers a bot management service to control good and bad bots in real time with speed and accuracy.
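Of the alternatives listed above, rate limiting combined with user-agent checks is the easiest to sketch. The following is a minimal, in-memory token bucket that throttles requests whose User-Agent header contains a listed crawler name. The class names, the limits, and the list of agents are assumptions for illustration, not a recommended policy, and a production setup would more likely enforce this at the CDN or web server.

```python
import time

# Example crawler names to throttle (an assumed list; adjust to your policy).
THROTTLED_AGENTS = ("GPTBot", "CCBot", "PerplexityBot")


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, now=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.now = now
        self.updated = now()

    def allow(self):
        current = self.now()
        elapsed = current - self.updated
        self.updated = current
        # Refill for the time that has passed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class BotRateLimiter:
    """Keep one bucket per throttled crawler; other traffic is not limited."""

    def __init__(self, rate=1.0, capacity=5, now=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.now = now
        self.buckets = {}

    def allow(self, user_agent):
        lowered = user_agent.lower()
        for name in THROTTLED_AGENTS:
            if name.lower() in lowered:
                if name not in self.buckets:
                    self.buckets[name] = TokenBucket(
                        self.rate, self.capacity, self.now
                    )
                return self.buckets[name].allow()
        return True  # not a listed crawler: pass through


limiter = BotRateLimiter(rate=1.0, capacity=5)
if not limiter.allow("Mozilla/5.0 (compatible; GPTBot/1.0)"):
    print("respond with HTTP 429 Too Many Requests")
```

Because User-Agent headers are self-reported, this only manages crawlers that identify themselves honestly.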
Blocking AI crawler bots involves modifying the site’s robots.txt file. This dedicated article explains the process in more detail.
Source: nixCraft
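As a rough illustration of the robots.txt changes described above, the sketch below generates rules that disallow a few AI crawlers and then checks them with Python's standard `urllib.robotparser`. The crawler names are examples, the URL is a placeholder, and `build_robots_txt` is a hypothetical helper; check each vendor's documentation for the user-agent tokens it currently uses.

```python
from urllib.robotparser import RobotFileParser

# Example crawler tokens (an assumed list, verify against vendor docs).
AI_CRAWLERS = ["GPTBot", "CCBot", "PerplexityBot"]


def build_robots_txt(blocked_agents):
    """Disallow the whole site for each listed agent, allow everyone else."""
    lines = []
    for agent in blocked_agents:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines += ["User-agent: *", "Allow: /"]
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt(AI_CRAWLERS)
print(robots_txt)

# Parse the generated rules to confirm they behave as intended.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

url = "https://example.com/news/story"  # placeholder URL
print("GPTBot allowed:", parser.can_fetch("GPTBot", url))
print("Googlebot allowed:", parser.can_fetch("Googlebot", url))
```

Checking the file programmatically helps avoid the misconfiguration risk of accidentally excluding legitimate crawlers. Note that robots.txt is a voluntary convention: it only stops crawlers that choose to honor it.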
Pivot To Video
Pivot to video is not new. In 2016, news publishers lost their heads over video after Facebook pushed live and other formats with the promise of monetization. Many hired expensive video teams only to row back after interest waned.
Back then, media companies (incorrectly) assumed that:
- Video was, or very soon would be, the format preferred by the majority
- It would be easy and cheap to make short-form videos that the audience would enjoy
- The visual nature of the format would lead to naturally higher engagement rates
These weren’t true then, but they are now. In a dedicated article, we have covered the power of video in 2024 and why publishers must make this shift. Technological innovations, changing formats and production styles, and cultural shifts pioneered by social media content creators over the past two decades have created the perfect conditions for video.
A few years ago, video production was prohibitively expensive and required dedicated teams. Today, streamlined workflows leverage the power of Generative AI to produce video content significantly faster and cheaper.
AI-based services like Aeon, for example, help publishers convert their text-based articles to videos using AI workflows that follow their brand guidelines, match their style, and retain their journalistic voice. With Aeon, publishers can:
- Fine-tune voices to match their brand
- Effortlessly clip and crop videos for social media
- Introduce real emotion into text-to-speech videos
- Use AI-powered music integration to augment content appeal
- Automatically caption videos with a brand's style and more
Democratized access to such technologies allows publishers to scale efficiently and adopt a low-risk pivot-to-video strategy that expands the reach of existing assets (such as text-based articles) and increases engagement rates without needing dedicated teams or specialized skills.
Implement Subscription Models
The global subscription economy market size is projected to be $1.5 trillion in 2025, up from $650 billion in 2020. An average American has 4.5 subscriptions and spends $924 per year, with streaming being the most popular type of subscription.
According to the Reuters Institute Digital News Report 2024, with publishers pushing digital subscriptions, payment levels have almost doubled since 2014, from 10% to 17%. However, following a significant bump during the pandemic, growth has slowed.
Source: Reuters Institute Digital News Report 2024
These subscription numbers may look encouraging, but let's face it: not every publisher provides enough value to warrant a subscription, and in an age of subscription fatigue the market is not large enough to support the number of publishers on the web today.
However, subscriptions need not mean paywalls like those of FT or NYT, which employ fairly tough, restrictive strategies. Publishers can experiment with other models, such as:
- Freemium: divide content into free and premium, with premium referring to the articles blocked by a paywall and reserved for subscribers only.
- Metered: give readers a quota of content to access for free before being blocked by the paywall.
- Dynamic: adapt the paywall to the user's profile or context to balance frustration and engagement. AI-based models can accelerate digital subscription acquisition, pricing, and retention.
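The metered model in the list above reduces to a small piece of counting logic. Here is a minimal in-memory sketch: the quota of three articles per month, the `MeteredPaywall` class, and the reader and article identifiers are all assumptions for illustration, and a real system would persist the counts and tie them to accounts or cookies.

```python
from collections import defaultdict
from datetime import date

FREE_ARTICLES_PER_MONTH = 3  # hypothetical quota


class MeteredPaywall:
    def __init__(self, quota=FREE_ARTICLES_PER_MONTH):
        self.quota = quota
        # (reader_id, year, month) -> set of article ids already read
        self.reads = defaultdict(set)

    def can_read(self, reader_id, article_id, is_subscriber=False, today=None):
        """Return True if the reader may open the article."""
        if is_subscriber:
            return True
        today = today or date.today()
        seen = self.reads[(reader_id, today.year, today.month)]
        if article_id in seen:
            return True  # re-reading does not use up the quota
        if len(seen) >= self.quota:
            return False  # quota exhausted: show the paywall
        seen.add(article_id)
        return True


paywall = MeteredPaywall()
for article in ["a1", "a2", "a3", "a4"]:
    print(article, paywall.can_read("reader-42", article))
```

A dynamic paywall could reuse the same structure and replace the fixed quota with a per-reader value.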
License Content to AI Companies
This is the "if you can't beat them, join them" approach. We have seen a lot of traction in this area. OpenAI entered licensing agreements to feed its models with content from the Financial Times, Axel Springer, and more. Google cut a deal with Reddit for AI training data.
According to a leaked deck obtained by ADWEEK, OpenAI has been pitching partnership opportunities to news publishers through its Preferred Publishers Program initiative since July 2023.
Accurate and well-written news and content will remain the most valuable sources for training AI models. AI companies have been using human-created content without permission; however, the risk of legal action is now pushing them to negotiate business agreements with publishers. This approach, however, remains contested, and not all publishers are open to it.
What to Expect?
Regulations Around AI and Data Scraping
Answer engines depend on data collected from publishers to generate answers for their end users. We have already seen many lawsuits in the United States and the United Kingdom in which publishers have accused AI companies of using copyrighted data without permission.
Last month, Forbes discovered that Perplexity was dodging its paywall to provide a summary of the publication's investigation into former Google CEO Eric Schmidt’s drone company. The content was protected behind a hard paywall, yet Perplexity dodged it and barely cited Forbes. It even ganked the original art to use for its report.
Similarly, Wired has accused Perplexity of using a "secret IP address" to access content not meant for AI. To add insult to injury, Perplexity plagiarized Wired’s article about it, even though Wired explicitly blocks Perplexity in its robots.txt file.
“Perplexity is not ignoring the Robot Exclusions Protocol and then lying about it,” said Perplexity cofounder and CEO Aravind Srinivas in a phone interview with Fast Company. “We don’t just rely on our web crawlers; we rely on third-party web crawlers as well.” Srinivas, however, did not name the third-party provider, citing a nondisclosure agreement.
AI innovators and copyright laws are on a collision course, particularly in the U.S. Neither side has satisfactory technological or legal solutions to enable AI-generated products to move forward. AI firms have poured billions of dollars into research and development, but copyright law has not kept pace with the technology.
The current copyright law framework, the Copyright Act of 1976, dates back to a time when mainframes and minicomputers ruled the IT world, and the internet and AI were more the subjects of science fiction than computer science.
On May 16th, a U.S. Senate committee released a 31-page report designed to serve as a roadmap for regulating AI. Despite a year of research, public seminars, and dozens of conversations with tech CEOs and academic researchers, the report failed to concretely address several important issues, such as the future of copyright law, regulation of AI models and training data, and issues surrounding open source AI.
In Europe, the newly introduced EU AI Act is set to enter into force in Q2-Q3, 2024, with transition periods for complying with various requirements ranging from 6 to 24 months.
These regulations will eventually shape the developments around AI applications like answer engines and AI chatbots, which may change the content on the web as we know it today.
AI Answer Engines May Not Be Sustainable
When users click on links, Google incurs almost no cost. Yet, advertisers must bid on the cost per click, providing Alphabet with huge profit margins. Generative AI shifts this model.
First, the results cost more because AI-related Q&A uses significantly more computing power than search queries. Second, they provide answers, not links, hence less granularity for advertisers.
If Alphabet were to abandon its search product for a Perplexity-like one, its costs would rise, revenues would plummet, margins would suffer, and investors would head for the hills. This is where Perplexity, with no profits to jeopardize, hopes to find its footing.
Perplexity has yet to find a dependable monetization strategy. It has a subscription model and plans to introduce brand-sponsored questions soon. How this pans out remains to be seen.
Video May Become The Dominant Content Format
Several trends suggest that video may become the dominant format. We may soon see search engines like Google crawl and index video transcripts on platforms like TikTok to provide a result summary, particularly for question-led content.
The popularity of short-form, "reel" style videos on TikTok and Instagram has surged in recent years. These quick, consumable clips are generally 15-60 seconds long, employing bright, eye-catching captions, and are very effective in stopping the scroll to catch users' attention.
Live streaming and webinars are proven means of establishing authority. Done well, they are highly effective formats that pack value-rich information, allow for direct engagement, and create solutions for common pain points.
Amid all the noise online, there is a push for more personalization and authenticity. For this reason, video content will become even more valuable and important in marketing. Video allows people to put their faces in front of an audience, giving a targeted audience someone to connect with, remember, and trust.
Source: Forbes Advisor
In news media, most organizations plan to produce more videos, podcasts, and newsletters in 2024, and broadly the same number of text articles, according to a Reuters study:
Source: Reuters, Journalism, Media, And Technology Trends And Predictions 2024
This is also closely linked to monetization shifts. Formats that drive engagement and loyalty and support subscription models have become more valuable (short-form video is an exception, where investment is a response to the need to attract younger audiences).
Summary
AI answer engines and chatbots potentially threaten traditional revenue models by diminishing website traffic from search engines. Publishers can mitigate this impact by optimizing their content for answer engines, safeguarding valuable content from unauthorized bot access, and transitioning to more engaging formats like video. The future of content will be shaped by technological advancements, regulatory changes around AI, and how audience preferences for content formats evolve.