Publishers are finally getting a fighting chance against AI bots – but the battle is far from over.
Media Briefing – Digiday+
Every Thursday at 10 a.m. ET, Digiday+ members receive a curated roundup of the hottest media trends. Want more from the series? Check out https://digiday.com/series/media-briefing/
This week’s focus: the classic catch‑22 of blocking AI crawlers versus opening the door to licensing deals
Publishers are stuck in a perpetual dilemma: should they slam the digital doors on AI crawlers to protect their content, or leave a crack open in hopes of striking licensing and visibility agreements? The tension is palpable in every working group that brings together publishers and large language model (LLM) providers – from the IAB Tech Lab’s recent session to countless private roundtables. There’s no universal playbook, and the stakes are high because the decision isn’t just ideological; it’s a lever of bargaining power.
- Block the bots → you gain leverage in negotiations, but risk disappearing from AI‑driven discovery.
- Leave them in → you stay visible, yet you may be giving away the very value you hope to monetize.
That’s the classic catch‑22. But recent moves by CDN giants and the rise of AI marketplaces are reshaping the rules of the game, giving publishers new tools to tilt the balance in their favor. Publishers still don’t hold the upper hand, but their outlook is noticeably brighter than it was a year ago.
Tech partners are finally adding friction to the AI scraping process
Not long ago, a publisher’s only line of defense against unwanted AI scraping was a polite robots.txt file – a suggestion that most crawlers ignored. Add to that Google’s overlapping AI and search crawlers, and publishers found it nearly impossible to opt out of AI overviews without jeopardizing their search rankings.
Enter Cloudflare. In July, the CDN announced that publishers could default‑block AI bots at the edge. The move sparked a flurry of internal discussions among the biggest LLM providers, according to a publishing executive who asked to remain anonymous. Suddenly, publishers had a tangible way to say, “You can’t have my content unless you pay.”
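To make the difference concrete, here is a minimal Python sketch contrasting the two approaches. It is an illustration under stated assumptions, not Cloudflare's implementation: the user agent "ExampleAIBot", the sample robots.txt, and the blocklist are all hypothetical, and real edge bot detection relies on far more than user-agent strings. The first function is the check a well-behaved crawler runs on itself; the second is a decision the server side makes whether or not the crawler cooperates.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; "ExampleAIBot" is an illustrative name, not a real crawler.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical blocklist an edge layer might hold (lowercase tokens).
BLOCKED_AGENT_TOKENS = ("exampleaibot",)


def crawler_may_fetch(agent_token: str, url: str) -> bool:
    """The voluntary check: a polite crawler asks robots.txt before fetching.

    Nothing forces a crawler to call this or to honor the answer.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent_token, url)


def edge_status(user_agent_header: str) -> int:
    """The enforced check: the server side refuses the request outright.

    Returns an HTTP status code: 403 for a blocked agent, 200 otherwise.
    """
    header = user_agent_header.lower()
    if any(token in header for token in BLOCKED_AGENT_TOKENS):
        return 403
    return 200


if __name__ == "__main__":
    url = "https://example.com/article"
    print(crawler_may_fetch("ExampleAIBot", url))
    print(edge_status("Mozilla/5.0 (compatible; ExampleAIBot/1.0)"))
```

A crawler that skips the first check still gets the page; one that hits the second gets nothing, which is why edge blocking changes the negotiation.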
Trusted Media Brands (TMB) is already using Cloudflare’s bot‑blocking tools alongside TollBit’s AI‑specific filters. While some sophisticated bots still slip through, these “gates” give TMB a stronger negotiating position when discussing licensing with LLMs. As VP of Business Development Jacob Salamon put it:
“It gives us a legal footing. If a major settlement ever occurs, we can point to the rates we set and argue we were bypassed, opening up a solid claim.”
The landscape is shifting fast. Three publishers we spoke with (who asked to stay unnamed) now juggle four or five LLM partners each – a stark contrast to the single‑partner model that was the norm just months ago. Cloudflare’s actions aren’t the only reason; LLM providers have also become more willing to discuss compensation for “retrieval‑augmented generation” (RAG) usage since the July default‑block announcement.
“There’s a collective deference to Cloudflare here,” says Justin Wohl, VP of Strategy at Aditude and former CRO of Salon. “Their scale lets them speak for the technology’s needs better than any individual publisher could.”
Robots.txt gets a makeover
In the past month, Cloudflare rolled out Content Policy Signals – an upgrade to robots.txt that lets publishers declare whether their material can be used for search, training, or AI inputs. Coupled with the Really Simple Licensing (RSL) protocol, publishers can embed machine‑readable licensing terms directly into the file, specifying not just access rules but also post‑scrape usage conditions.
Sure, a determined crawler can still ignore the file, but the move represents a step toward legally enforceable parameters. As one anonymous media‑group exec noted:
“Cloudflare is trying to create enough friction at the CDN level to give us leverage. The LLMs have the firepower to bypass most mechanisms, but the goal is to make the cost of ignoring us higher than paying us.”
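For readers who want to see what machine-readable signals in robots.txt might look like to a crawler, here is a rough Python sketch. It assumes a directive of the form `Content-Signal: search=yes, ai-train=no, ai-input=no`; that syntax and the key names are an assumption based on the three uses described above, so check Cloudflare's published policy before relying on them. The function name is hypothetical, and the parser deliberately ignores per-user-agent grouping and any RSL licensing terms.

```python
# Sample file using the ASSUMED directive syntax; not copied from any real site.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Content-Signal: search=yes, ai-train=no, ai-input=no  # assumed syntax
Allow: /
"""


def parse_content_signals(robots_txt: str) -> dict[str, bool]:
    """Collect yes/no content signals from every Content-Signal line.

    Simplification: signals are merged across the whole file rather than
    being tracked per User-agent group.
    """
    signals: dict[str, bool] = {}
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "content-signal":
            continue
        for pair in value.split(","):
            key, eq, flag = pair.partition("=")
            if eq:
                signals[key.strip().lower()] = flag.strip().lower() == "yes"
    return signals


if __name__ == "__main__":
    print(parse_content_signals(SAMPLE_ROBOTS_TXT))
```

As with robots.txt itself, reading these signals is the crawler's choice; the file states the publisher's terms but does not enforce them.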
LLMs are feeling the pressure to play nice
Big AI players are scrambling to secure content deals. Amazon now boasts an extensive roster of licensing partners for Alexa+ and its shopping assistant Rufus. Google is reportedly in talks with news groups, while Meta has opened its own licensing conversations. The catalyst? A growing realization that after the initial wave of web‑wide scraping, model performance is plateauing.
Brooke Hartley Moy, CEO of Infactory (which helps publishers monetize archives via AI‑ready APIs), explains:
“LLMs have hit a quality ceiling. They can’t keep improving without high‑quality human‑generated content. Free‑for‑all scraping just isn’t enough to drive the next leap.”
In other words, AI builders now see publishers as partners, not free data farms. The sentiment is echoed by two publishers who recently spoke with Digiday: Microsoft’s nascent AI content marketplace is deliberately positioning itself as a fair‑pay platform for a select group of publishers. The message is clear: you should be compensated for the value of your intellectual property.
Is Microsoft being altruistic? Probably not. As Wohl points out, Microsoft lost the search battle to Google and doesn’t want to lose the AI battle either. By building a marketplace that relies on premium content, it hopes to gain a competitive edge.
“Microsoft’s struggles will be taken seriously, especially after Bing Chat and Copilot failed to capture significant market share versus ChatGPT,” Wohl adds.
What the next six months could look like
Expect a blur of standards, protocols, and infrastructure tweaks as companies race to define the economic model for the open web. Publishers, already outgunned in tech resources, are scrambling for every advantage.
“We’re outgunned on technology, resources, and engineering,” admits a senior publisher executive. “Big tech is chasing trillion‑dollar markets. If we don’t provide value to them, our very viability is at risk.”
What we’ve heard from the front lines
“It’s in the millions… 10‑30% of our traffic now comes from bots.” – Head of Business Development, unnamed publisher, discussing weekly AI‑driven traffic.
Numbers that matter
- 26.7% – The share of The Economist that’s up for sale by philanthropist Lynn Forester de Rothschild (source: Axios).
- $150 million – The price Paramount paid to acquire Bari Weiss’s Free Press (source: Wall Street Journal).
- 500 – Guardian staff members who convened to tackle a “transformative project” aimed at countering AI and referral‑traffic challenges (source: Press Gazette).
- Tens of millions – Queries handled by The Washington Post’s chatbot (source: Press Gazette).
What we covered this week
Publishers and tech giants push weekly talks on AI content use
- Amazon, Meta, Microsoft, Google, and 35 publishers gathered at the IAB Tech Lab’s LLM working group in NYC last Thursday.
- The group has moved to weekly meetings to hammer out standards for how AI should use and pay for content.
Read more: https://digiday.com/media/from-walls-to-frameworks-publishers-and-tech-giants-push-weekly-talks-on-ai-content-use/
AWNY 2025: Creators become the industry’s new power brokers
- Advertising Week highlighted the rise of content creators with four dedicated creator tracks – the most ever.
Read more: https://digiday.com/media/advertising-week-briefing-creators-emerge-as-the-industrys-new-power-brokers/
Trusted Media Brands: At the table but holding back
- TMB is negotiating AI licensing deals but is holding off on signing contracts until the scope of access is narrowed.
Read more: https://digiday.com/media/in-the-ai-dealmaking-rush-trusted-media-brands-is-at-the-table-but-holding-back/
The Economist’s post‑search, AI‑driven revenue plan
- The publication is investing in formats harder for machines to replicate (video, audio) and is taking a hard line against licensing deals with rival AI firms.
Read more: https://digiday.com/media/inside-the-economists-plan-to-grow-revenues-in-a-post-search-ai-driven-future/
What we’re reading
- Perplexity launches a publisher revenue‑share program – CNN, Condé Nast, Fortune, LA Times, and The Washington Post have signed up for a share of subscription revenue from Perplexity’s AI‑powered browser, Comet (Press Gazette).
- Washington Post exec editor on newsroom overhaul – Matt Murray discusses staff turnover, the overhaul’s challenges, and Jeff Bezos’s role (Status).
- The Root returns to Black ownership – Democratic strategist Ashley Allison acquires The Root from G/O Media (CNN).
- Daily Mail creates two social‑publisher arms – Newmedia and Creator Media aim to become homes for social news, entertainment, and lifestyle brands (The Media Leader).
- Hearst Networks launches Hearst Canvas – A new unit focused on digital‑first brands across YouTube, audio, and emerging platforms (World Screen).
While the technical tools are improving, the real power shift will come from collective publisher action. If you think a single outlet can negotiate a fair AI licensing deal on its own, think again. The future belongs to those who band together, set standards, and demand compensation.
What do you think? Should publishers continue to block AI crawlers until a universal licensing framework lands, or is it time to embrace selective openness and negotiate on a case‑by‑case basis? Drop your thoughts in the comments – we’re eager to hear where you stand!