For much of the past year, publishers have been playing defense against AI scraping and copyright uncertainty. But heading into 2026, some see reasons to believe the ground is finally starting to move a little more in their favor.
The Financial Times was the first U.K.-based publisher to strike a licensing deal with OpenAI in 2024. It has yet to agree to terms with another consumer LLM, but Matt Rogerson, FT’s director of global public policy and platform strategy, believes 2026 will bring a kind of reset as big tech companies alter their stance on AI licensing to avoid future legal risk. And he believes AI scraping is reaching a new phase.
“Every publisher has spent the last two years trying to close down all the loopholes, or perceived loopholes, in their website securities,” he said. “There are still gaps. There’s still really no big enough stick to stop entrepreneurs from using scraping-for-hire platforms to try and get behind paywalls and then scrape content from publisher sites. But I think that the net is tightening [around AI scraping].”
Digiday spoke with Rogerson about what’s changed — and why publishers may be entering a more constructive phase when it comes to AI remuneration.
We distilled some of his areas to watch in 2026 into an annotated Q&A. Answers have been lightly edited for clarity and flow.
On emerging AI-licensing revenue streams
Rogerson: “You’re starting to see an increasing number of institutions and [corporate] companies that are taking AI summarization licenses. They know that the materials inside AI models — for it to be valuable to them and their businesses — it has to be top quality content and has to be accurate, and it has to be from brands that they can see, know and trust. So I think those moves towards B2B licensing we’re seeing are really, really positive.”
Digiday: Enterprise licensing for retrieval-augmented generation (RAG) is still a relatively untapped market for publishers, though the FT, The Economist and the Associated Press have all allowed corporate enterprise clients that operate private LLMs to access their content via API. While the revenue is nascent, all three publications are closely eyeing rising demand from the enterprise RAG sector as a pot of recurring revenue for the future.
It’s unlikely to replace ad revenue any time soon, but corporate enterprises are increasingly hungry for trusted, accountable content they can use and reuse safely without sharing the data outside their own private LLMs, a demand publishers like the FT are starting to see as a real growth lever.
On bring-your-own AI licenses as a burgeoning trend
Rogerson: “The idea of bring-your-own license is really attractive, certainly when we’ve spoken to [U.K. parliament] ministers about this. I think the way we see the evolution is away from models being judged on how much they can steal, to what their inherent capabilities are, and how they should be judged on those.
“I think how we activate that as a bring-your-own license marketplace is really interesting, both for us and the B2B licensing space, but also from a consumer perspective. So if you can connect your FT subscription with an LLM of your choice, that becomes quite exciting, because it brings licensed content to those products…that’s where we see the market going.”
Digiday: BYOL (bring your own license) is a standard industry term among companies that build and sell software. It’s less used among publishers. Think of it like this: instead of the AI developer paying the publisher, as it would with a RAG license, the enterprise buyer brings its own pre-authorized rights to use that specific content within its private AI environment. Rogerson is referring to how it could also be used by news publications’ paying subscribers, not just enterprises. So if you already subscribe to a news site, an AI assistant could check that and then safely use that paid content to answer your questions, without breaking the paywall or cutting the publisher out.
Rogerson: “What we’ve seen over the last three to six months is very large companies like Microsoft developing paid marketplaces for grounding, and they sent very clear signals at publishing conferences that they see real commercial value in the content and the IP that we produce. And that they want to think about how they develop those paid marketplaces so they become sustainable and so that they can be used, not just by Copilot, but also other large language models that use your content for grounding.
“So that’s a big line in the sand. A big change in position. And other companies are also looking at similar things, like Meta recently having done AI licensing deals, Amazon is also doing deals around commercial marketplaces. So, those are really interesting developments where you can see that they’re changing tack quite significantly.”
Digiday: It’s been a hell of a few years for publishers that feel they’ve been fleeced by companies scraping for training data. With demand for real-time queries still high, more of the big-tech companies are at least starting to show signs of positive intent with regard to copyright. There are a few caveats: no publisher yet believes these licensing deals make up for the traffic lost as AI summaries drive down click-through rates, or for the wild west of data scraping that has happened under AI companies’ “fair use” claims over the last few years. And publishers generally are divided over the intentions of Meta, Microsoft and every other AI company. But for now, the tide is moving in a better direction.
Rogerson: “We’ve already got long-standing relationships with Google, so I’d be surprised if we weren’t in conversations with them on how that evolves. Meta, I’m not aware of any conversations. Meta is obviously changing its approach, going from its open-weights model Llama to creating proprietary models in the same way that OpenAI has. They’ve also brought a lot of new people in, and they’re also facing quite a lot of court cases around how they’ve used copyright material in the past. So there are ongoing court cases in the U.S. around pirate libraries and unlawful access to content.
“I think there’s a misnomer that was put around over the past two years in the U.S. that everything is fair use and there are no consequences to anything that’s happened. I think you’re seeing through some of the cases that flow through at the moment, like Bartz v. Anthropic, even where a judge believes that there might be a claim that some of the use of that content is fair use, the unlawful access element means that Anthropic is on the hook for $1.5 billion.
“So I think those [court cases] will sharpen minds about how they’ll actually access that content in order to provide responses to queries.”
Digiday: He’s referring to the fact that while it’s encouraging to see Meta and others strike a more constructive tone on AI licensing, those moves are also likely aimed at reducing future legal risk as copyright dynamics begin to shift. Rogerson also said that the quality and transparency of data shared with the FT by OpenAI has been useful, whereas there is still opacity on what publishers can see via Google’s Search Console, which he believes all publishers would like to see change.
On Google’s AI and search crawler separation
Rogerson: “One of the things the CMA [Competition and Markets Authority] is looking at is whether Google should have to divide its scraper between a scraper for the search index and then a scraper for AI-related activities. And the EU announced a similar sort of investigation where they looked at: should they be able to scrape once and then use that for multiple purposes.
“Given that they have the most data of any company, I think being able to opt out of uses of data for different purposes is absolutely essential. And that will be a big test for the CMA in terms of its first set of conduct requirements…does it [Google] provide that kind of optionality for IP owners to determine with absolute clarity how their content is used. I think if you did that, then you’d start to see the market would develop more clearly, because users of content would be on more of a level playing field, and no one company would have a more significant advantage because of their size.”
Digiday: This is something all publishers will likely watch with interest. Typically, if publishers use the Google-Extended robots.txt token to block AI training, their content can still be used to generate live AI answers if it remains indexed for search. To completely stop Google’s AI from using their data, they would have to block Googlebot entirely, which effectively removes them from Google Search results and eliminates their primary traffic source. These concerns led to a wave of legal and regulatory actions in 2025, including a formal antitrust investigation by the European Commission into Google’s use of publisher content for generative AI features.
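For readers who want to see the trade-off in concrete terms, here is a minimal sketch of the kind of robots.txt rules a publisher might set, read with the robots.txt parser in Python's standard library. The rules, the `publisher.example` URL and the `can_fetch` helper are illustrative assumptions, not any publisher's actual configuration, and the sketch only shows how a standards-following parser interprets the two tokens. It says nothing about how Google handles content once it has been crawled for search.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of the Google-Extended token (AI training)
# while leaving Googlebot (search indexing) fully allowed.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: Googlebot
Allow: /
"""


def can_fetch(agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the rules in robots_txt let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Placeholder article URL, used for illustration only.
ARTICLE_URL = "https://publisher.example/markets/story"

training_allowed = can_fetch("Google-Extended", ARTICLE_URL)
search_allowed = can_fetch("Googlebot", ARTICLE_URL)

print(f"Google-Extended allowed: {training_allowed}")
print(f"Googlebot allowed: {search_allowed}")
```

Under these rules the page stays open to the search crawler, which is exactly the gap described above: as long as the content remains in the search index, it can still surface in live AI answers. Closing that gap with robots.txt alone would mean disallowing Googlebot as well, at the cost of search visibility.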