The publishing industry has been claiming victory recently in a long-running disagreement with Google over how subscription content (i.e. content that sits behind a paywall or registration wall) should appear in their search results:
Google to ditch controversial ‘first click free’ policy [The Guardian]
There’s a lot of confusion around the new policy which Google has announced, and a lack of clarity in how the media, publishers, and Google itself are reporting and discussing the topic.
Google’s own announcement is typically opaque in its framing (“Enabling more high-quality content for users”) but has more than enough information for those of us who spend much of our professional lives interpreting the search engines’ moves to figure out what’s going on.
The short version is that what’s being reported as “ending” the first click free policy is really about extending it. There are some parts of the extension that publishers have asked for, but the key concession Google is demanding – that publishers label the paywalled content in a machine-readable way – will lead to further weakening of the publishers’ positions.
To run through the full analysis, I’m going to start with some background – but if you know all about the history, feel free to skip ahead to my new analysis and conclusions.
The background – what was First Click Free (FCF)
In the early days of Google, they indexed only content that was publicly-available on the open web to all users and crawlers. They did this by visiting all pages on the web with their own crawler – named Googlebot. At various points, they encountered behaviour that they came to label cloaking: when websites showed different content to Googlebot than to everyone else. This was typically done to gain a ranking advantage – for example to stuff a load of text onto a page containing words and phrases that didn’t appear in the article users were shown with the objective of appearing in searches for those words and phrases.
Google disliked this practice both because it messed up their index, and – the official line – because it resulted in a poor user experience if someone clicked on one of these articles and then discovered content that was not relevant to their search. As a result, they declared cloaking to be against their guidelines.
In parallel, publishers were working to figure out their business models on the web – and while many went down the route of supporting their editorial business with advertising, many wished to charge a subscription fee and allow only paying customers to access their content.
The conundrum this presented was in acquisition of those customers – how would people find the paywalled content? If Googlebot was blocked at the paywall (like all other logged-out users) – which was the only legitimate publisher behaviour that wasn’t cloaking – then none of those pages would rank for anything significant, as Googlebot would find no real content on the page.
Google’s solution was a program they called First Click Free (FCF) which they rolled out first to news search and then to web search in 2008. This policy allowed publishers to cloak legitimately – to show Googlebot the full content of pages that would be behind the paywall for regular users by identifying the Google crawler and specifically treating it differently. It allowed this behaviour on the condition that the publishers allow any user who clicked on a Google search result to access the specific article they had clicked through to read whether they had a subscription or not. After this “first click” which had to be free, the publisher was welcome to enforce the paywall if the user chose to continue to request subsequent pages on the site.
Problems with First Click Free and the backlash
The biggest problem with FCF was that it created obvious holes in publishers’ paywalls and led to the open secret that you could access any article you wanted on many major newspaper sites simply by googling the headline and clicking through. While complying with Google’s rules, there was little the publishers could do about this (they were allowed to implement a cap, but they were required to allow at least 3 articles per day, which is more than most users read on most paywalled sites, so in practice it was no cap at all).
Many publishers began to tighten their paywalls or registration walls – often showing interstitials, adverts, or enforcing a monthly quota of “first click” articles a given user was allowed – technically leaving them cloaking in breach of Google’s guidelines, and frequently providing a poor user experience.
Publishers also began to resent more generally that Google was effectively determining their business models. While I have always been concerned about exactly what will continue to pay for journalism, I have always had little sympathy for the argument that Google was forcing publishers to do anything. Google was offering a way of cloaking legitimately if publishers were prepared to enable FCF. Publishers were always welcome to reject that offer, not enable FCF, and also keep Googlebot out of their paywalled content (this was the route that The Times took).
Earlier this year, the Wall Street Journal pulled out of FCF, and reportedly saw a drop in traffic, but an increase in subscriptions.
The coverage has been almost-exclusively describing what’s happening as Google ending the FCF program whereas it really sounds more like an expansion. Whereas before Google offered only one legitimate way of compensating for what would otherwise be cloaking, they are now offering two options:
- Metering – which includes the option previously called FCF – requires publishers to offer Google’s users some number of free clicks per month, with the number set at the publisher’s own discretion, and now also allows publishers to limit how many times a single user gets free content after clicking through from Google
- Lead-in – which shows users some section or snippet of the full article before requiring registration or payment (this is how thetimes.co.uk implements its paywall at the moment – so under the new rules they would now legitimately be able to allow Googlebot access to the full normally-paywalled content subject to my important notes below)
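To make the difference between the two options concrete, here is a minimal sketch of how a publisher's server might decide what a visitor sees under each one. Everything in it is an assumption for illustration: the `Visitor` class, the quota of 3 free articles per month, the 50-word lead-in, and the function names are all hypothetical, and none of them come from Google's announcement.

```python
from dataclasses import dataclass, field

# Hypothetical values: the monthly quota and lead-in length are the publisher's choice.
FREE_ARTICLES_PER_MONTH = 3
LEAD_IN_WORDS = 50
PAYWALL_NOTICE = "[paywall] Subscribe to keep reading."


@dataclass
class Visitor:
    is_subscriber: bool = False
    # IDs of articles already read for free this month
    # (in practice this would live in a cookie or a session store).
    free_reads_this_month: set = field(default_factory=set)


def metered_view(visitor: Visitor, article_id: str, body: str) -> str:
    """Metering: serve the full article until the monthly quota is used up."""
    if visitor.is_subscriber or article_id in visitor.free_reads_this_month:
        return body
    if len(visitor.free_reads_this_month) < FREE_ARTICLES_PER_MONTH:
        visitor.free_reads_this_month.add(article_id)
        return body
    return PAYWALL_NOTICE


def lead_in_view(visitor: Visitor, body: str) -> str:
    """Lead-in: non-subscribers only ever see the opening words."""
    words = body.split()
    if visitor.is_subscriber or len(words) <= LEAD_IN_WORDS:
        return body
    return " ".join(words[:LEAD_IN_WORDS]) + " ... " + PAYWALL_NOTICE
```

In both cases Googlebot would still be served the full article; the sketch only covers what human visitors get.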
Google is imposing a critical new condition
However, both of these options come with a new limitation: in order to take part in the expanded scheme they now call Flexible Sampling, publishers must mark up content that will be hidden from non-subscribers using machine-readable structured markup called JSON-LD. Structured markup is a machine-readable way of providing more information and context about the content on a page – and in this case it enables Google to know exactly which bits of content Googlebot is getting to see only because it’s Googlebot (and the publisher is engaging in Flexible Sampling) and what will actually be visible to users when they click through.
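As a rough illustration of what this markup looks like, the sketch below builds a JSON-LD block using the schema.org properties that Google's paywalled-content documentation describes (`isAccessibleForFree`, `hasPart`, `cssSelector`). The helper name `paywall_json_ld`, the example headline, and the `.paywall` class are placeholders of mine, so treat this as a sketch under those assumptions and check the current documentation before relying on it.

```python
import json


def paywall_json_ld(headline: str, paywalled_selector: str) -> str:
    """Return a <script> tag declaring which part of the page is paywalled.

    `paywalled_selector` is a CSS selector (for example ".paywall") matching
    the element that only subscribers and Googlebot get to see.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": headline,
        # The article as a whole is not free to read.
        "isAccessibleForFree": "False",
        # The specific page element that sits behind the paywall.
        "hasPart": {
            "@type": "WebPageElement",
            "isAccessibleForFree": "False",
            "cssSelector": paywalled_selector,
        },
    }
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Example with placeholder values.
print(paywall_json_ld("Example headline", ".paywall"))
```

The output would be placed in the article page, with the paywalled section wrapped in an element that the selector matches.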
And here’s the rub.
This new requirement is listed clearly in Google’s announcement but is getting little attention in the mainstream coverage – probably because it’s a bit technical, and because it isn’t obvious what difference it makes to publishers beyond a bit of development work(*).
To me, though, this requirement screams that Google wants to do the same things they’ve done with other forms of structured markup – namely:
1. Present them differently in the search results
2. Aggregate and filter them
(*) Incidentally, the technical requirement that the JSON-LD markup declare the CSS selector for the paywalled content is one that we at Distilled predict is going to present maintenance nightmares for many publishers – it essentially means that any time a publisher makes a visual change to the user interface on any of their article pages, they’re going to need to check that they haven’t broken their compliance with the new Flexible Sampling program. These are often considerations of different teams, and it is very likely that many publishers will accidentally break this regularly in ways that are not obvious to them or their users. It remains to be seen how Google will treat such violations.
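One way to guard against this kind of silent breakage is an automated check, run whenever templates change, that the selector declared in the JSON-LD still matches something on the page. The sketch below is a simplified illustration using only the Python standard library. It assumes the JSON-LD is a single object with a `hasPart` entry, and it only understands plain class selectors such as `.paywall`; the function name `missing_paywall_selectors` is hypothetical.

```python
import json
from html.parser import HTMLParser


class _PageScan(HTMLParser):
    """Collects every class name used on the page and the raw JSON-LD blocks."""

    def __init__(self):
        super().__init__()
        self.classes = set()
        self.json_ld_blocks = []
        self._in_json_ld = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        self.classes.update((attrs.get("class") or "").split())
        if tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_json_ld = True
            self.json_ld_blocks.append("")

    def handle_data(self, data):
        if self._in_json_ld:
            self.json_ld_blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_json_ld = False


def missing_paywall_selectors(html: str) -> list:
    """Return declared class selectors that no element on the page carries."""
    scan = _PageScan()
    scan.feed(html)
    scan.close()
    missing = []
    for block in scan.json_ld_blocks:
        parts = json.loads(block).get("hasPart", [])
        if isinstance(parts, dict):
            parts = [parts]
        for part in parts:
            selector = part.get("cssSelector", "")
            # Only simple ".class" selectors are handled in this sketch.
            if selector.startswith(".") and selector[1:] not in scan.classes:
                missing.append(selector)
    return missing
```

An empty list means every declared selector still matches an element. A non-empty list is the signal that a front-end change has drifted away from the markup.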
1. I’m convinced Google will label paywalls in the search results
My thinking here is that:
1. Hard paywalls are already labelled in Google News
2. Many other forms of structured markup are used to change the display in the search results (probably the most obvious to most users is the ratings stars that appear on many product searches – which come from structured markup on the retailers’ sites)
3. Especially in the case of a hard paywall with only a snippet accessible to most users, it’s a pretty terrible user experience to land on a snippet of content and a signup box (much like you see here if you’re not a subscriber to The Times) in response to most simple searches. Occasionally a user might be interested in taking out a new subscription – but rarely to read the single article they’re searching for right now
Point 3 is the most critical (1 & 2 simply show that Google can do this). Given how many sites on the web have a paywall, and how even the most engaged user will have a subscription to a handful at most, Google knows that unlabelled hard paywalls (even with snippets) are a terrible user experience the majority of the time.
I fully expect therefore to see results that look something like this:
- Allow them to offer a scheme (“flexible sampling”) that is consistent with what publishers have been demanding
- Let publishers claim a “win” against big, bad Google
- Enable the cloaking that lets Googlebot through even hard paywalls (all but the most stringent paywalls have at least a small snippet for non-logged-in users to entice subscriptions)
- Avoid having to remove major media sites from the search results or demote them to lower rankings
- And yet, by labelling them clearly, get to the point that pretty much only users who already have a subscription to a specific site ever click on the paywalled results (the number of subscriptions you already have is small enough that you are always going to remember whether you have access to any specific site or not)
My prediction is that the end result of this looks more like what happened when the WSJ pulled out of FCF – reportedly good for the WSJ, but likely very bad for less-differentiated publishers. And pulling out of FCF is something publishers could already do. In other words, publishers have gained very little in this deal, while Google is getting them to expend a load of energy and development resource carefully marking up all their paywalled content so that Google can flag it clearly in the search results. (Note: Google VP for News, Richard Gingras, has already been hinting at some of the ways this could happen in the Google News onebox).
2. What does aggregation look like?
Once Google can identify paywall content at scale across the web (see the structured markup information above) they open up a number of interesting options:
1. Filtering subscription content out of a specific search and seeing only freely-available content
2. Filtering to see only subscription content – perhaps from user-selected publications (subscriptions you own)
   - Possible end-game: connecting programmatically to subscription APIs in order to show you free content and content you have already got a subscription for, automatically
3. Offering a bundle (Chris Dixon on why bundles make economic sense for both buyers and sellers). What if you could pay some amount that was more than a single subscription, but less than two, that got you access to 5 or 6 major media sites? It’s very likely that everyone (except publishers outside the bundle!) would be better off. Very few players have the power to make such a bundle happen. It’s possible that Google is one of those players.
Under scenario #3, Google would know who had access to the bundle and could change the display in the search results to emphasise the “high quality, paid” content that a particular searcher had access to – in addition to the free content and other subscription sites outside the bundle. Are we going to see a Spotify for Publishers? We should all pay close attention to the “subscription support” tools that Google announced alongside the changes to FCF. Although these are starting with easy payment mechanisms, the paths to aggregation are clear.
Ben Thompson has been writing a lot recently about aggregators (that link is outside his paywall – a subscription I wholeheartedly recommend – I look forward to seeing his approach to the new flexible sampling options on his own site, as well as his opinions). Google is the prototypical super aggregator – making huge amounts of money by aggregating others’ work with effectively zero transaction costs on both the acquisition of their raw materials and their self-service sale of advertising. Are they about to aggregate paid subscription content as well?
Publishers are calling this a win. My view is that the new Google scheme offers:
- Something that looks very like what was in place before (“metering”)
- Something that looks very like what pulling out of FCF looked like (“lead-in”)
And demands in return a huge amount of structured data which will cement Google’s position, allow them to maintain an excellent user experience without sending more traffic to publishers, and start them down a path to even more aggregation.
If paywalls are to be labelled in the search results, publishers will definitely see a drop in traffic compared to what they received under FCF. The long-term possibility of a “Spotify for Publishers” bundle will likely be little solace in the interim.
Are you a publisher?
If you’re wondering what you have to do as a result of these changes, or what you should do as the landscape shifts, don’t hesitate to get in touch and we will be happy to discuss your specific situation.