In a rush? Jump straight to the TL;DR section!
Reviewing our “Canonical tags via GTM” test from last year
Google’s announcement surprised us, as our previous tests had suggested that canonical tags and other SEO-relevant elements were always pulled from the rendered HTML as soon as it was available, and no longer from the HTML source document. Had Google changed the way they dealt with JS-injected canonical tags? Or had our interpretation of the previous test results been wrong?
Our canonical tag test result from last year was based on just one URL, so we had pretty weak evidence to support the claim that Google uses canonical tags that are injected with JS. The outcome of our test could have just been a big coincidence: Google might have decided to canonicalise our test URL to the target of our JS-injected canonical tag for other reasons.
This is a general problem when testing whether canonical tags work: you can only really use canonical tags between very similar pages that are likely to be canonicalised anyway. Otherwise, you risk your canonical tags being ignored altogether. But when your pages with canonical tags are canonicalised, you can’t be 100% sure that it’s really due to the canonical tags.
Note: If you’re thinking “Why don’t they just check the new GSC index coverage reports?”, please be patient and read on. We’ll get there.
In last year’s test, we injected a canonical tag into my author page on the English version of this blog, pointing to the main blog page. The author page has been canonicalised to the main blog page ever since, which can be verified by using an info: search operator for the URL in Google (screenshot from the 11th of May 2018):
The canonicalisation happened after we had injected a canonical tag with JS using Google Tag Manager, and no other author or category pages on our website had ever been canonicalised in a similar fashion (with or without JS-injected canonical tags). Was this really just a coincidence?
Setting up our new test after I/O
After Google’s announcement at I/O, we wanted to replicate this exact result for more URLs, so we used JS and GTM to inject canonical tags pointing to the main blog page into four more category and author pages on our blog. We also made sure to leave two similar pages (my German author page and Michael’s English author page) untouched, in order to have a control group of URLs that do not receive any canonical tags pointing to other pages and that thus should not be canonicalised by Google. If one of our control URLs was canonicalised to the main blog page without us injecting a canonical tag, this could mean that the test URLs were canonicalised for reasons other than the JS-injected canonical tags.
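The role of the control group can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not part of our actual test setup: the function name `interpret_test`, the outcome labels, and the example.com URLs are all assumptions made for the example.

```python
def interpret_test(test_urls, control_urls):
    """Give a cautious reading of a canonical tag test with a control group.

    Both arguments map a URL to True if Google canonicalised it to the
    target page, and False otherwise. All labels are hypothetical.
    """
    if any(control_urls.values()):
        # A control URL without an injected tag was canonicalised too, so
        # something other than the injected tags may explain the result.
        return "confounded"
    if test_urls and all(test_urls.values()):
        return "supports injected tags"
    if any(test_urls.values()):
        return "partial, keep waiting"
    return "no effect observed"


# Hypothetical observations; the URLs are placeholders, not our real pages.
test_urls = {
    "https://example.com/category-a/": True,
    "https://example.com/category-b/": True,
    "https://example.com/author-a/": False,
}
control_urls = {
    "https://example.com/author-b/": False,
    "https://example.com/author-c/": False,
}

verdict = interpret_test(test_urls, control_urls)
print(verdict)
```

The check on the control group comes first on purpose: as soon as an untouched page is canonicalised, the result for the test pages can no longer be attributed to the injected tags with any confidence.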
Here are the four new test URLs for which we set up JS-injected canonical tags on the 11th of May, one day after Google’s I/O announcement (all screenshots taken on the 11th of May).
English “SEO experiments” category page:
English “SEO for website relaunches” category page:
German “SEO für Website-Relaunches” category page:
Michael’s German author page:
Note that the last test URL above, Michael’s German author page, was canonicalised to his English author page at the time we set up the canonical tag, an issue we often see with different language versions of pages that don’t have correct hreflang annotations. At the time of the screenshot, the hreflang annotations that can now be found on the page were missing. None of the other three pages were canonicalised at the time of the screenshots and, to our knowledge, they had never been canonicalised at any time in the past.
Waiting for the results
After injecting the canonical tags, we waited impatiently. From our experience with previous tests, we knew that this needed time, as changes like these can take several months until they take effect. This is due to the fact that Google doesn’t render pages as often as it re-crawls them without rendering, and the less important a page is, the less frequently it is re-crawled, and even less frequently rendered. If you inject an SEO-relevant element like a canonical tag, hreflang annotation, or a “noindex” into the rendered HTML, but don’t include it in the HTML source document, you have to be prepared to wait for a long time before you see results.
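To make the difference between the HTML source document and the rendered HTML concrete, here is a minimal Python sketch that extracts canonical tags from both versions of a page and reports which ones only exist after rendering. The two HTML strings and the example.com URL are made up for illustration; in practice you would have to obtain the rendered HTML from a browser or a rendering tool, which this sketch does not do.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in an HTML string."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated values; a missing rel is None.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html):
    """Return all canonical targets declared in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical HTML source document: no canonical tag at all.
source_html = "<html><head><title>Author page</title></head><body></body></html>"

# Hypothetical rendered HTML after a tag manager script has run.
rendered_html = (
    "<html><head><title>Author page</title>"
    '<link rel="canonical" href="https://example.com/blog/">'
    "</head><body></body></html>"
)

source_canonicals = find_canonicals(source_html)
rendered_canonicals = find_canonicals(rendered_html)

# Canonical targets that only appear once JavaScript has been executed.
js_only = [href for href in rendered_canonicals if href not in source_canonicals]
print("source:", source_canonicals)
print("rendered:", rendered_canonicals)
print("JS-injected only:", js_only)
```

A tag that shows up in the last list is exactly the kind of element that a crawler can only see once it has rendered the page.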
Google starts canonicalising our test URLs
The first of our test pages to be canonicalised to the target of our JS-injected canonical tag was our English “SEO for website relaunches” category page. We noticed the change on the 21st of May, 10 days after injecting the canonical tag. Here is a recent screenshot of the corresponding info: search operator result:
Next, the English “SEO experiments” category page was canonicalised. This took significantly longer and we noticed the change on the 3rd of June, 23 days after injecting the canonical tag:
The German “SEO für Website-Relaunches” category page followed on the 4th of June, 24 days after injecting the canonical tag.
Our fourth test URL, Michael’s German author page, hasn’t been canonicalised yet at the time of writing (25 days after injecting the canonical tag), but we are very confident that this page will also be canonicalised to the target of the JS-injected canonical tag within the next month or so:
Update (13th of June 2018): This URL has now been canonicalised, 34 days after injecting the canonical tag.
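The waiting times for the first two canonicalised pages can be recomputed from the dates given above. The snippet below is just a small sanity check, counting the days from the injection date (11th of May 2018) to the day we noticed each change; the dictionary keys are shorthand labels chosen for this example.

```python
from datetime import date

# Day on which the canonical tags were injected via GTM.
injected = date(2018, 5, 11)

# Days on which we noticed that a test page had been canonicalised.
noticed = {
    "English 'SEO for website relaunches' category page": date(2018, 5, 21),
    "English 'SEO experiments' category page": date(2018, 6, 3),
}

# Subtracting two dates gives a timedelta; .days is the whole-day difference.
days_elapsed = {page: (day - injected).days for page, day in noticed.items()}

for page, days in days_elapsed.items():
    print(f"{page}: {days} days after injection")
```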
Before jumping to conclusions, let’s talk about some other things that you should know about this test.
“Fetch and render > Request indexing” in GSC doesn’t seem to speed things up
After injecting the canonical tags on the 11th of May, we performed a “Fetch and render > Request indexing” for all of our test URLs:
On the 22nd of May, we requested crawling and indexing for our main blog pages and all linked pages, as all of our test URLs are directly linked from the main blog pages. On the 1st of June, we performed another “Fetch and render > Request indexing” for the three remaining test URLs that hadn’t been canonicalised at that stage.
We know that “Fetch and render > Request indexing” normally has an almost immediate effect when you use it to submit a new URL. It doesn’t seem to influence the rendering of a page though, or at least not always. Our request on the 1st of June might have triggered the canonicalisation of two of our test URLs on the 3rd and 4th of June, but the one on the 11th of May certainly didn’t have an immediate effect. Another factor might be that Google has to render a page more than once before it decides to pick up an injected canonical tag.
The new Search Console index coverage report does not tell the whole truth
At the time of writing this, our test URL from last year and our new test URL that was canonicalised on the 21st of May show up as “Submitted URL not selected as canonical” in the new Google Search Console index coverage report:
The two test URLs that were canonicalised on the 3rd and 4th of June still show up as “Submitted and indexed”, because the data GSC shows is not up-to-date, but I expect them to show up as “Submitted URL not selected as canonical” within the next few days.
When Google finds a canonical tag on a page and respects it, we would expect the status of the URL in the new index report to be “Alternate page with proper canonical tag”. What is happening here?
I have a simple theory to explain this: Google Search Console’s index coverage report behaves in the exact way that Tom Greenaway and John Mueller announced at and after I/O. It ignores canonical tags that are not in the HTML source document. So it reflects the behaviour that Google officially communicates, instead of the one that the results of this test reveal.
In other words, Google Search Console knows which URLs are indexed, and it has a set of rules that it assumes Google uses for indexing. It uses these rules to generate its reports. When the rules GSC uses for its reports differ from the rules Google actually uses for indexing, the reports show false information.
This seems to be the case with JS-injected canonical tags: Google does use them to canonicalise pages, but GSC believes it doesn’t. That’s why these URLs end up being marked as “Submitted URL not selected as canonical” instead of “Alternate page with proper canonical tag”. And this might also explain why John Mueller is 100% convinced that Google does not use JS-injected canonical tags:
— John ☆.o(≧▽≦)o.☆ (@JohnMu) May 11, 2018
I’m sure John Mueller knows exactly how GSC works and I’m also sure that the GSC team wants to provide accurate information. But if there’s a misunderstanding within Google, or simply a bug (JS-injected canonical tags aren’t supposed to be used, but they are), then even official statements and reports from Google can be wrong.
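The theory about the index coverage report can be written down as a toy model. To be very clear: this is not how Google Search Console is implemented (we have no insight into that), and the function `report_status`, its parameters, and the example.com URLs are invented for this sketch. It only encodes the assumption that the report looks at the canonical tag in the HTML source document and ignores the one in the rendered HTML. The status labels are the ones quoted above.

```python
def report_status(url, source_canonical, rendered_canonical, selected_canonical):
    """Toy model of the theory: the report only reads the HTML source.

    url: the submitted URL
    source_canonical: canonical target in the HTML source, or None
    rendered_canonical: canonical target in the rendered HTML, or None
    selected_canonical: the URL Google actually chose as canonical
    """
    # Under the theory, the rendered canonical is deliberately ignored.
    _ = rendered_canonical
    if selected_canonical == url:
        return "Submitted and indexed"
    if source_canonical is not None and source_canonical != url:
        return "Alternate page with proper canonical tag"
    return "Submitted URL not selected as canonical"


# Placeholder URLs, not our real pages.
page = "https://example.com/blog/author/"
target = "https://example.com/blog/"

# Canonical tag injected with JS only: absent from the HTML source.
js_injected = report_status(page, None, target, target)

# Same canonical tag, but present in the HTML source document.
in_source = report_status(page, target, target, target)

print(js_injected)
print(in_source)
```

In this model, a page with a JS-injected canonical tag that Google respects ends up with the same status as a page that Google canonicalised without any canonical tag at all, which matches what we see in our reports.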
TL;DR
- Google has recently announced that canonical tags are not processed if they’re only found in the rendered HTML and not in the HTML source document.
- We tested this by injecting canonical tags into four URLs using GTM, and our test results suggest that Google does use these canonical tags.
- It took more than three weeks for some of the tested URLs to be canonicalised to the targets of the JS-injected canonical tags.
- The “Fetch and render > Request indexing” feature in Google Search Console doesn’t seem to help speed up the rendering of pages.
- The new index coverage report in Google Search Console ignores JS-injected canonical tags and is thus in line with Google’s official statements.
- Google’s announcement, which seems to be wrong, might be the result of an internal misunderstanding or a bug.
I’d be happy to hear your opinion on all of this! What did I miss? Where am I mistaken? Unfortunately, our blog comments are currently disabled until we manage to implement a GDPR-compliant solution (sorry about that!). Let’s talk on Twitter or wherever you prefer!