Robots.txt pre-check

Use case

Ensure that changes to the robots.txt file lead to the desired outcome and that no traffic or revenue is lost.

Large websites, or websites with a lot of dynamic content, often have an extremely high number of URLs. As a result, Google no longer crawls all URLs and cannot index important sub-pages. To make optimal use of the “crawl budget”, it often makes sense in such cases to exclude pages via robots.txt.
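For example, a robots.txt that keeps crawlers out of internal search results and filtered listing variants might look like the sketch below. The paths are purely illustrative, and note that the `*` wildcard is an extension supported by Google and Bing rather than part of the original robots.txt standard:

```
User-agent: *
# Block internal search result pages (illustrative path)
Disallow: /search/
# Block sorted/filtered listing variants (wildcard is a Google/Bing extension)
Disallow: /*?sort=
```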

At the same time, it must be ensured that no relevant pages are excluded.

This is how we can help

Before the robots.txt is actually changed, we crawl the website using the new version stored with us and compare the result with the last live crawl of the website. Among other things, this lets us check the following structural factors:

  • Does content with clicks or impressions remain indexable – and if not, which pages are affected?
  • Have important distribution pages been omitted, and is content with clicks or impressions no longer linked internally?
  • Are there externally linked pages that are no longer crawlable for Googlebot?
  • Have relevant pages slipped down in the internal link structure?
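The core of such a comparison can be sketched in a few lines: given the old and the candidate robots.txt, determine which URLs from a previous crawl would newly become blocked. This is only an illustrative sketch using Python's standard `urllib.robotparser` (which, unlike Googlebot, does not understand wildcard rules such as `/*?sort=`); the URLs and rules are made up:

```python
from urllib.robotparser import RobotFileParser

def blocked_urls(robots_txt: str, urls: list[str], agent: str = "Googlebot") -> set[str]:
    """Return the subset of urls that the given robots.txt blocks for agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {url for url in urls if not parser.can_fetch(agent, url)}

# Hypothetical URL sample, e.g. exported from a previous crawl or from log files.
urls = [
    "https://www.example.com/products/shoes",
    "https://www.example.com/search/red-shoes",
    "https://www.example.com/checkout/cart",
]

old_rules = "User-agent: *\nDisallow: /checkout/\n"
new_rules = "User-agent: *\nDisallow: /checkout/\nDisallow: /search/\n"

# URLs that the new robots.txt would block but the old one did not:
newly_blocked = blocked_urls(new_rules, urls) - blocked_urls(old_rules, urls)
for url in sorted(newly_blocked):
    print(url)
```

Joining such a diff with click and impression data (e.g. from the Search Console) is what reveals whether a newly blocked URL actually matters.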