
Mystery with Robots file

A number of websites that want to block search bot access to pages on their domain have been employing robots.txt to do so. While this is certainly a fine practice, there are a few misunderstandings about what blocking Google/Yahoo/Bing/other search bots with robots.txt actually does. Here’s a quick breakdown:

  • Block with Robots.txt – do not attempt to visit the URL, but feel free to keep it in the index & display in the SERPs.

Here’s a quick example of a page that is blocked via robots.txt but appears in Google’s index:

(note that this robots.txt is the same across about.com’s other subdomains, too)

You can see that about.com is clearly disallowing the /library/nosearch/ folder. Yet, here’s what happens when we search Google for URLs in that folder:

Notice that Google has 2,760 pages from that “disallowed” directory. They haven’t crawled these URLs, so they appear as mere address strings (no title, description, etc – since Google can’t see the pages’ content).
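To see what a disallow rule means from the crawler's side, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The rules and the example.com URLs are assumptions modeled on the /library/nosearch/ folder discussed above; the real about.com file is not reproduced here. Note that the check only answers "may I fetch this URL?" and says nothing about whether the URL can be listed in an index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules modeled on the folder discussed above (not the real file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /library/nosearch/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given user agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder domain used only for illustration.
blocked = is_crawl_allowed(
    ROBOTS_TXT, "Googlebot", "http://example.com/library/nosearch/page.htm"
)
allowed = is_crawl_allowed(
    ROBOTS_TXT, "Googlebot", "http://example.com/library/index.htm"
)
print(blocked, allowed)
```

A well-behaved bot would skip the first URL and fetch the second. Because the bot never reads the blocked page, it also never sees any instructions placed inside that page.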

  • Block with Meta NoIndex – feel free to visit, but don’t put the URL in the index or display in the results
  • Block by Nofollowing Links – not a smart move, as other followed links can still put them in the index (it’s fine if you don’t want to “waste juice” on the page, but don’t think it will keep bots away or prevent it from appearing in the SERPs)
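As a rough illustration of the meta noindex option, the Python sketch below reads the robots meta tag from a page's HTML, which is roughly what a crawler has to do before it can honor the directive. The sample markup and the helper names (`RobotsMetaParser`, `robots_directives`) are assumptions made for this example, not part of any search engine's tooling.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list such as "noindex, follow".
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives declared in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page that asks to be kept out of the index while its links are followed.
SAMPLE_PAGE = (
    '<html><head><meta name="robots" content="noindex, follow"></head>'
    "<body><a href=\"/other-page\">Other page</a></body></html>"
)

found = robots_directives(SAMPLE_PAGE)
print(sorted(found))
```

The directive lives inside the page itself, so it only works when the bot is allowed to fetch the page in the first place.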

Now think one step further – if you’ve got any number of pages you’re blocking from the search engines’ eyes, those URLs can still accumulate links, accumulate juice and other query-independent ranking factors, but they have no way to “pass it along” since their own links out will never be seen. I’ll illustrate the situation:

There are two real takeaways here:

  1. Conserve link juice by using nofollow when linking to a URL that is disallowed in robots.txt.
  2. If you know that disallowed pages have acquired link juice (particularly from external links), consider using meta noindex, follow instead so they can pass their link juice on to places on your site that need it.
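One way to act on the second takeaway is to cross-check your disallow rules against link data. The Python sketch below is only an illustration: the rules, the URLs, and the link counts are all made-up inputs, and `noindex_follow_candidates` is a hypothetical helper name. In practice the counts would come from whatever link-analysis tool you use.

```python
from urllib.robotparser import RobotFileParser


def noindex_follow_candidates(robots_txt, external_links, user_agent="*"):
    """Return disallowed URLs that have at least one external link pointing at them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return sorted(
        url
        for url, count in external_links.items()
        if count > 0 and not parser.can_fetch(user_agent, url)
    )


# Hypothetical rules and link counts, for illustration only.
ROBOTS_RULES = """\
User-agent: *
Disallow: /private/
"""
LINK_COUNTS = {
    "http://example.com/private/report.htm": 12,  # blocked, has links
    "http://example.com/private/empty.htm": 0,    # blocked, no links
    "http://example.com/public/page.htm": 30,     # not blocked
}

candidates = noindex_follow_candidates(ROBOTS_RULES, LINK_COUNTS)
print(candidates)
```

Only the blocked page with external links is flagged. Switching such a page to meta noindex, follow also means lifting its robots.txt disallow, since a bot that cannot fetch the page will never see the meta tag.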

 
