

Google launches new bot for crawling News: Googlebot-News

Google has announced a new user agent for robots.txt called Googlebot-News that gives publishers even more control over their content.

Previously, publishers who wanted to be left out of Google News but stay in Google’s web search index could contact Google to arrange it. Now they can manage their content in Google News in a more automated way: site owners simply add Googlebot-News specific directives to their robots.txt file. As with the Googlebot and Googlebot-Image user agents, the new Googlebot-News user agent can be used to specify which pages of a website may be crawled and ultimately appear in Google News.

Here are a few examples for publishers:
Include pages in both Google web search and News:
User-agent: Googlebot
Disallow:
This is the easiest case. In fact, a robots.txt file is not even required for this case.
Include pages in Google web search, but not in News:
User-agent: Googlebot
Disallow:

User-agent: Googlebot-News
Disallow: /
This robots.txt file says that no files are disallowed for Google’s general web crawler, called Googlebot, but the user agent “Googlebot-News” is blocked from all files on the website.
Include pages in Google News, but not Google web search:
User-agent: Googlebot
Disallow: /

User-agent: Googlebot-News
Disallow:
When parsing a robots.txt file, Google obeys the most specific directive. The first two lines tell us that Googlebot (the user agent for Google’s web index) is blocked from crawling any pages from the site. The next directive, which applies to the more specific user agent for Google News, overrides the blocking of Googlebot and gives permission for Google News to crawl pages from the website.
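
The "most specific directive" rule described above can be sketched in a few lines of Python. This is only an illustration, not Google's actual parser: the function name select_group and the dictionary layout are hypothetical, and the sketch assumes that a group applies to a crawler when the group's user-agent token is a prefix of the crawler's name, with the longest matching token winning.

```python
def select_group(groups, crawler):
    """Pick the Disallow rules for the most specific matching user agent.

    `groups` maps a user-agent token to its list of disallowed path
    prefixes (an empty list means nothing is blocked).
    Assumption: a token matches when it is a prefix of the crawler name,
    and the longest matching token wins; "*" is only a fallback.
    """
    name = crawler.lower()
    matches = [t for t in groups if t != "*" and name.startswith(t.lower())]
    if matches:
        return groups[max(matches, key=len)]
    return groups.get("*", [])


# The "News but not web search" file from the example above.
groups = {"Googlebot": ["/"], "Googlebot-News": []}

print(select_group(groups, "Googlebot"))       # rules for web search
print(select_group(groups, "Googlebot-News"))  # rules for Google News
```

With this file, the Googlebot-News group is chosen for the News crawler even though the broader Googlebot group also matches its name, which is why the site stays crawlable for Google News.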

Block different sets of pages from Google web search and Google News:
User-agent: Googlebot
Disallow: /latest_news

User-agent: Googlebot-News
Disallow: /archives
The pages blocked from Google web search and Google News can be controlled independently. This robots.txt file blocks recent news articles (URLs in the /latest_news folder) from Google web search, but allows them to appear in Google News. On the other hand, it blocks premium content (URLs in the /archives folder) from Google News, but allows it to appear in Google web search.
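
To check a file like this before publishing it, a publisher could run a small script along the following lines. It is a rough sketch under stated assumptions, not a reimplementation of Google's crawler: the helper names parse_robots and is_allowed and the sample URLs are made up for illustration, only User-agent and Disallow lines are handled, and paths are compared as plain prefixes (Allow lines and wildcards are ignored).

```python
def parse_robots(text):
    """Parse robots.txt text into {user-agent token: [disallowed prefixes]}."""
    groups, current, in_agent_lines = {}, [], False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        if field.lower() == "user-agent":
            if not in_agent_lines:  # a new group starts here
                current = []
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
            in_agent_lines = True
        else:
            in_agent_lines = False
            if field.lower() == "disallow" and value:
                for agent in current:
                    groups[agent].append(value)
    return groups


def is_allowed(groups, crawler, path):
    """Apply the most specific matching group (assumed: longest prefix token)."""
    name = crawler.lower()
    matches = [t for t in groups if t != "*" and name.startswith(t)]
    rules = groups[max(matches, key=len)] if matches else groups.get("*", [])
    return not any(path.startswith(prefix) for prefix in rules)


ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /latest_news

User-agent: Googlebot-News
Disallow: /archives
"""

groups = parse_robots(ROBOTS_TXT)
for crawler in ("Googlebot", "Googlebot-News"):
    for path in ("/latest_news/story.html", "/archives/2009.html"):
        verdict = "allowed" if is_allowed(groups, crawler, path) else "blocked"
        print(f"{crawler:15} {path:25} {verdict}")
```

The output should show /latest_news blocked only for Googlebot and /archives blocked only for Googlebot-News, matching the description above.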

