Robots File – Usage and Best Practices
Sep 5, 2008 SEO, Web Guidelines, Web Tips
A robots.txt file restricts or allows access for search engine robots (known as “bots”) that crawl the web. These bots are automated, and before they access the pages of a site, they check whether a robots.txt file exists that prevents or allows access to certain pages. If you do not put a robots.txt file in your website's root directory (typically /public_html or www), your server log files will record a “not found” error every time a crawler visits and requests it. It is therefore worth having a robots.txt file and using it correctly.
How to create a robots.txt file
This example allows all robots to visit all files: the wildcard “*” matches every robot, and the empty Disallow line blocks nothing:
User-agent: *
Disallow:
This example keeps all robots out. Robots that honor the file will not crawl any page of your site, so search engines will not be able to index your content:
User-agent: *
Disallow: /
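The two files above can be checked locally before you upload them. The following is a minimal sketch using Python's standard urllib.robotparser module; the bot name ExampleBot, the helper can_fetch and the example.com URL are assumptions made for illustration only.

```python
from urllib.robotparser import RobotFileParser

def can_fetch(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# The two files from the examples above, written as strings.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# Hypothetical page URL used only for the check.
PAGE = "http://www.example.com/page.html"

print(can_fetch(ALLOW_ALL, PAGE))  # the empty Disallow permits the page
print(can_fetch(BLOCK_ALL, PAGE))  # "Disallow: /" blocks the page
```

The same check only tells you how a well-behaved parser reads the file; a robot that ignores robots.txt is not stopped by it.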
The next example tells all crawlers not to enter four directories of a website:
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /tmp/
Disallow: /private/
This example tells all crawlers not to fetch one specific file:
User-agent: *
Disallow: /directory/file.html
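Directory and file rules like these match by path prefix. As a rough sketch, the snippet below combines the two examples into one file and tests a few paths with Python's standard urllib.robotparser; the bot name ExampleBot, the helper build_parser and the example.com host are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The directory rules and the single-file rule from the examples, combined.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /tmp/
Disallow: /private/
Disallow: /directory/file.html
"""

def build_parser(robots_txt: str) -> RobotFileParser:
    """Return a parser loaded from robots.txt text instead of a URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

parser = build_parser(RULES)
BASE = "http://www.example.com"  # hypothetical host

for path in ("/index.html", "/images/logo.png",
             "/directory/file.html", "/directory/other.html"):
    # Prints each path with True (allowed) or False (blocked).
    print(path, parser.can_fetch("ExampleBot", BASE + path))
```

Only /images/logo.png and /directory/file.html are blocked here: the first falls under a disallowed directory, the second is the one disallowed file.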
Crawl-delay directive
Several major crawlers support a Crawl-delay parameter, set to the number of seconds to wait between successive requests to the same server:
User-agent: *
Crawl-delay: 10
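If you write your own crawler, you can read this value yourself. Here is a minimal sketch, assuming Python 3.6 or later (where RobotFileParser.crawl_delay is available); the helper delay_for, the bot name ExampleBot and the one-second fallback are assumptions, not part of any standard.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = "User-agent: *\nCrawl-delay: 10\n"

def delay_for(robots_txt: str, agent: str, default: float = 1.0) -> float:
    """Return the Crawl-delay for `agent` in seconds, or `default` if none is set."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(agent)  # None when no Crawl-delay applies
    return float(delay) if delay is not None else default

# A crawler would sleep this many seconds between requests to the same server.
print(delay_for(ROBOTS_TXT, "ExampleBot"))
```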
Allow directive
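Some crawlers support an Allow directive, which lets you allow a single file while disallowing the folder that contains it. Some parsers apply the first rule that matches, so put the Allow line before the Disallow line, for example: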
User-agent: *
Allow: /folder1/myfile.html
Disallow: /folder1/
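How Allow and Disallow interact depends on the parser. The sketch below uses Python's standard urllib.robotparser, which applies rules in the order they appear, to show the effect of line order; the bot name ExampleBot, the helper allowed and the example.com host are hypothetical, and other crawlers may resolve the conflict differently (Google, for instance, documents a most-specific-match rule).

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_txt: str, path: str, agent: str = "ExampleBot") -> bool:
    """Check one path against robots.txt text using Python's first-match parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, "http://www.example.com" + path)

# Allow line first: the file is reachable, the rest of the folder is not.
ALLOW_FIRST = """\
User-agent: *
Allow: /folder1/myfile.html
Disallow: /folder1/
"""

# Same rules in the opposite order: this parser stops at the Disallow line.
DISALLOW_FIRST = """\
User-agent: *
Disallow: /folder1/
Allow: /folder1/myfile.html
"""

print(allowed(ALLOW_FIRST, "/folder1/myfile.html"))
print(allowed(ALLOW_FIRST, "/folder1/other.html"))
print(allowed(DISALLOW_FIRST, "/folder1/myfile.html"))
```

With this parser the third check reports the file as blocked, which is why putting Allow first is the safer ordering.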
Extended Standard
An Extended Standard for Robot Exclusion has been proposed, which adds several new directives, such as Visit-time and Request-rate. For example:
User-agent: *
Disallow: /downloads/
Request-rate: 1/5 # maximum rate is one page every 5 seconds
Visit-time: 0600-0845 # only visit between 06:00 and 08:45 UTC (GMT)
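Support for these extended directives varies between tools. As an illustration, Python's standard urllib.robotparser (3.6 or later) reads Request-rate but does not interpret Visit-time, so a crawler built on it would need its own handling for visiting hours. The helper seconds_between_requests and the bot name ExampleBot below are assumptions made for this sketch.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /downloads/
Request-rate: 1/5 # maximum rate is one page every 5 seconds
Visit-time: 0600-0845 # only visit between 06:00 and 08:45 UTC (GMT)
"""

def seconds_between_requests(robots_txt: str, agent: str) -> Optional[float]:
    """Turn a Request-rate of requests/seconds into a per-request interval."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    rate = parser.request_rate(agent)  # named tuple (requests, seconds) or None
    if rate is None or rate.requests == 0:
        return None
    return rate.seconds / rate.requests

# Visit-time is not interpreted by this parser, so only the rate is reported.
print(seconds_between_requests(ROBOTS_TXT, "ExampleBot"))
```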
Also read How to Use Correct robots.txt file
More Information
http://www.robotstxt.org
Google Guidelines about robots.txt