The Correct Way to Use a Robots.txt File

by Hiroshi on November 7, 2008

in SEO, Web Guidelines

Suppose you want to disallow two directories on your website, some-directory and another-directory, located at:

http://www.yourwebsite.com/some-directory
http://www.yourwebsite.com/another-directory

while allowing every other directory and page of the site to be crawled and indexed. Here is how to do it the right way.

Wrong Way

This is the wrong way to write a robots.txt file for the requirement above:

User-agent: *
Disallow: /some-directory/
Disallow: /another-directory/

In the example above, you disallowed the two directories but did not explicitly allow anything else.

Correct Way
User-agent: *
Disallow: /some-directory/
Disallow: /another-directory/
Allow: /

In this example you disallow only the two desired directories and explicitly allow all other directories and pages of the website.
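
If you want to double-check how a crawler that honors these rules will behave, here is a minimal sketch using Python's standard urllib.robotparser module; the domain and paths are the placeholders from the examples above:

from urllib.robotparser import RobotFileParser

# The "Correct Way" rules from above, with the placeholder directory names.
rules = """\
User-agent: *
Disallow: /some-directory/
Disallow: /another-directory/
Allow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# The two disallowed directories should be blocked; everything else allowed.
base = "http://www.yourwebsite.com"
for path in ("/some-directory/page.html", "/another-directory/", "/blog/post.html"):
    print(path, parser.can_fetch("*", base + path))

# Prints:
# /some-directory/page.html False
# /another-directory/ False
# /blog/post.html True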

This confusion is very common: webmasters typically copy a robots.txt snippet from somewhere, add the * user-agent line, list what to disallow, and upload the file without ever adding an Allow rule.

To allow the whole website for all search engines and all of their crawlers (a search engine may run more than one crawler or bot for different tasks), use this code:

User-agent: *
Allow: /

Google Webmaster Tools is very helpful for getting this right: among its options, it can generate a robots.txt file for you. I would recommend Google Webmaster Tools to every webmaster.
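
You can also test a live file from a script. Here is a minimal sketch, again using Python's standard urllib.robotparser, that fetches and checks a site's robots.txt; the domain and crawler name are placeholders:

from urllib.robotparser import RobotFileParser

# Fetch and parse the live robots.txt (placeholder domain).
parser = RobotFileParser()
parser.set_url("http://www.yourwebsite.com/robots.txt")
parser.read()

# Ask whether a given crawler may fetch a given URL.
print(parser.can_fetch("Googlebot", "http://www.yourwebsite.com/some-directory/"))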
