The Proper Way To Use The robots.txt File
by Jimmy Whisenhunt | February 06, 2005
When optimizing a web site, most webmasters don't consider using the robots.txt file. This is a very important file for your site. Placed in the root directory of the site, it lets spiders and crawlers know what they can and cannot index. This is helpful for keeping them out of folders that you do not want indexed, such as the admin or stats folder.
Here is a list of directives that you can include in a robots.txt file and their meaning:
• User-agent: In this field you name the specific robot that the access policy applies to, or use * to cover all robots. The examples further down show both forms.
• Disallow: In this field you specify the files and folders to exclude from the crawl.
• The # character starts a comment.
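To see how these three elements fit together, the following minimal sketch feeds a small rule set to Python's standard urllib.robotparser module and asks whether two paths may be fetched. The rule set, the robot name "examplebot", and the two paths are hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: one comment, one User-agent line, one Disallow line.
RULES = """\
# Keep crawlers out of the statistics pages
User-agent: *
Disallow: /stats/
"""

parser = RobotFileParser()
# parse() takes the file content as a list of lines; the comment line is ignored.
parser.parse(RULES.splitlines())

# "examplebot" is a made-up robot name; it falls under the * policy.
stats_allowed = parser.can_fetch("examplebot", "/stats/index.html")
home_allowed = parser.can_fetch("examplebot", "/index.html")

print("stats page allowed:", stats_allowed)
print("home page allowed:", home_allowed)
```

With these rules the stats page should be reported as blocked and the home page as allowed.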
Here are some examples of a robots.txt file:
User-agent: *
Disallow:
The above would let all spiders index all content.
Here is another:
User-agent: *
Disallow: /cgi-bin/
The above would block all spiders from indexing the cgi-bin directory.
Here is a third example:
User-agent: googlebot
Disallow:

User-agent: *
Disallow: /admin.php
Disallow: /cgi-bin/
Disallow: /admin/
Disallow: /stats/
In the above example, googlebot can index everything, while all other spiders cannot index admin.php or the cgi-bin, admin, and stats directories. Notice that you can block single files like admin.php.
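One way to check a rule set like this before uploading it is to run it through Python's standard urllib.robotparser module. The sketch below is an illustration, not part of the original article: the helper is_allowed, the robot name "otherbot", and the test paths are assumptions, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The googlebot example from the article, one directive per line.
# The blank line separates the two records.
RULES = """\
User-agent: googlebot
Disallow:

User-agent: *
Disallow: /admin.php
Disallow: /cgi-bin/
Disallow: /admin/
Disallow: /stats/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_allowed(agent: str, path: str) -> bool:
    """Hypothetical helper: True if `agent` may fetch `path` under RULES."""
    return parser.can_fetch(agent, path)


# "otherbot" is a made-up name standing in for any robot other than googlebot.
for agent in ("googlebot", "otherbot"):
    for path in ("/index.html", "/admin.php", "/admin/users", "/stats/"):
        verdict = "allowed" if is_allowed(agent, path) else "blocked"
        print(f"{agent:10} {path:15} {verdict}")
```

The empty Disallow line under googlebot is what grants it full access; every other robot falls through to the * record and is kept out of the four listed paths.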
Article Source: http://www.articleset.com

You are welcome to publish or reprint this article free of charge, provided:
- you include the entire article, unchanged, including the "About The Author" box
- all hyperlinks remain active, including the bottom ArticleSet.com link (does not apply to print publications)
- you agree not to hold the authors nor ArticleSet.com liable for any loss profits, expenses, or any other damages resulting from the use or misuse of articles published on this website