01-24-2009, 12:33 PM
UnderHost Management
 
The Proper Way To Use The robots.txt File

When optimizing a web site, many webmasters overlook the robots.txt file.

This is a very important file for your site. Note that it must be named robots.txt (not robot.txt) and placed in the root folder of the site.

It lets spiders and crawlers know what they can and cannot crawl.

This is helpful for keeping them out of folders that you do not want crawled, like the admin or stats folder.

Here is a list of the directives that you can include in a robots.txt file and their meanings:
  1. User-agent: In this field you name the specific robot that the access policy applies to, or use “*” for all robots. This is explained further in the examples.
  2. Disallow: In this field you specify the files and folders to exclude from the crawl.
  3. #: Marks a comment. Robots ignore everything after it on that line.
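If you want to try the three elements above before uploading a file, Python's standard urllib.robotparser module can read the rules from a string. This is only a minimal sketch: the rules, the robot name ExampleBot, the paths, and the helper allowed() are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, written only to show the three elements listed above.
RULES = """\
# Keep every robot out of the stats folder
User-agent: *
Disallow: /stats/   # a comment can also follow a directive
"""


def allowed(rules: str, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


print(allowed(RULES, "ExampleBot", "/stats/today.html"))  # False: under /stats/
print(allowed(RULES, "ExampleBot", "/index.html"))        # True: no rule matches
```

The comment lines have no effect on the result; only the User-agent and Disallow lines do.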
Here are some examples of a robots.txt file:
User-agent: *
Disallow:
The above would let all spiders crawl all content.
Here is another:
User-agent: *
Disallow: /cgi-bin/
The above would block all spiders from crawling the cgi-bin directory.
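A Disallow value is matched against the start of the requested path, so the rule above covers everything inside cgi-bin. A rough check with Python's urllib.robotparser, assuming a made-up robot name (ExampleBot) and made-up file names:

```python
from urllib.robotparser import RobotFileParser

# The same two lines as the example above.
cgi_parser = RobotFileParser()
cgi_parser.parse(["User-agent: *", "Disallow: /cgi-bin/"])

# Hypothetical paths: one inside the blocked directory, one outside it.
inside = cgi_parser.can_fetch("ExampleBot", "/cgi-bin/form.pl")
outside = cgi_parser.can_fetch("ExampleBot", "/about.html")

print(inside, outside)  # False True
```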
Here is a third example, with separate rules for googlebot:
User-agent: googlebot
Disallow:

User-agent: *
Disallow: /admin.php
Disallow: /cgi-bin/
Disallow: /admin/
Disallow: /stats/
In the above example, googlebot can crawl everything, while all other spiders cannot crawl admin.php or the cgi-bin, admin, and stats directories. Notice that you can block single files like admin.php.
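To confirm that a file with more than one User-agent group behaves the way you expect, you can run the rules through Python's urllib.robotparser. The sketch below uses the rules from the last example; ExampleBot stands in for "any other spider", and build_report() and the tested paths are hypothetical names chosen for this illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above, copied as-is.
THIRD_EXAMPLE = """\
User-agent: googlebot
Disallow:

User-agent: *
Disallow: /admin.php
Disallow: /cgi-bin/
Disallow: /admin/
Disallow: /stats/
"""


def build_report(rules, agents, paths):
    """Map each (agent, path) pair to True (allowed) or False (blocked)."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return {(a, p): parser.can_fetch(a, p) for a in agents for p in paths}


report = build_report(
    THIRD_EXAMPLE,
    ["Googlebot", "ExampleBot"],
    ["/admin.php", "/stats/", "/index.html"],
)
for (agent, path), ok in report.items():
    print(f"{agent:10} {path:12} {'allowed' if ok else 'blocked'}")
```

Googlebot comes out as allowed on every path, while ExampleBot is blocked on /admin.php and /stats/ and allowed on /index.html.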
Copyright © 2010 UnderHost Networks Ltd





