Hello
I have been trying to get Google to crawl my site for a week, but they keep replying that there is an issue with the site.
I installed the extension below and used it to generate a robots.txt file under my public_html folder:
http://www.magentocommerce.com/magento-connect/robots-txt-6783.html
However, I see that with this extension you can decide which parts of your site you want search engines to crawl and which to disallow.
For obvious reasons, I don't want the admin or other sensitive areas being crawled.
At the moment all crawlers are enabled:
User-agent: *
Crawl-delay: 5
Disallow:
Disallow: /404/
Disallow: /app/
Disallow: /manage/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /includes/
Disallow: /js/
Disallow: /lib/
Disallow: /magento/
Disallow: /media/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /skin/
Disallow: /stats/
Disallow: /var/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalog/product/gallery/
Disallow: /catalogsearch/
Disallow: /checkout/
Disallow: /control/
Disallow: /contacts/
Disallow: /customer/
Disallow: /customize/
Disallow: /newsletter/
Disallow: /poll/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /tag/
Disallow: /wishlist/
Disallow: /index.php
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /STATUS.txt
Disallow: /get.php
Disallow: /.js$
Disallow: /?___from_store=
Disallow: *___from_store=
Disallow: /?mode=
Disallow: /?limit=
Disallow: /?dir=
Disallow: /.css$
Disallow: /.php$
Disallow: /?p=*&
Disallow: /?SID=
Disallow: /.php$
Disallow: /rss*
My questions:
1. Are there any other folders or directories I need to disallow?
2. How do I get it to disallow my admin section, please? It has been renamed from
www.mysite.com/admin
to
www.mysite.com/xyz084
3. What permissions should the robots.txt file be given?
I would really appreciate it if anyone could clarify and help me with this, as it's the last thing before I have a fully operational site. It has taken ages to create and set up.
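In case it helps anyone answering, here is a minimal sketch of how rules like the ones above could be checked locally with Python's standard urllib.robotparser module. It uses a shortened copy of my rules. The "Disallow: /xyz084/" line is only my assumption of how the renamed admin path might be written, and the helper names build_parser and is_allowed are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the rules from the post. The /xyz084/ line is a
# hypothetical addition for the renamed admin path, not a confirmed fix.
RULES = """\
User-agent: *
Crawl-delay: 5
Disallow: /app/
Disallow: /checkout/
Disallow: /customer/
Disallow: /xyz084/
"""


def build_parser(rules: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching anything over HTTP."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, url: str, agent: str = "*") -> bool:
    """Return True if the given user agent may fetch the URL under these rules."""
    return parser.can_fetch(agent, url)


parser = build_parser(RULES)
for path in ("/", "/checkout/cart/", "/xyz084/", "/some-product.html"):
    url = "http://www.mysite.com" + path
    print(path, "allowed" if is_allowed(parser, url) else "blocked")
```

As far as I understand, Python's parser applies the first matching rule and does not interpret wildcard patterns such as * or $, so its answers may differ from how Googlebot reads the same file. It is only a rough check of the plain path prefixes.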
Kind Regards