What should I put in my eCommerce Robots.txt?

Robots.txt is still important and still widely used. On all of my Magento eCommerce stores, I use a specific set of robots.txt rules that keep crawlers away from my JavaScript files and my SID (session ID) URLs, and that keep a pile of duplicate content from being indexed in Google. All of this helps with SEO and can improve your overall rankings.

Magento Robots.txt

Here is the code I use. Just copy it and place it in the root directory of your site. Lines that start with # are comments and are ignored by crawlers, so the only thing you need to change is the location of your sitemap. Mine is schoolofuniforms.com: CHANGE THAT!


[/xml]
# $Id: robots.txt,v magento-specific 2010/28/01 18:24:19 goba Exp $
#
# robots.txt
#
# This file is to prevent the crawling and indexing of certain parts
# of your site by web crawlers and spiders run by sites like Yahoo!
# and Google. By telling these "robots" where not to go on your site,
# you save bandwidth and server resources.
#
# This file will be ignored unless it is at the root of your host:
# Used:    http://example.com/robots.txt
# Ignored: http://example.com/site/robots.txt
#
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/wc/robots.html
#
# For syntax checking, see:
# http://www.sxw.org.uk/computing/robots/check.html

# Website Sitemap
Sitemap: http://schoolofuniforms.com/sitemap.xml

# Crawlers Setup
User-agent: *
Crawl-delay: 10

# Allowable Index
Allow: /*?p=
Allow: /blog/*
Allow: /catalog/seo_sitemap/category/
Allow: /catalogsearch/result/
Allow: /blog*

# Directories
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /includes/
Disallow: /js/
Disallow: /lib/
Disallow: /magento/
Disallow: /media/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /skin/
Disallow: /stats/
Disallow: /var/

# Paths (clean URLs)
Disallow: /index.php/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalogsearch/
Disallow: /checkout/
Disallow: /control/
Disallow: /contacts/
Disallow: /customer/
Disallow: /customize/
Disallow: /newsletter/
Disallow: /poll/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /tag/
Disallow: /wishlist/

# Files
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /STATUS.txt

# Paths (no clean URLs)
Disallow: /*.js$
Disallow: /*.css$
Disallow: /*.php$
Disallow: /*?p=*&
Disallow: /*?SID=
[xml]
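
Several of the rules above overlap (for example, `Allow: /catalogsearch/result/` sits alongside `Disallow: /catalogsearch/`), so it can help to check how a crawler would resolve them before you upload the file. Below is a minimal Python sketch, not an official tool. It assumes the "longest matching rule wins, Allow wins a tie" behavior described in RFC 9309 and in Google's documentation, and other crawlers may resolve conflicts differently. The names `rule_to_regex`, `is_allowed` and `RULES` are made up for illustration, and `RULES` holds only a handful of lines copied from the file above.

```python
import re


def rule_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters; a trailing '$' pins the end of the URL.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of ("allow" | "disallow", pattern) pairs. The longest
    matching pattern wins and Allow wins a tie (assumed RFC 9309 behavior).
    A path that matches no rule is allowed.
    """
    best_len, best_allow = -1, True
    for kind, pattern in rules:
        if rule_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow


# A subset of the rules from the file above (illustrative, not the full list).
RULES = [
    ("allow", "/*?p="),
    ("allow", "/catalogsearch/result/"),
    ("disallow", "/catalogsearch/"),
    ("disallow", "/checkout/"),
    ("disallow", "/*.js$"),
    ("disallow", "/*?p=*&"),
    ("disallow", "/*?SID="),
]

if __name__ == "__main__":
    # The example paths are hypothetical store URLs.
    for path in [
        "/shirts.html?p=2",
        "/shirts.html?p=2&dir=asc",
        "/catalogsearch/result/?q=shirt",
        "/catalogsearch/advanced/",
        "/checkout/cart/",
    ]:
        print(path, "->", "allowed" if is_allowed(path, RULES) else "blocked")
```

Under these assumptions, a plain paginated URL such as `/shirts.html?p=2` stays crawlable, while `/shirts.html?p=2&dir=asc` is blocked because the longer `Disallow: /*?p=*&` pattern outranks `Allow: /*?p=`.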
  • TY
    It's not always a reliable service. For $2 an article you are going to get crappy, half-original content with no name attached to it. Plus, you might be given the wrong info when it comes to testing. People will give you the kind of info you want just to get the hit, not necessarily the truth.

    I think it's a useful service, but not to that extent.