Thanks for updating us, Saggy.
I actually ended up developing my own solution for this, instead of using an extension. If anyone is interested in how I did it, here’s the gist of it:
Note: I took this directly from my personal notes, so you may need to fiddle with it to get it to work with your own theme.
All I did was add a conditional statement around the layered navigation block of code in my layered navigation template file that queries the user agent. For me this file was located here:
Add to line 1:
<?php if (strstr(strtolower($_SERVER['HTTP_USER_AGENT']), "googlebot") || strstr(strtolower($_SERVER['HTTP_USER_AGENT']), "bingbot") || strstr(strtolower($_SERVER['HTTP_USER_AGENT']), "slurp") || strstr(strtolower($_SERVER['HTTP_USER_AGENT']), "msn")): ?>
<?php else: ?>
Add to last line:
<?php endif; ?>
The search engines covered by the above user agent queries are Google, Bing, Yahoo and MSN.
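The same check can also be written as a small helper, which avoids repeating the lookup four times and copes with a missing user-agent header. This is only a minimal sketch, not the code from my notes: the function name isSearchBot is hypothetical (it is not part of Magento), and where you define it in your theme or module is up to you.

```php
<?php
/**
 * Hypothetical helper (not a Magento function): returns true when the
 * user agent contains one of the crawler tokens used in the snippet above.
 */
function isSearchBot($userAgent)
{
    // No header, or something that is not a string: treat as a normal visitor.
    if (!is_string($userAgent) || $userAgent === '') {
        return false;
    }

    // Same four tokens as the original condition.
    $tokens = array('googlebot', 'bingbot', 'slurp', 'msn');

    foreach ($tokens as $token) {
        // stripos() is case-insensitive, so strtolower() is not needed.
        if (stripos($userAgent, $token) !== false) {
            return true;
        }
    }

    return false;
}
```

In the template you would then wrap the layered navigation markup in <?php if (!isSearchBot(isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '')): ?> and close it with <?php endif; ?>, so the block is only rendered for visitors that do not match any of the tokens.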
You can test whether it is working correctly by going to “Fetch as Google” under “Crawl” in Google Webmaster Tools.
Once Google has fetched one of your category pages, click on the “Success” link to view the code as Google sees it. You should now no longer see the layered navigation code, or anything else you have chosen to hide.
Before implementing this strategy, I had around 20 000 pages of “duplicate” content thanks to the layered navigation. I’m happy to report that within a period of about 6 months, my number of indexed pages is down to fewer than 800 (I know, small site), and Google is no longer reporting any content issues under “HTML Improvements”.
Also, I can confirm a significant improvement in our rankings and traffic over this period, which confirms for me the value of avoiding duplicate content, even though Google claim they can figure it out on their own… yeah, right…
Oh, I should also mention that I used the “URL Parameters” feature to further tell Google how to handle various parameters used by Magento… just in case… For all layered navigation and most of the sorting parameters, I set it to “No URLs”. If you are receiving significant traffic from Google organic search to your layered pages, you may want to use a different setting. For pagination (p) I set it to “Let Google Decide”, and for “limit” I set it to “Value = all”, as Google sometimes prefers the full page, not the first page.
I spent months on and off trying to find a solution for this, so I hope this will help save someone the time and hassle I had to invest.
All the best!