Duplicate Content Solutions & The Canonical Tag – SMX Advanced Coverage 2009

Duplicate Content Solutions & The Canonical Tag up next – here’s the session description:

Duplicate content has long been a worry of the SEO pro. Recently, the search engines introduced a new tool to help combat duplicate content issues: the canonical tag. This session looks at how the tag has been performing for some webmasters plus revisits other duplicate content tools and techniques.

This was a great session, particularly the Q&A, where Matt Cutts came on stage to help out Maile and Nathan. The search engine guys seem really against rel=”nofollow” on internal links!

Alex Bennert, In House SEO, Wall Street Journal, introduces her speakers. First up:

Jordan Glogau, Enterprise Search and Business Development gives two case studies: 1800Flowers.com and eyeglasses.com

Jordan gave a roundup of the canonical link element and started by introducing what it is and how it is dealt with by the “big three” search engines. Jordan states that the best use of the canonical tag is to revitalise internal discussions about site architecture. The existence of the tag can actually be a good lever to pull if you want to get your internal team thinking about your site and whether it’s worth pursuing development on a better site architecture.
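For anyone who hasn’t seen the tag itself, here is a minimal sketch of what gets emitted. The element is a `<link rel="canonical">` placed in the page’s `<head>`. The helper name `canonical_link` and the example.com URL are my own illustrative assumptions, not something shown in the session.

```python
from html import escape


def canonical_link(url: str) -> str:
    """Render a canonical link element for the <head> of a page.

    The URL is HTML-escaped so that characters such as & and " cannot
    break out of the href attribute.
    """
    return f'<link rel="canonical" href="{escape(url, quote=True)}" />'


# Hypothetical URL, for illustration only.
print(canonical_link("http://www.example.com/roses"))
```

Every duplicate version of a page would carry the same element, pointing at the one URL you want indexed.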

He took us through 1800flowers.com, an old site with level 1 load balancing and a great deal of duplicate content issues. He showed us how badly the duplicate subdomains had been indexed thanks to the load balancing. His other biggest problem was that no one knew the code for the site anymore! After implementing the canonical tag, they saw a 20% YOY growth in revenue from organic search. They also got rid of the old load balancing, meaning there was one single www subdomain.

In summary, Jordan recommended the canonical tag for old sites as an easy way to solve some pretty serious problems that would otherwise require a total web site rebuild.

His next case study, eyeglasses.com, had very similar problems. The CMS was highly search engine unfriendly and there was a raft of problems introduced by the sort functionality in the product pages. Jordan’s solution was to always default to the canonical on the product pages. There’s an example of how this could be applied in my SMX London presentation here. It’s too early to tell the results for eyeglasses.com but he seemed very confident that he’d see similar results.
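As a rough illustration of “defaulting to the canonical” when sort options create duplicate URLs, the sketch below strips presentation-only query parameters to work out which URL the canonical tag should point at. The parameter names in `NON_CANONICAL_PARAMS` and the example URL are assumptions of mine; the actual parameters used on eyeglasses.com were not given.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that only change presentation, not content.
NON_CANONICAL_PARAMS = {"sort", "order", "view"}


def canonical_url(url: str) -> str:
    """Return the URL with sort/display parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in NON_CANONICAL_PARAMS
    ]
    # Drop the fragment as well; it never reaches the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(canonical_url("http://www.example.com/frames?brand=acme&sort=price_asc"))
```

Every sorted variant of the page would then reference the same unsorted URL in its canonical tag.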

Adam Audette, Founder, AudetteMedia, Inc.

Adam opened with a question – how many people in the audience are using the canonical tag? Nearly half the audience are using it!

Adam stated that the canonical tag has the potential to break things. It’s easy to implement, appears to work, and can be used in a variety of ways, which makes it excellent for temporary fixes. He warned us about some of the “bad” points of the tag, particularly that it’s like a “poor man’s” 301.

Adam talked about his URL tool at Zappos very briefly – the message being that you should handle your site with 301s rather than always choosing a canonical tag.

He gave some product page examples where duplicate URLs are created by sort options and subdomains.

Possible link canonical usage:

Duplicate pages with lots of link parameters
Subdomains, eg: reviews.subdomain.com, www.subdomain.com
Multiple versions of the Google directory (looked at search results for “clothing”)

Adam warned us to be extremely careful when implementing the tag, particularly when a page with a canonical tag references a page that 302 or 301 redirects. Great deck, Adam!
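One way to catch the redirect problem Adam described is to audit each canonical target’s HTTP status before you ship. The sketch below is only an outline of that idea: it works on a pre-collected dictionary of responses rather than making live requests, and the function name, the data shape and the URLs are all hypothetical.

```python
def check_canonical_target(canonical_url, responses):
    """Flag canonical targets that do not resolve directly with a 200.

    `responses` maps a URL to a (status_code, location) tuple, as you
    might collect from a crawl. This structure is an assumption made
    for the sketch.
    """
    status, location = responses.get(canonical_url, (None, None))
    if status in (301, 302):
        return f"warning: canonical target redirects ({status}) to {location}"
    if status == 200:
        return "ok"
    return f"warning: canonical target returned {status}"


# Hypothetical crawl data.
crawl = {
    "http://www.example.com/shoes": (200, None),
    "http://www.example.com/old-shoes": (301, "http://www.example.com/shoes"),
}
print(check_canonical_target("http://www.example.com/old-shoes", crawl))
```

Anything that comes back with a warning is a canonical tag worth re-pointing at the final destination URL.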

Stephan Spencer, Netconcepts

Stephan always goes fast!

He starts by telling us that the canonical tag consolidates pagerank. It’s a great addition to the SE arsenal, but it should not be relied upon. Instead a 301 redirect should be favoured.

Use XML sitemaps to declare your canonical URLs and rel=nofollow the non-canonical ones. His general theme is to avoid leakage of pagerank on your site by thinking carefully about nofollows, robots.txt etc. Conditional 301ing was recommended as a big “avoid”.
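To make the sitemap suggestion concrete, here is a minimal sketch that writes only the canonical URLs into an XML sitemap using the sitemaps.org 0.9 namespace. The function name and URLs are illustrative assumptions; Stephan did not show code for this.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_urls):
    """Return a sitemap XML string listing one <url><loc> per canonical URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in canonical_urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs: list only the versions you want indexed.
print(build_sitemap([
    "http://www.example.com/",
    "http://www.example.com/frames",
]))
```

The point is simply that the sorted, tracked or otherwise duplicated variants never appear in the file.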

Northernsafety.com has many non-canonical pages indexed, regardless of the canonical tag on the site. Check out their site index here. Stephan showed us the drill down process of identifying the non-canonical URLs in the site index. We looked at possible reasons why the canonical tag was being ignored. It looked like the robots.txt file was a possible cause.
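If robots.txt stops a page from being crawled, the search engine never gets to read the canonical tag on it, so it is worth checking your duplicate URLs against your own rules. Here is a small sketch using Python’s standard `urllib.robotparser`. The robots.txt content and the URLs are made up for illustration and are not Northern Safety’s actual rules.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text disallows crawling url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical rules and URLs.
rules = "User-agent: *\nDisallow: /search/\n"
print(is_blocked(rules, "http://www.example.com/search/gloves"))
print(is_blocked(rules, "http://www.example.com/gloves"))
```

Any duplicate URL that comes back as blocked is one where the canonical tag cannot be read by a crawler that obeys those rules.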

He also took us through Wikipedia’s podcasting page and gave us a run through of why this result is being ignored.

Scenarios covered:

- Excessive pagination
- Next / previous pages
- Keyword rich pathways. Disallow / nofollow all links that use the terms “view all” or price range links and send the pagerank via the keyword rich route eg: category links, brand terms or product names.

Stephan covers the canonical tag in an excellent post here. Definitely check it out.

A really nice point that Stephan made was that you can avoid tracking parameters such as “&source=” by including your analytics tracking strings in your URL rewrites. He concluded by taking us through redirect rules for different canonicalization issues such as trailing slashes and covered how to do conditional redirects. Not recommended, especially as Maile is on this panel!
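As an illustration of a trailing slash rule, here is a sketch of the decision logic for an ordinary (non-conditional) 301. It assumes the slash-terminated form is the canonical one, which is my arbitrary choice for the example, and it leaves file-like paths alone. Stephan’s actual rewrite rules were not captured in these notes.

```python
from urllib.parse import urlsplit, urlunsplit


def trailing_slash_redirect(url):
    """Return (301, target) if url should redirect to its slash form, else None."""
    parts = urlsplit(url)
    path = parts.path
    last_segment = path.rsplit("/", 1)[-1]
    # Already canonical, or looks like a file (e.g. /feed.xml): no redirect.
    if path.endswith("/") or "." in last_segment:
        return None
    target = urlunsplit(
        (parts.scheme, parts.netloc, path + "/", parts.query, parts.fragment)
    )
    return 301, target


# Hypothetical URL.
print(trailing_slash_redirect("http://www.example.com/roses"))
```

Whichever form you pick, the same rule should apply to every visitor and every crawler alike.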

Manufacturer supplied copy can be a big duplicate content problem owing to the lack of uniqueness of the content. Stephan covers solutions for this problem and highlights the problems introduced by consumer review plugins such as Bazaarvoice. The unique content is hidden in javascript so you need to fix that problem to “uniquify” your content and make your thin affiliate site appear more unique.

Maile jumped in before Q&A to clarify the use of the tag, particularly noting that some people are attempting cross domain canonical tagging, which does not work. She also asked for feedback on how the tag is working, as they take the tag very seriously. Finally, she mentioned that using nofollow to reduce duplicate content is not going to work. Stephan disagreed, saying that if you nofollow half the links on a page then the others would receive more pagerank. Nathan added that ranking is more complex than simply adding nofollows and they don’t recommend you spend your time nofollowing links on pages or “pagerank sculpting”. At this point, Matt Cutts jumps on stage!

Matt says that he agrees with Maile’s comments. A lot of people did experiments with nofollow pagerank sculpting and he pointed out that the results for that approach may be different now as the search engines adjust their algorithms to compensate for overuse of the approach. Matt said that you can use nofollow on links to pages such as “register” but that you shouldn’t put too much time into nofollowing internal links. It’s pretty clear from this conversation that Matt is telling us that this sort of activity just doesn’t make as big a difference as it once could have. The feeling from us in the audience is that the algorithm is smart enough to understand this type of webmaster behavior and that you might not necessarily get the result you’re looking for.

Matt did recommend that the canonical tag is a good way to go with duplication problems associated with some of the scenarios covered by the panel. A great session overall, where the search engine guys really played down the value of a rel=”nofollow” tag.

Comments

  1. Optimizacija

    The canonical tag works fine with pages where (almost) whole content is duplicated. But I have problems with pages where only paragraph is duplicated content from several other pages. I think this is not considered as duplicated content (yet), but it should be.