- Tuesday, December 22, 2009
Canonicalization is the process of picking the best URL when there are several choices, and it usually refers to home pages. The canonical URL tag goes in the HTML head of a webpage, inside the <head>...</head> section.
<link rel="canonical" href="http://www.example.com/" />
This tag tells the major search engine bots (Google, Bing & Yahoo!) that the page in question should be treated as a copy of the URL http://www.example.com/, and that all of the link and content metrics the engines apply should flow back to that URL.
Canonicalization helps to solve issues like www vs. non-www versions, redirects, duplicate URLs, hijacking, etc. Avoiding duplicates in the search engine index has consistently been a key concern for all SEOs, webmasters and site owners. When you use the <link> tag, you can indicate the canonical URL form for crawlers to use for each page of content, no matter how it was retrieved.
This puts the preferred URL form with the content so that it is always available to the crawler, no matter which session ID, link parameter, sort parameter, parameter order, or other source of variance is present in the URL form used to access the page. It is most useful when you have completely identical content reachable at different URLs because of things such as tracking parameters or a session ID.
Example:
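As a hypothetical illustration, the Python sketch below reduces URL variants of the same page to one preferred form and builds the matching <link> element. The parameter names in IGNORED_PARAMS, the example URLs, and the helper names canonical_url and canonical_link_tag are assumptions made for this sketch; a real site would need its own list of parameters that do not change the content.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names of parameters that change the URL but not the content.
IGNORED_PARAMS = {"sessionid", "trackingid"}


def canonical_url(url: str) -> str:
    """Return one preferred form for URLs that serve the same content."""
    parts = urlsplit(url)
    # Drop tracking/session parameters and sort the rest, so parameter
    # order no longer produces distinct URLs.
    params = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    # Lowercase scheme and host, use "/" for an empty path, drop the fragment.
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path or "/",
        urlencode(params),
        "",
    ))


def canonical_link_tag(url: str) -> str:
    """Build the <link> element to place inside the <head> section."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'
```

With these assumptions, http://www.example.com/list.php?sort=price&category=candy&sessionid=abc123 and http://www.example.com/list.php?category=candy&sort=price both reduce to the second URL.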
This tag allows you to publicly specify your preferred version of a URL. This format provides you with more control over the URL returned in search results. It also helps to make sure that properties such as link popularity are consolidated to your preferred version.
So you can simply add this <link> tag to specify your preferred version (inside the <head> section):
<link rel="canonical" href="http://www.example.com/" />
And all the major search engine bots (Google, Bing & Yahoo!) will understand that the duplicates all refer to the canonical URL: http://www.example.com/.
Additional URL properties, like PageRank and related signals, are transferred as well (this is the part I like best).
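To get a feel for how a crawler might pick up the tag, here is a rough Python sketch that reads the canonical URL out of a page's <head> using only the standard library. The names CanonicalFinder and find_canonical are made up for illustration, and this is not how any particular search engine implements it.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> inside <head>."""

    def __init__(self) -> None:
        super().__init__()
        self.in_head = False
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head and self.canonical is None:
            attributes = dict(attrs)
            # rel may hold several space-separated values.
            rel_values = (attributes.get("rel") or "").lower().split()
            if "canonical" in rel_values and attributes.get("href"):
                self.canonical = attributes["href"]

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonical(html_text: str) -> Optional[str]:
    """Return the canonical URL declared in the page, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    finder.close()
    return finder.canonical
```

A page fetched from any duplicate URL that carries the tag would yield http://www.example.com/ here, which is the URL the duplicates would then be grouped under.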
Best Practices for the Canonical URL Tag:
Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. Blocking crawler access to duplicate pages (for example with robots.txt) is not a good fix: if search engines can't crawl pages with duplicate content, they can't automatically detect that these URLs point to the same content and will therefore effectively have to treat them as separate, unique pages.
A better solution is to allow search engines to crawl these URLs, but mark them as duplicates by using the <link rel="canonical" href="http://www.example.com/" /> link element, the URL parameter handling tool, or 301 redirects.
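For the www vs. non-www case, the 301 redirect decision can be sketched in Python as below. It assumes http://www.example.com/ is the preferred form; PREFERRED_HOST, BARE_HOST and redirect_target are hypothetical names, and in practice this kind of redirect is usually configured in the web server rather than written in application code.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "www.example.com"  # assumption: the www form is preferred
BARE_HOST = "example.com"           # the duplicate, non-www form


def redirect_target(url: str) -> Optional[str]:
    """Return the URL to answer with in a 301 Location header,
    or None when no redirect is needed."""
    parts = urlsplit(url)
    if parts.netloc.lower() != BARE_HOST:
        # Already on the preferred host, or a host this sketch does not handle.
        return None
    # Keep path and query as they are; only the host changes.
    return urlunsplit(
        (parts.scheme, PREFERRED_HOST, parts.path or "/", parts.query, "")
    )
```

A request for http://example.com/page?x=1 would be answered with status 301 and a Location of http://www.example.com/page?x=1, while requests already on the www host are served normally.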
Joydeep Deb is a Senior Digital Marketer and Project Manager with strong experience in Digital Marketing, Lead Generation, Online Brand Management, Marketing Campaigns, Project Management, Search Engine Optimization (SEO), Search Engine Marketing (SEM), PPC, eMail Marketing, Web Analytics, Web Technologies, Web Design and Development.
He holds an MBA in Marketing, is an IIM Calcutta alumnus, and lives in Bangalore, Karnataka, India.