XML Sitemap Best Practices
- Wednesday, January 06, 2010
Many websites have an HTML sitemap for human visitors: a page that lists all the pages on the site and helps a visitor find what he or she is looking for. In the same way, XML sitemaps are created for search engine spiders rather than for human visitors.
An XML sitemap helps search engine crawlers crawl the website and keep their results up to date. XML sitemaps can be submitted to the Google, Bing and Yahoo! search engines.
An XML sitemap does not guarantee that every link will be crawled, and being crawled does not guarantee indexing. However, an XML sitemap is still the best insurance for getting a search engine to learn about your entire website.
Errors are not tolerated in an XML sitemap, so the syntax must be exact. Validate the XML sitemap every time before it goes live.
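As a rough illustration of that pre-launch check, here is a minimal Python sketch that confirms a sitemap is well-formed XML, has a <urlset> root in the sitemaps.org namespace, and gives every <url> a <loc>. The function name check_sitemap is hypothetical, and this is not a full schema validation.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def check_sitemap(xml_text):
    """Return a list of problems; an empty list means these basic checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Any syntax error makes the whole file unusable, so stop here.
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        problems.append("root element must be <urlset> in the sitemap namespace")
    for index, url in enumerate(root.findall(f"{{{SITEMAP_NS}}}url"), start=1):
        loc = url.find(f"{{{SITEMAP_NS}}}loc")
        if loc is None or not (loc.text or "").strip():
            problems.append(f"<url> entry {index} has no <loc>")
    return problems
```

An empty result only means these structural checks passed; a schema validator is still the stricter test.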
How to create an XML Sitemap
The Sitemap must:
- Begin with an opening <urlset> tag and end with a closing </urlset> tag.
- Specify the namespace (protocol standard) within the <urlset> tag.
- Include a <url> entry for each URL, as a parent XML tag.
- Include a <loc> child entry for each <url> parent tag.
All other tags are optional. For example: <lastmod>, <changefreq>, <priority>
<urlset>: Encapsulates the file and references the current protocol standard.
<url>: Parent tag for each URL entry. The remaining tags are children of this tag.
<loc>: URL of the page. This URL must begin with the protocol (such as http or https).
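<lastmod>: The date of last modification of the file. This date should be in W3C Datetime format, which allows the time portion to be omitted, leaving YYYY-MM-DD.
<changefreq>: How frequently the page is likely to change. Valid values are: always, hourly, daily, weekly, monthly, yearly, never.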
<priority>: The priority of this URL relative to other URLs on your site. The default priority of a page is 0.5. Valid values range from 0.0 to 1.0. Please note that the priority you assign to a page is not likely to influence the position of your URLs in a search engine's result pages.
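To make the rules for the optional tags concrete, the sketch below formats a date as YYYY-MM-DD and rejects values outside the allowed ranges. The helper name optional_tags and its defaults are assumptions made for illustration; the changefreq values are the ones listed in the sitemaps.org protocol.

```python
from datetime import date

# Values allowed for <changefreq> by the sitemaps.org protocol.
VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}

def optional_tags(last_modified, changefreq="monthly", priority=0.5):
    """Return the optional tag values as strings ready to be written into a <url> entry."""
    if changefreq not in VALID_CHANGEFREQ:
        raise ValueError(f"invalid changefreq: {changefreq!r}")
    if not 0.0 <= priority <= 1.0:
        raise ValueError(f"priority must be between 0.0 and 1.0, got {priority}")
    return {
        "lastmod": last_modified.isoformat(),  # date objects format as YYYY-MM-DD
        "changefreq": changefreq,
        "priority": f"{priority:.1f}",
    }

tags = optional_tags(date(2010, 12, 31), changefreq="monthly", priority=0.7)
```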
Below is a sample XML sitemap:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2010-12-31</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.7</priority>
  </url>
</urlset>
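If you would rather generate the file than write it by hand, a sitemap like the sample can be produced with Python's standard library. This is only a sketch: build_sitemap and the shape of its entries argument are hypothetical, chosen to mirror the tags described above.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Build sitemap bytes from dicts holding 'loc' plus any optional tags."""
    # The namespace is written as a plain xmlns attribute on <urlset>.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit the child tags in the same order as the sample.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    # xml_declaration requires Python 3.8 or later; the result is UTF-8 bytes.
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)

sitemap_bytes = build_sitemap([
    {
        "loc": "http://www.example.com/",
        "lastmod": "2010-12-31",
        "changefreq": "monthly",
        "priority": "0.7",
    }
])
```

ElementTree escapes characters such as & in the text it writes, so values should be passed in unescaped.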
Your Sitemap file must be UTF-8 encoded. As with all XML files, any data values (including URLs) must use entity escape codes for the characters &, ', ", > and <.
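For anyone assembling the XML as plain strings, entity escaping can be sketched with Python's xml.sax.saxutils.escape, which handles &, < and > by default and takes a dictionary for the two quote characters. The wrapper name escape_loc is hypothetical.

```python
from xml.sax.saxutils import escape

# escape() covers &, < and > on its own; quotes are supplied as extra entities.
EXTRA_ENTITIES = {"'": "&apos;", '"': "&quot;"}

def escape_loc(url):
    """Entity-escape a URL so it can be placed inside a <loc> tag."""
    return escape(url, EXTRA_ENTITIES)

escaped = escape_loc("http://www.example.com/view?id=1&page=2")
# escaped is "http://www.example.com/view?id=1&amp;page=2"
```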
Once your XML sitemap is ready, upload it to your website. I strongly recommend placing it in the root directory. For example, if your website is 'http://www.example.com/' then your XML sitemap file should be at 'http://www.example.com/sitemap.xml'.
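The recommended location can be derived from any page address on the site. A small sketch, assuming the file is named sitemap.xml as in the example (the function name is hypothetical):

```python
from urllib.parse import urljoin

def root_sitemap_url(any_page_url):
    """Return the root-level sitemap.xml address for the site hosting the given page."""
    # A path starting with "/" resolves against the site root, not the page's folder.
    return urljoin(any_page_url, "/sitemap.xml")

location = root_sitemap_url("http://www.example.com/")
# location is "http://www.example.com/sitemap.xml"
```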
Dos and Don'ts of XML Sitemaps
- Your Sitemap file must be UTF-8 encoded.
- URLs in your XML sitemap must be URL-escaped and entity-encoded.
- Validate your XML sitemap before going live.
- Upload your XML sitemap to the root folder of your website.
- Specify the location of your XML sitemap in the robots.txt file:
... Sitemap: http://www.example.com/sitemap.xml
- An XML sitemap file must not contain more than 50,000 URLs or be larger than 10MB (the sitemaps.org protocol has since raised the size limit to 50MB uncompressed).
- Specify URLs from the same domain and same protocol.
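Several items in this checklist can be checked mechanically. The sketch below tests the URL count, the file size, and whether every URL shares the sitemap's protocol and domain, and it builds the robots.txt line. The function names are hypothetical, and the size constant uses the 10MB figure from the list, so adjust it if you work to a different limit.

```python
from urllib.parse import urlsplit

MAX_URLS = 50_000
MAX_BYTES = 10 * 1024 * 1024  # 10MB, the figure quoted in the list above

def check_limits(sitemap_url, page_urls, sitemap_bytes):
    """Return a list of checklist violations; an empty list means none were found."""
    problems = []
    if len(page_urls) > MAX_URLS:
        problems.append(f"too many URLs: {len(page_urls)}")
    if len(sitemap_bytes) > MAX_BYTES:
        problems.append(f"file too large: {len(sitemap_bytes)} bytes")
    site = urlsplit(sitemap_url)
    for url in page_urls:
        parts = urlsplit(url)
        # Both the protocol (scheme) and the domain (netloc) must match the sitemap's.
        if (parts.scheme, parts.netloc) != (site.scheme, site.netloc):
            problems.append(f"different protocol or domain: {url}")
    return problems

def robots_txt_line(sitemap_url):
    """Return the line that announces the sitemap in robots.txt."""
    return f"Sitemap: {sitemap_url}"
```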