As SEO evolves, it becomes more and more human-centered, and people in the industry begin to question old techniques that are targeted mostly at search engines. Such is the case with sitemaps, which have been around for ages.
Do sitemaps still matter for SEO in 2019 or are they just a waste of time? When and why should you use them? And how can you optimize them for maximum SEO results?
In this article, you’ll find out the answers to all those questions, so keep reading!
- What Are Sitemaps & How Do They Work?
- Are Sitemaps Important for SEO?
- How to Add a Sitemap to Your Website
- How to Optimize Your Sitemaps for SEO
What Are Sitemaps & How Do They Work?
Sitemaps are files used to tell search engines about pages that are available for crawling on a website. These files are simply lists of URLs with some extra information about each page, such as when it was last updated.
There are two sitemap formats, and sitemaps can also be split into multiple categories, depending on their purpose.
First, we have HTML sitemaps and XML sitemaps.
HTML sitemaps are basically just web pages containing anchor (<a>) tags whose href attributes link to other pages. They are useful for users who are looking for something, but also for search engines. Crawlers discover your website by ‘clicking’ from link to link until there are no more new links to be found.
If all your website’s links are in a sitemap, be it HTML or XML, search engines will find those pages more easily.
To a user, an HTML sitemap looks like a simple list of links to the site’s pages.
The HTML code behind it is pretty simple as well:
<ul>
<li><a href="/product-1">Product 1</a></li>
<li><a href="/product-2">Product 2</a></li>
</ul>
Of course, you’re free to add to the HTML sitemap page any CSS you want to style it, as well as a navigation menu, structure or titles.
A good sitemap example in HTML format can be found on the Disney Store. As you can see, all the important categories are listed there and you can basically browse the entire site from that page alone.
However, the standardized format for distributing sitemaps is XML. Search engines use these files to read more information about each page, such as when it was last modified, directly from the sitemap file.
You can also submit XML sitemaps to tools such as the Search Console (formerly Google Webmaster Tools), where Google will validate them and check them from time to time.
An XML sitemap is a little more difficult to write, as it contains metadata about each page in a standardized format. It relies on special tags defined by the sitemap protocol, such as <loc> for the page URL and <lastmod> for the date the page was last modified.
Note that the URLs in the XML sitemaps are absolute. This means that you can’t just add /your-page but you must add https://yoursite.com/your-page instead.
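The XML format described above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library, not the output of any particular plugin: the function name build_sitemap and the sample URL are assumptions made for the example, while the <urlset>, <url>, <loc> and <lastmod> tags and the namespace come from the sitemap protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (absolute_url, last_modified) pairs.

    build_sitemap is a hypothetical helper written for this example.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        # Relative paths such as /your-page are not valid in a sitemap.
        if not loc.startswith(("http://", "https://")):
            raise ValueError(f"Sitemap URLs must be absolute: {loc!r}")
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod  # e.g. "2019-05-01"
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([("https://yoursite.com/your-page", "2019-05-01")]))
```

Passing a relative path such as /your-page raises an error here on purpose, which mirrors the rule that sitemap URLs must be absolute.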
If we go back to our Disney example, we can see that the site also has an XML sitemap, targeted at search engines.
You can go one step further and show XML sitemaps only to search engines. You can differentiate via the user agent and show an HTML sitemap instead if a real person visits the page.
Yoast SEO achieves a similar result with an XSL stylesheet. Visiting a /sitemap_index.xml file on a WordPress website in a browser will show a styled, human-readable page, while hitting CTRL + U to view the source will show the actual XML sitemap.
As previously mentioned, sitemaps can be split into categories, depending on their purpose.
Normal sitemaps: These are by far the most common sitemaps. Pretty much every website out there tends to have one. That’s because most platforms include some sort of sitemap generation system by default.
They are delivered in XML format and can usually be found at the relative path /sitemap.xml.
Most WordPress websites have their sitemap at /sitemap_index.xml. That is the default URL for sitemaps generated by the Yoast SEO plugin.
Yoast SEO delivers the sitemap as an index on purpose. XML files can only become so large before it becomes unreliable for crawlers to download and read them.
The sitemap protocol sets a limit of 50,000 URLs and 50MB (uncompressed) per XML file. Google used to accept only 10MB, but it now applies the same 50MB limit, so make sure each file stays under both 50,000 URLs and 50MB.
If you have a particularly large website, you can break your sitemap into multiple smaller sitemaps and use a sitemap index file to manage them.
Search engines will know how to crawl these as long as you provide the right format for your index file.
Then, on each XML sitemap, you can use the regular format mentioned above.
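As a rough sketch of the splitting described above, the Python below cuts a long list of URLs into chunks of at most 50,000 and builds an index file that points to one sitemap per chunk. The helper names and the sitemap-1.xml naming scheme are assumptions for illustration only; the <sitemapindex>, <sitemap> and <loc> tags are the ones the sitemap protocol defines for index files.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit from the sitemap protocol


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into lists of at most `size` entries."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Build the XML of an index file listing each child sitemap's URL."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    all_urls = [f"https://yoursite.com/page-{n}" for n in range(120_001)]
    chunks = chunk_urls(all_urls)
    # Hypothetical naming scheme: one child sitemap file per chunk.
    children = [
        f"https://yoursite.com/sitemap-{n}.xml"
        for n in range(1, len(chunks) + 1)
    ]
    print(build_sitemap_index(children))
```

Each chunk would then be written out as a regular sitemap file. This sketch only checks the URL count, so the file size limit would still need a separate check.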
Image sitemaps: Normally, images can be added to a regular XML sitemap. However, if you have a lot of them, it might be a good idea to create a separate XML file only for your images.
More information on how to properly add images to your sitemaps can be found here.
Video sitemaps: You can also add videos to your sitemaps. Similarly to images, videos are listed in the sitemap in relation to the page / URL they appear on.
If you only have a few pages that contain video, just add that information in the normal sitemap. However, if you have an entire section of your website full of videos, then you might consider splitting them into a separate sitemap.
More info on how to properly include videos in your sitemaps can be found here.
News sitemaps: If you run a news website, you can create a dedicated news sitemap for your articles. Since Google has a News section, it can really come in handy when quick indexing is a requirement.
More details about how to properly create a news sitemap can be found here.
Last but not least, sitemaps can be static or dynamic. I see little purpose in having a static sitemap though, as it would have to be updated by hand every time a new page is added to the website.
If the goal of the sitemap is to let search engines know about new pages, then it should be updated as soon as the pages are published.
This means that you need a dynamic sitemap in order for it to be effective. Keep reading and you’ll learn how to generate a dynamic sitemap for your website.
Are Sitemaps Important for SEO?
First, let’s hear what Matt Cutts has to say about this:
But hold your horses, as you’ll want to prioritize your tasks! There are other much bigger technical SEO issues you should fix before adding a sitemap.
For example, do you have duplicate content on your website? If yes, then you should fix that first. Why? Because Google doesn’t like duplicate content and, by creating and submitting a sitemap, you’re showing it directly to Google.
Sitemaps are not required for search engines to effectively crawl your website.
However, they can come in handy in particular cases. Since Google lets you submit and monitor them in the Search Console, it is safe to assume that it pays them some attention.
A sitemap will be most useful in the following scenarios:
You have a big website: Anything from eCommerce to big informational websites or news outlets fits here. If your site has a lot of pages, it will quickly burn through your crawl budget. A sitemap won’t help with the crawl budget, but it can help get some deeper pages indexed faster.
A big website might also mean you make frequent updates. Maybe you post new products a lot and remove old ones. Maybe you are a news outlet. Having your XML sitemap set up properly can ensure the most important pages on your site are crawled and indexed.
Your site has a bad internal linking strategy: If you don’t regularly link between your pages, some of them might be hard for search engines to crawl. A sitemap could help here. But again… a missing internal linking structure is a far greater technical SEO issue than a missing sitemap. Search engines focus on crawling your website naturally first. Even if Google does discover a page through your sitemap, without any links pointing to it no PageRank will flow to it, so it will be considered unimportant.
Your site is new and/or has very few backlinks: Since websites are discovered from link to link, it is essential that other websites link to your site to signal its existence. If no website links to your new blog posts, a sitemap can help search engines quickly discover new pages on your site.
Sitemaps can also be used to hide pages from users while still letting search engines crawl them. I can’t really think of a good example for this… But let’s say you have a product landing page you want to show to search engines with a discount as an incentive to click, but you want to keep it hidden from users who came to your site through direct traffic.
Of course, they would be visible if the user visits the actual sitemap file.
There are things though that don’t matter in sitemaps anymore. For example, change frequency and priority don’t matter. At least that’s what John Mueller says:
We ignore priority in sitemaps.
— 🍌 John 🍌 (@JohnMu) August 17, 2017
In any case, adding a sitemap to your website will not do any harm. But the truth is you might not REALLY need one, or you might have other priorities which can bring bigger SEO benefits.
Or, as Google puts it:
So… if your SEO is perfect and you don’t have anything better to do… let’s add a sitemap to your website.
How to Add a Sitemap to Your Website
First, check if you already have a sitemap! As previously mentioned, it probably lurks somewhere under /sitemap.xml or /sitemap_index.xml. The files could also have the .html extension so check them as well.
If you don’t have a sitemap, you can always create one. The difficulty really depends on what type of platform your website is built on.
How to Add a Sitemap on a Custom Made Website
If it’s a custom made website, adding a sitemap might require your developers to intervene.
Of course, you can just generate a static XML sitemap and upload it to your server. You could even write one yourself, but that would take forever! However, a static sitemap means you would have to regenerate it every time you add a new page to your website in order for it to be effective.
For a dynamic sitemap, the developers would have to write some code. The process is pretty straightforward: as a new page/entry is added to the database, an XML file must be updated with the required information.
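To make the idea concrete, here is a minimal Python sketch of that process, assuming the site’s pages are kept in a simple dictionary that stands in for the database. The names publish_page and write_sitemap are hypothetical; on a real site, this logic would hook into whatever code saves a new page.

```python
import datetime
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def write_sitemap(pages, path):
    """Rewrite the sitemap file from a {url: date} mapping."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in sorted(pages.items()):
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod.isoformat()
    ET.ElementTree(urlset).write(path, encoding="utf-8", xml_declaration=True)


def publish_page(pages, url, path, published=None):
    """Record a new page, then regenerate the sitemap right away.

    `pages` is a plain dict standing in for the site's database.
    """
    pages[url] = published or datetime.date.today()
    write_sitemap(pages, path)


if __name__ == "__main__":
    site_pages = {}
    publish_page(site_pages, "https://yoursite.com/new-post", "sitemap.xml")
```

Regenerating the whole file on every publish is the simplest approach. A large site would more likely update only the affected sitemap file or build the XML on request.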
If you want to go for the free option, you can generate a sitemap with https://www.xml-sitemaps.com/. The tool will simply crawl your website and structure the information it finds about the URLs in an XML file, which you’ll then be able to download to your computer and upload to the public_html folder on your server.
However, this has its flaws. Firstly, it will be static, which means you’ll have to keep regenerating and reuploading it. Secondly, since the tool crawls your site like any search engine would, if your internal linking is bad, the tool won’t find deep pages and thus won’t add them to the XML file.
Luckily, https://www.xml-sitemaps.com/ also provides paid versions, including one that will dynamically add your pages to your sitemap. However, the best option is probably the PHP version, which you can plug into your website and run directly on your server.
Depending on your needs, the solutions above are the best ones for a custom website.
How to Add a Sitemap to a Popular CMS, Such as WordPress
If you’re on a popular platform such as WordPress, you’re in luck! You can solve your issue by installing a Sitemap plugin. On WordPress, the most popular one is Yoast SEO, which generates a search engine optimized sitemap on its own. You won’t have any struggles with it.
Similar plugins/extensions/modules can be found for other platforms, such as Drupal, Joomla or Magento. Simply perform a Google search for “sitemap plugin + your platform” and you’ll find out if something is available.
How to Optimize Your Sitemaps for SEO
Now that you have a sitemap, it’s time to make sure it’s beneficial for SEO. While Google says a sitemap will never get you in trouble, it actually can if you do it the wrong way. For example, you might highlight some duplicate content pages, which we know cause at least some issues.
A good way to check your sitemap for errors is the Google Search Console (formerly Webmaster Tools). However, before you submit your sitemap to Google, you might want to check it with a proper set of SEO tools. Once you submit it to Google, it will have an impact on your site, be it positive or negative.
The CognitiveSEO Site Audit comes in handy here, as it can help you solve most issues related to sitemaps. Once you set up your campaign and run the Site Audit, the tool will crawl your entire site and analyze it for errors.
The tool will first highlight the discrepancy between the pages found (which the tool was able to crawl, just as Google would) and the pages listed in the sitemap. You might have such a discrepancy on purpose, if you want to exclude certain pages. That’s why it’s a warning in the tool, colored yellow instead of red, like actual errors are.
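The kind of comparison described above boils down to a set difference between crawled URLs and sitemap URLs. The Python sketch below illustrates the idea only. It is not how the CognitiveSEO tool is implemented, and the function names are assumptions.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace mapping used when searching the parsed sitemap.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> values found in a sitemap's XML text."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)}


def compare(crawled, listed):
    """Report URLs that appear on only one side of the comparison."""
    return {
        "crawled_but_not_in_sitemap": sorted(crawled - listed),
        "in_sitemap_but_not_crawled": sorted(listed - crawled),
    }


if __name__ == "__main__":
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://yoursite.com/</loc></url>"
        "<url><loc>https://yoursite.com/orphan</loc></url>"
        "</urlset>"
    )
    crawled = {"https://yoursite.com/", "https://yoursite.com/blog"}
    print(compare(crawled, sitemap_urls(sample)))
```

URLs that are in the sitemap but were never reached by crawling are often pages with no internal links pointing to them, which ties back to the internal linking problem discussed earlier.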
Secondly, you want to fix any duplicate and thin content issues on your website. You don’t want to include those in the Sitemap.
You can do this easily with the Site Audit. You can find what you’re looking for under the Content section:
So make sure you exclude those pages from the Sitemap. Even better, you can fix your issues by canonicalizing the duplicates and adding content to your thin pages.
Although we ‘know’ from John Mueller that the priority tag is optional and doesn’t matter to Google, we can assume that search engines read files from top to bottom, which would make the information at the top a priority, since it is read first.
Most sitemaps are structured alphabetically or chronologically because that is simple, but nobody says you can’t structure yours the way you want. Consider adding the most important pages first.
A sitemap should be structured hierarchically, similarly to an eCommerce site’s internal linking structure. However, it’s better to focus on the site structure and internal linking strategy than on the sitemap.
Search engines prefer to crawl your website in a “human way”, which means they will go from link to link on your website until they find all your pages.
If it takes users 1 hour and 100 clicks to get to your important pages and find out what they seek, there’s a big chance Sitemaps won’t fix your problem.
Other XML Sitemap best practices that benefit SEO:
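- Consider adding hreflang annotations to your sitemaps. Each URL entry can list its language and regional alternates with elements such as <xhtml:link rel="alternate" hreflang="de" href="https://yoursite.com/de/your-page"/>.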
This will help you get your international pages indexed better as well.
- You can also exclude any unimportant pages which you don’t necessarily want indexed, such as pages with thin content or archive pages and paginated content.
- Make sure to exclude any pages that Google is blocked from crawling or indexing, such as pages blocked in the robots.txt file or by a noindex meta tag. It wouldn’t be nice to invite someone to your house and then not open the door!
- That also goes for 404 pages, canonicalized pages (whose canonical tag points to another URL) or pages that redirect to other pages via 301.
- Last but not least, since they are also duplicate content, make sure you don’t add URLs with parameters or anchors to your sitemap (such as comment or social media tracking IDs), unless they are unique URLs with original content.
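The exclusion rules in the list above can be expressed as a small filter. The Python sketch below is only one possible way to encode them: the PageInfo fields are hypothetical and would have to be filled in by your own crawl data, and it treats every URL with parameters as a duplicate, so unique parameterized pages would need an exception.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class PageInfo:
    """Hypothetical crawl record for one URL."""
    url: str
    status: int = 200                 # HTTP status code returned by the page
    noindex: bool = False             # page carries a noindex meta tag
    blocked_by_robots: bool = False   # disallowed in robots.txt
    canonical: Optional[str] = None   # canonical URL, if one is declared


def belongs_in_sitemap(page):
    """Apply the exclusion rules from the list above to one page."""
    parts = urlsplit(page.url)
    if parts.query or parts.fragment:
        return False  # parameters or anchors: assumed to be duplicates
    if page.status != 200:
        return False  # 404s, 301 redirects and other non-OK responses
    if page.noindex or page.blocked_by_robots:
        return False  # Google is not allowed to crawl or index the page
    if page.canonical is not None and page.canonical != page.url:
        return False  # canonicalized to a different URL
    return True


if __name__ == "__main__":
    pages = [
        PageInfo("https://yoursite.com/product-1"),
        PageInfo("https://yoursite.com/product-1?utm_source=facebook"),
        PageInfo("https://yoursite.com/old-page", status=301),
    ]
    print([p.url for p in pages if belongs_in_sitemap(p)])
```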
Once you finish creating and optimizing your sitemap, you can finally add it to the Google Search Console and validate it. There you can also view any previously submitted sitemaps.
The Search Console will let you know if there are any issues, such as duplicate content ones. Luckily, you’ve already fixed those with the CognitiveSEO Tool.
If your website doesn’t get crawled properly, a sitemap should definitely help, provided that all the other possible causes, such as noindex tags, have already been ruled out. The sitemaps submitted here will ask Google to crawl your site, but it’s still up to Google whether it will do so or not.
Now you know when it’s important to have a sitemap and how you can properly set one up on your website.
What’s your experience with sitemaps? Have they ever helped you rank better? Have you ever fixed any sitemap-related issues and ranked better afterwards? Let us know in the comments section. Also, consider joining our Social Media group on Facebook where you can get more insights on Search Engine Optimization and Digital Marketing.