Navigation on small, static websites is relatively easy.
Larger eCommerce websites often need facets and filters, which can prove a bit of a nightmare because of the trade-off they force between user experience and SEO.
Ideally all unique content would have its own unique URL, with unique H1 tags and meta titles.
However, filtering products by price range, colour, style and so on can create millions of URLs on large eCommerce sites.
An additional problem is duplicate content: if a user sorts a page by price, for example, the products on the page(s) remain the same, just in a different order.
Potential Issues with Faceted & Filtered Navigation
- Faceted navigation can produce too many URLs. Googlebot and other search engine spiders may try to crawl all of them, even if there are millions, and then leave prematurely because your ‘crawl budget’ has been used up.
- According to Moz, noindex and nofollow tags won’t stop search engine bots from crawling links (and exhausting crawl budget). Equally, canonical tags are treated as ‘hints’ by search engines, and robots will still crawl a page whose canonical tag points elsewhere.
- It can be hard to define which pages are ‘useful’ and should be indexed (for example, products filtered by brand) and which are ‘not useful’ and should not be (for example, highly specific pages such as products sorted by price, style, size and colour). The latter can also cause duplicate content issues if indexed.
- Useful pages with high search volume may be missing the unique headers and meta information required for optimal SEO.
- URL ordering problems. Filtering by price and then colour can produce a different URL from filtering by colour and then price, even though the two pages show the same content (albeit possibly in a different order).
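
To illustrate the URL ordering problem, here is a minimal Python sketch of one way to collapse equivalent filter URLs into a single form. The parameter names (`sort`, `order`, `colour`, `price`) and the example.com URLs are assumptions for illustration only; your platform will use its own.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical names for parameters that only reorder the same products.
SORT_ONLY_PARAMS = {"sort", "order"}


def normalise_filter_url(url):
    """Return one stable URL for any ordering of the same filters."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Drop parameters that change the order of products but not the set.
    kept = [(key, value) for key, value in pairs if key not in SORT_ONLY_PARAMS]
    # Sort what is left so colour+price and price+colour become identical.
    kept.sort()
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


a = normalise_filter_url("https://example.com/shoes?price=0-50&colour=red&sort=price_asc")
b = normalise_filter_url("https://example.com/shoes?colour=red&price=0-50")
print(a)
print(a == b)
```

A normalised URL like this could be used when generating internal links or canonical tags, so that every ordering of the same filters points at one address.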
Possible Solutions to The 99 Filter-Problems
In an ideal world, a developer would build ‘from the ground up’ with HTML so that robots can still crawl all the important category pages. Ajax can then be used to let the user filter and sort pages dynamically (client-side), and to prevent robots from crawling every page under the proverbial sun. Ajax filters do not create a unique URL for each filter option, saving ‘budget’ like Martin Lewis at a car boot sale. Ajax can handle sorting, filtering and, to some extent, pagination. That said, any pages of high value in terms of SEO and search volume may still warrant their own unique URL and content.
Selective Robots.txt File - If a developer is able to define a URL parameter that is added on filtered and/or sorted pages, we can use robots.txt to ensure that these pages are not crawled.
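
As a rough sketch of the robots.txt approach, the Python below generates Disallow rules for a list of parameter names and checks which paths they would match. It assumes crawlers that honour `*` wildcards in robots.txt (Google documents this behaviour), and the parameter name `sort` is hypothetical. The matcher is a simplification: it ignores Allow rules and user-agent groups.

```python
import re


def build_robots_txt(params):
    """Build Disallow rules that match a parameter anywhere in the query string."""
    lines = ["User-agent: *"]
    for name in params:
        lines.append(f"Disallow: /*?{name}=")  # parameter is first in the query string
        lines.append(f"Disallow: /*&{name}=")  # parameter follows another parameter
    return "\n".join(lines) + "\n"


def rule_to_regex(pattern):
    """Translate a wildcard rule (* = any characters, trailing $ = end of URL)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(path_and_query, robots_txt):
    """Rough check: does any Disallow rule match this path and query?"""
    prefix = "Disallow: "
    for line in robots_txt.splitlines():
        if line.startswith(prefix) and rule_to_regex(line[len(prefix):]).match(path_and_query):
            return True
    return False


robots = build_robots_txt(["sort"])  # "sort" is a hypothetical parameter name
print(robots)
print(is_blocked("/shoes?colour=red&sort=price_asc", robots))
print(is_blocked("/shoes?colour=red", robots))
```

Two rules per parameter are used, rather than a single `*sort=`, so that an unrelated parameter such as `resort=` is not blocked by accident.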
Use the Hashtag - Search engine robots only look at the part of the URL before the hash (#), so if filtering parameters are added after a hash in the URL, they won’t cause duplicate content problems. Alternatively, if you use GET variables (after the question mark in a URL), you can use the URL Parameters Tool in Google Webmaster Tools (under the ‘Crawl’ section) to configure URLs and dictate which ones should not be crawled. For example, parameters that sort pages should not be crawled, as they produce the same content as unsorted pages, just in a different order.
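
The hash approach works because the part of a URL after `#` (the fragment) is never sent to the server in a request. The short Python sketch below shows that a filtered URL and the plain category URL reduce to the same address once the fragment is removed. The example.com URLs and the parameter names are made up for illustration.

```python
from urllib.parse import urldefrag


def crawlable_url(url):
    """Return the part of the URL that is requested from the server (everything before '#')."""
    return urldefrag(url).url


plain = crawlable_url("https://example.com/shoes")
filtered = crawlable_url("https://example.com/shoes#colour=red&sort=price_asc")
print(filtered)
print(plain == filtered)
```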
Checklist for eCommerce sites with Filters & Facets
- Ensure that key-value pairs are connected with the equals sign (=), and that multiple parameters are joined with an ampersand (&).
- Ensure that values which don’t change page content, such as session IDs and other user-generated values, are not used directly in the URL path. Place them at the end of the URL as a parameter (ideally after a hash), or consider placing user-generated values in a separate directory and disallowing that folder in robots.txt.
- Ensure that important category pages and other valuable pages on your site have their own unique URLs, headers and meta information.
- Ensure that your product descriptions are superior to your competitors’. This is easy to check and effective when implemented, but potentially very time-consuming.
- Check internal duplicate content levels by using siteliner.com. Then check for external duplicate content.
- Ensure there is an ‘HTML backup’ for scripted filters.
- Test and retest to ensure navigation can be used easily by visitors and crawled efficiently by robots.
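
The URL-format items in the checklist above can be checked automatically. The following Python sketch audits a single URL for malformed key-value pairs and misplaced session values. The session parameter names (`sessionid`, `sid`) and the example URLs are assumptions; adjust them to match your own site.

```python
from urllib.parse import urlsplit

# Hypothetical names for parameters that never change page content.
SESSION_PARAMS = {"sessionid", "sid"}


def audit_url(url):
    """Return a list of checklist problems found in a single URL."""
    problems = []
    parts = urlsplit(url)
    path = parts.path.lower()
    # Session values should not be embedded in the path itself.
    if any(f"{name}=" in path for name in SESSION_PARAMS):
        problems.append("session value in URL path")
    pairs = [pair for pair in parts.query.split("&") if pair]
    for index, pair in enumerate(pairs):
        key, sep, _value = pair.partition("=")
        if not key or not sep:
            # Each parameter should be written as key=value.
            problems.append(f"malformed pair: {pair}")
        elif key.lower() in SESSION_PARAMS and index != len(pairs) - 1:
            # Non-content parameters belong at the end of the URL.
            problems.append(f"session parameter not last: {key}")
    return problems


print(audit_url("https://example.com/shoes?colour=red&price=0-50"))
print(audit_url("https://example.com/shoes;sessionid=abc?colour:red&sid=1&size=9"))
```

Running a check like this over a crawl export is one way to ‘test and retest’ without inspecting every URL by hand.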
SEO is based on three pillars: crawler access, keyword relevance, and authority. Filters and facets affect the first two of these pillars, access and relevance.