John Mueller, a Webmaster Trends Analyst at Google, issued a warning on Twitter about the growing number of duplicate content issues he is seeing. In the same message, he also clarified which kinds of content do not count as duplicate or ‘copied’. Read on to learn more about duplicate content, its traits and how you can steer clear of it to boost your search engine rankings.
Slash on Hostname/Root
John Mueller said that a slash after the hostname or root is usually not taken into account. In other words, it does not matter whether the domain name ends with a forward slash or not; the two are equivalent. Your homepage can be linked as both abcd.com and abcd.com/, and Google will not treat the two as a duplicate content problem. So you don’t need to worry about this one.
A slash at the end of a file name counts as a duplicate
This is something you need to understand. If a file name is reachable both with and without a trailing forward slash, the two versions will be treated as duplicates. If a page can be reached directly at both abcd.com/shoes and abcd.com/shoes/, you have a duplicate content issue. If the actual URL is /shoes/, the server will often redirect /shoes to /shoes/.
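To make the difference between the two slash rules concrete, here is a minimal Python sketch. The function name same_for_slash_rule is made up for this illustration, and it only models the rule as described above, not how Google actually compares URLs.

```python
from urllib.parse import urlsplit

def same_for_slash_rule(url_a: str, url_b: str) -> bool:
    """Return True if two URLs differ at most by a trailing slash on the root."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    # An empty path and "/" both point at the root of the host.
    path_a = a.path or "/"
    path_b = b.path or "/"
    # Any other path difference, including a trailing slash, is a different URL.
    return (a.scheme, a.netloc.lower(), path_a, a.query) == (
        b.scheme, b.netloc.lower(), path_b, b.query)

print(same_for_slash_rule("http://abcd.com", "http://abcd.com/"))              # True
print(same_for_slash_rule("http://abcd.com/shoes", "http://abcd.com/shoes/"))  # False
```

The root pair compares as equal, while the /shoes pair does not, which is exactly the case that needs a redirect or a canonical tag.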
Different protocols are also a concern
This is where duplicate content becomes a real problem. Mueller says that Google will treat the same page as two separate pages if the same URL is served under different protocols. For instance, http://abcd.com will be considered a different page from https://abcd.com. If you have 301 redirects in place to handle this, you will probably be fine. If you don’t, Google may treat it as a problem, and it could become a serious one in the long run.
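As a rough sketch of what such a 301 rule has to decide, the Python function below works out where a request should be redirected. It assumes the site has picked HTTPS as its single protocol; PREFERRED_SCHEME and redirect_target are names invented for this example, and a real site would normally put the rule in its web server configuration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for this sketch: the site serves everything over HTTPS.
PREFERRED_SCHEME = "https"

def redirect_target(url: str, preferred_scheme: str = PREFERRED_SCHEME):
    """Return the URL a 301 should point to, or None if no redirect is needed."""
    parts = urlsplit(url)
    if parts.scheme == preferred_scheme:
        return None  # already on the preferred protocol
    # Keep host, path and query; only the protocol changes.
    return urlunsplit(parts._replace(scheme=preferred_scheme))

print(redirect_target("http://abcd.com/shoes"))   # https://abcd.com/shoes
print(redirect_target("https://abcd.com/shoes"))  # None
```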
Can a competitor confuse Google?
Some servers will still serve a page over HTTPS even if you have not obtained a security certificate. Google will see this as a duplicate page. A competitor only needs to link to your website with https to get Google to index a duplicate page.
On the other hand, some servers will never serve a non-SSL page over HTTPS unless redirects are in place to handle such requests. So if your non-SSL site has no redirects to handle requests for the HTTPS version, a competitor can easily create links to the HTTPS version, which does not exist. Google might consider that a different page.
Safeguard yourself from issues regarding duplicate content
#1: Set a canonical tag
You can add a canonical tag to each page. This tells Google which version of the URL is the correct one. Google is not obliged to obey it, but it will treat it as a hint.
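A canonical tag is a link element with rel="canonical" in the page head. If you want to check which canonical URL a page declares, a small Python sketch like the one below can help. The class and function names are made up for this example, and the sample page is hypothetical.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")

def find_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical

# Hypothetical page that declares /shoes/ as its preferred URL.
page = '<html><head><link rel="canonical" href="https://abcd.com/shoes/"></head></html>'
print(find_canonical(page))  # https://abcd.com/shoes/
```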
#2: Test the response of your server
You may need to add 301 redirects to deal with duplicate URLs or ‘site is down’ errors.
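One way to test the response is to request each variant of a URL without following redirects and look at the status code. The Python sketch below uses only the standard library. The helper names fetch_status and describe are invented for this example, and it assumes your server answers HEAD requests the same way it answers GET, which is not always true.

```python
import http.client
from urllib.parse import urlsplit

def fetch_status(url: str, timeout: float = 10.0):
    """Request a URL without following redirects; return (status, Location header)."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()

def describe(status: int, location=None) -> str:
    """Turn a status code into a short note for a duplicate content check."""
    if status in (301, 308):
        return f"permanent redirect to {location}"
    if status in (302, 303, 307):
        return f"temporary redirect to {location}"
    if status == 200:
        return "served directly"
    if status == 404:
        return "not found"
    return f"other response ({status})"
```

For example, describe(*fetch_status("http://abcd.com/shoes")) would tell you whether that variant is served directly or redirected. If both /shoes and /shoes/, or both the http and https versions, come back as "served directly", that is the situation where a 301 is worth adding.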
#3: Audit URLs
Crawl your website with Screaming Frog and check the URLs for ‘page not found’ errors or duplicates.
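Once you have a list of crawled URLs, spotting duplicates is mostly a grouping exercise. Here is a minimal Python sketch that groups URLs differing only by protocol or by a trailing slash on the path. It assumes you already have the URLs as a plain list; the sample list is invented, and nothing here depends on any particular crawler's export format.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def duplicate_groups(urls):
    """Group URLs that differ only by protocol or by a trailing slash on the path."""
    groups = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        host = parts.netloc.lower()
        path = parts.path or "/"  # a bare hostname and hostname + "/" are the same URL
        normalized = f"{parts.scheme}://{host}{path}"
        if parts.query:
            normalized += "?" + parts.query
        # The key ignores the protocol and any trailing slash.
        key = (host, path.rstrip("/"), parts.query)
        groups[key].add(normalized)
    return [sorted(group) for group in groups.values() if len(group) > 1]

# Hypothetical crawl result.
crawled = [
    "http://abcd.com", "http://abcd.com/",
    "http://abcd.com/shoes", "http://abcd.com/shoes/",
    "https://abcd.com/hats", "http://abcd.com/hats",
    "http://abcd.com/bags",
]
for group in duplicate_groups(crawled):
    print(group)
```

In this sample, the /shoes pair and the /hats pair are reported, while the two homepage links are not, because a slash on the root makes no difference.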
#4: Look for 404 errors
Check your traffic analytics and server logs, and track down the sources of 404 pages. These 404 errors should always be investigated.
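If you have access to raw server logs, a short script can show which missing pages are being requested and where the links come from. The Python sketch below assumes the logs are in the common "combined" access log format, where each line contains the quoted request, the status code, the response size and the quoted referrer. The sample lines and the function name not_found_sources are made up for illustration.

```python
import re
from collections import Counter

# Matches: "METHOD /path PROTOCOL" STATUS SIZE "REFERRER"
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)

def not_found_sources(lines):
    """Count (path, referrer) pairs for every request that returned a 404."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("status") == "404":
            counts[(match.group("path"), match.group("referrer"))] += 1
    return counts

# Hypothetical log lines.
sample_log = [
    '203.0.113.5 - - [01/Jan/2024:10:00:00 +0000] "GET /shoes HTTP/1.1" 404 512 "https://example.org/review" "Mozilla/5.0"',
    '203.0.113.9 - - [01/Jan/2024:10:05:00 +0000] "GET /shoes HTTP/1.1" 404 512 "https://example.org/review" "Mozilla/5.0"',
    '203.0.113.7 - - [01/Jan/2024:10:06:00 +0000] "GET /shoes/ HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]
for (path, referrer), hits in not_found_sources(sample_log).most_common():
    print(hits, path, referrer)
```

Here /shoes returns a 404 while /shoes/ works, and the referrer shows which external page carries the broken link, so a 301 from /shoes to /shoes/ would recover those visits.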
Now that you know how to deal with duplicate content issues, put these steps into practice to stay ahead of your competitors.