Dealing with duplicate content

Apr 16, 2021

There are only so many words and letters, and only so many ways to combine them. With billions of web pages now populating the Internet, it is only natural that duplicate content crops up over time.

However, duplicate content is a problem that affects countless websites of all shapes and sizes, and it could be robbing you of clicks and conversions.

What is duplicate content?

We’re not just talking about identical content here. Because of the way Google crawls the Internet to index content, pages that are merely very similar can still be read as duplicates.

Remember, this isn’t just about what users actually see, either; it is about the source code that search engines read. The problem is obvious: any content read as a duplicate won’t rank as well, and a page that is 90% the same as another, or more, will generally be flagged.

How to deal with duplicate content

Duplicate content is rarely, if ever, a problem that occurs on purpose. More likely, your code is similar enough to be flagged through no fault of your own, for example when you create a new version of a web page or list multiple items of a similar ilk. But how do you go about solving this problem?

There are a few different methods to try here.

301 redirects

This is essentially a means of redirecting users from the older version of a page to the newer one. A 301 redirect is a vital move if you’re changing subdomains or protocols, or if you have made significant changes to a page and don’t want people seeing the older version. Think of it as transferring your users from DVD to Blu-ray.
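
To make this concrete, here is a minimal sketch of a permanent redirect in a small Flask app; the routes and page names are invented for illustration, and any web framework or server configuration can do the same job.

```python
from flask import Flask, redirect

app = Flask(__name__)

# Hypothetical old URL: send visitors (and crawlers) to the new
# version of the page with a permanent 301 redirect.
@app.route("/old-page")
def old_page():
    return redirect("/new-page", code=301)

@app.route("/new-page")
def new_page():
    return "This is the current version of the page."

if __name__ == "__main__":
    app.run()
```

Because the redirect is marked permanent, search engines will generally transfer the old URL's ranking signals to the new one over time.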

Rel-canonicals

This is a great option for sites where two versions of a product or service are similar enough to confuse web crawlers. The canonical tag tells the crawler which page is the authoritative one for the search engine to index, even though another page’s content might be 90% identical. For example, say you’re a company that sells vehicle parts for different models of car and the only thing that differs in each listing is the car model. This would be the ideal fix.
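
As a sketch of what this looks like in practice (the URLs, routes, and product names below are invented for illustration), each near-duplicate listing can point crawlers at the main version with a link tag in its head:

```python
from flask import Flask

app = Flask(__name__)

# Hypothetical main listing that all near-duplicate variants defer to.
CANONICAL_URL = "https://example.com/parts/brake-pads"

@app.route("/parts/brake-pads/<model>")
def brake_pads(model):
    # Every per-model page declares the main listing as canonical,
    # so crawlers index one page instead of many near-duplicates.
    return f"""<!doctype html>
<html>
  <head>
    <title>Brake pads for {model}</title>
    <link rel="canonical" href="{CANONICAL_URL}">
  </head>
  <body>Brake pad listing for the {model}.</body>
</html>"""
```

The tag itself is all the crawler needs, so it could just as easily live in a static HTML template.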

Meta noindex

Marking pages as meta noindex lets you promote the most relevant and recent content without losing access to the original. The tag tells a crawler that the original page is still there but shouldn’t be indexed. This way, you can still access the old page, but your average user will only see the bright and shiny new one.
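
Here is a minimal sketch of how the old page might serve that signal (the route and page content are assumptions for illustration); the same instruction can also be sent as an HTTP header, which is useful for non-HTML files:

```python
from flask import Flask, make_response

app = Flask(__name__)

# Hypothetical archived page: still reachable at its old URL,
# but asks crawlers not to add it to the index.
@app.route("/old-page")
def old_page():
    html = """<!doctype html>
<html>
  <head><meta name="robots" content="noindex"></head>
  <body>The original page, kept around but not indexed.</body>
</html>"""
    resp = make_response(html)
    # Equivalent signal as an HTTP header, handy for PDFs and images.
    resp.headers["X-Robots-Tag"] = "noindex"
    return resp
```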

Adding more content

Finally, what if you have two pages that are completely different in terms of content but are still being read as duplicates by a crawler? It’s rare, but it does unfortunately happen. In this case, perhaps the most practical solution is simply to add more content to each page until they are different enough to be read as completely separate entities.

Add more information and be as specific as possible, and you should be able to solve the problem in minutes. Good luck!