
6 Ways To Deal With Duplicate Content on WordPress


Google’s algorithm updates and penalties have caused a lot of panic and cost many websites heavy losses in traffic and revenue. While there are many reasons for a penalty, one of the most common is duplicate content.

Duplicate content is one of the major issues that every webmaster may face, mostly because of the complex content structure used by the CMS.

 

How do Search Engines track Duplicate Content?

In the eyes of search engine bots, two or more pages that have the same or largely identical content under different URLs are labelled as duplicate content. The main function of a search engine is to return unique and relevant results, so search engines don’t appreciate sites with lots of duplicate content.

Google not only penalizes the single piece of duplicate content on your site, but penalizes the entire website, reducing overall site performance and ranking. Duplicate content doesn’t just mean a copy of the whole content; it can also be a copy of the title, meta tags, and other elements. You will also face a duplicate content issue if someone copies and republishes your content on a different URL.

In most cases, nobody notices the issue until the whole site is penalized. This post offers some advice for dealing with duplicate content issues on WordPress.
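To make the definition concrete, here is a minimal sketch that groups URLs whose body text is identical after simple normalization. The `pages` data and the helper names are made up for illustration, and real search engines use far more sophisticated matching than an exact hash.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing whitespace."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(pages: dict[str, str]) -> list[list[str]]:
    """Return groups of URLs whose body text has the same fingerprint."""
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical crawl results: URL -> extracted body text.
pages = {
    "https://example.com/my-post/": "Hello   World, this is my post.",
    "https://example.com/my-post/?replytocom=654": "hello world, this is my post.",
    "https://example.com/about/": "About this blog.",
}
duplicates = find_duplicates(pages)
print(duplicates)
```

The two `my-post` URLs land in the same group because only the URL differs, which is exactly the situation described above.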

Dealing with Duplicate Content Issues

 

6 Common Causes of Duplicate Content

If you are penalized for duplicate content, you will notice an overall drop in website ranking, not just a drop for some pages.

If you have a lot of duplicate pages, bots also take more time to find and index your newer content. Search engines only crawl and index a certain number of pages on each visit, so duplicates eat into that allowance and lower the crawl rate for new pages.

You can use Google Webmaster Tools to check whether you have any duplicate content warnings, and analyse the crawl rate and indexing stats (under Diagnostics > Crawl Stats) to figure out how many duplicate pages you have and download the data. This saves time in identifying the affected pages.

Below are the most common causes of duplicate content on WordPress blogs. 

  • “?replytocom=” links (comment reply links)
  • Pages with same title and meta tag elements
  • Your blog post is copied and republished elsewhere
  • Category, tag, author and archive pages being indexed
  • Post image attachment links
  • Comment pagination

 

Ways to Fix and Avoid Duplicate Content

Below are solutions to the 6 common causes of duplicate content, along with tools and resources you can use to fix the problem.

  • “Replytocom” Parameter in Links:

Example: yourwebsite.com/article-title/?replytocom=654

This is one of the most common issues with WordPress. In posts with a lot of comments, each comment’s reply link carries the comment ID number, and when bots access these URLs, they find the same content on each of them.

Search engines don’t know which version(s) to include in or exclude from their indices, and they don’t know which version(s) to rank for query results; hence, many of these links, including the original link, are pushed down in search results.
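As a rough illustration of why these URLs are duplicates, the sketch below strips the `replytocom` parameter from a list of crawled URLs and shows that they all collapse to one address. The URL list and the `strip_replytocom` helper are assumptions made for this example, not part of WordPress.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_replytocom(url: str) -> str:
    """Return the URL without its replytocom query parameter."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "replytocom"
    ]
    # The fragment (e.g. "#respond") is dropped too; it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# Hypothetical URLs a bot might discover for a single post.
crawled = [
    "https://yourwebsite.com/article-title/",
    "https://yourwebsite.com/article-title/?replytocom=654",
    "https://yourwebsite.com/article-title/?replytocom=655#respond",
]
unique = sorted({strip_replytocom(url) for url in crawled})
print(unique)
```

Three crawled addresses reduce to a single post URL, so a bot that indexes all three sees the same article three times.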

Solution:

Remove “replytocom” links from Google’s index. Search for “site:yoursite.com inurl:replytocom” to check whether “replytocom” links from your blog are already indexed in search results. If such links are indexed, you should remove them via Google Webmaster Tools.

To stop such links from being indexed by search engines in the future, install the Replytocom redirect WordPress plugin, and also refer to the WordPress duplicate content fix.

You could also add the rule below to robots.txt to keep bots from crawling links with the “replytocom” parameter.

User-agent: *
Disallow: /*?replytocom
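
If you want to sanity-check which URLs a wildcard rule like this would cover, the following sketch translates a robots.txt path pattern into a regular expression. It is a simplified matcher written for this post (the helper names are made up), assuming the common convention that `*` matches any run of characters and a trailing `$` anchors the end; it is not Google’s actual implementation.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern ('*' wildcard, optional trailing '$')."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path_and_query: str, pattern: str) -> bool:
    """True if the pattern matches from the start of the path."""
    return robots_pattern_to_regex(pattern).match(path_and_query) is not None


# The replytocom rule, written with a leading slash.
rule = "/*?replytocom"
print(is_blocked("/article-title/?replytocom=654", rule))
print(is_blocked("/article-title/", rule))
```

The reply link is covered by the rule while the clean post URL is not, which is the behaviour you want.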
  • Duplicate Title and Meta Tags Fix:

When you have a huge number of posts, some of them may end up with exactly the same title as another. To solve this issue, use the Duplicate post remover plugin.
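If you would rather look for clashing titles yourself first, a short script along these lines can help. The `posts` list is hypothetical sample data; in practice you would fill it from a site crawl or an export of your posts.

```python
from collections import defaultdict


def duplicate_titles(posts: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Map each normalized title used more than once to the URLs that use it."""
    by_title = defaultdict(list)
    for url, title in posts:
        # Normalize case and spacing so "SEO Tips" and "seo  tips" count as equal.
        by_title[" ".join(title.lower().split())].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}


# Hypothetical (URL, title) pairs.
posts = [
    ("/2012/05/seo-tips/", "SEO Tips"),
    ("/2013/01/seo-tips-2/", "seo  tips"),
    ("/2013/02/wordpress-plugins/", "WordPress Plugins"),
]
clashes = duplicate_titles(posts)
print(clashes)
```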

  • Prevent Content Theft

This issue is very common; copycats don’t spare any good blog.

Solution:

Show only a summary in RSS feeds (Settings > Reading > For each article in a feed, show: select “Summary”), and use tools like Copyscape.com, PlagSpotter.com and the Dooplee duplicate content checker plugin to find copied posts.
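To get a feel for how a copied post can be spotted even after light rewording, here is a small sketch that compares two texts by their overlapping three-word sequences (Jaccard similarity on word shingles). This is a generic textbook technique chosen for illustration; it is not a description of how Copyscape, PlagSpotter or Dooplee work internally, and the sample sentences are made up.

```python
def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two texts' shingle sets (0.0 to 1.0)."""
    set_a, set_b = shingles(a, size), shingles(b, size)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


original = "duplicate content can hurt the ranking of your whole site"
scraped = "duplicate content can hurt the ranking of your entire blog"
unrelated = "ten easy recipes for a quick weekend breakfast at home"

print(jaccard(original, scraped))
print(jaccard(original, unrelated))
```

A score close to 1.0 suggests the text was lifted from your post, while unrelated text scores 0.0.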

  • Limit Categories, Tags, and Archive Pages

Each tag and category creates a new page, and each such page shows the title and the first paragraph of the posts that belong to that category or tag. This confuses bots when they crawl your site.

Using fewer tags and only a few very important categories is good practice for avoiding duplicate WordPress content.

For example, make “Colour” a category and use ‘red’, ‘green’, ‘yellow’ as tags to keep it simple.


By default, search engines crawl every section of your website, and if they find the same content again, it is treated as duplicate content.

Therefore, it is important to tell search engines which areas of your blog or website should be ignored. You can set canonical URLs, or noindex the pages that you don’t want search engines to index.

For this, you can use the noindex tag. I also recommend the Robots Meta plugin; if you are using an SEO plugin like WP SEO, this feature is built in.

You can also use this code in your ‘header.php’ file. This code makes sure that only the home page, posts, pages and category pages are indexed by search engine spiders, while everything else (tag pages, date archives, etc.) is excluded. The code is:

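<?php
// Index the first page of the home page, single posts, pages and category pages; noindex the rest.
if ( ( is_home() && ( $paged < 2 ) ) || is_single() || is_page() || is_category() ) {
    echo '<meta name="robots" content="index,follow" />';
} else {
    echo '<meta name="robots" content="noindex,follow" />';
}
?>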
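
Once a noindex tag or canonical URL is in place, it is worth checking what a rendered page actually tells crawlers. The sketch below reads the robots meta tag and the canonical link out of a page’s HTML. The sample markup and helper names are made up for illustration; you would pass in the HTML of your own pages.

```python
from html.parser import HTMLParser


class IndexingHints(HTMLParser):
    """Collect the robots meta directive and canonical URL from a page."""

    def __init__(self) -> None:
        super().__init__()
        self.robots = None
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            self.robots = attr.get("content")
        elif tag == "link" and (attr.get("rel") or "").lower() == "canonical":
            self.canonical = attr.get("href")


def indexing_hints(html: str) -> tuple:
    """Return (robots directive, canonical URL); either may be None."""
    parser = IndexingHints()
    parser.feed(html)
    return parser.robots, parser.canonical


# Hypothetical <head> sections: a tag archive and a single post.
tag_page = '<head><meta name="robots" content="noindex,follow" /></head>'
post_page = '<head><link rel="canonical" href="https://example.com/my-post/" /></head>'

print(indexing_hints(tag_page))
print(indexing_hints(post_page))
```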

  • Image Attachment Link

This is another issue which creates duplicate pages. When you add an image to a post, you have some link options: 

  • link to file URL
  • link to post
  • link to the image attachment

Never link an image to its attachment page. If you already have a lot of posts whose images link to attachment pages, a simple fix is to install the attachment pages redirect plugin.

  • Comment Pagination

WordPress 2.7 introduced comment pagination, which breaks comments into separate pages when a post has so many comments that they slow the page down.

The problem is that each comment page repeats the content that people are commenting on. You can turn comment pagination off under Settings > Discussion by unchecking the “Break comments into pages” option.

I hope this post has helped you fix duplicate content issues on your blog. Let us know about your experience in dealing with duplicate content.
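To see whether paginated comment URLs from your site are already being crawled, you can scan a URL list for them. This sketch assumes the usual WordPress URL shapes, a `comment-page-N` path segment with pretty permalinks or a `cpage` query parameter otherwise; the helper name and sample URLs are illustrative.

```python
import re
from urllib.parse import parse_qs, urlsplit

COMMENT_PAGE = re.compile(r"/comment-page-(\d+)/?$")


def comment_page_number(url: str):
    """Return the comment page number encoded in the URL, or None."""
    parts = urlsplit(url)
    match = COMMENT_PAGE.search(parts.path)
    if match:
        return int(match.group(1))
    cpage = parse_qs(parts.query).get("cpage")
    return int(cpage[0]) if cpage and cpage[0].isdigit() else None


# Hypothetical URLs taken from a crawl report.
urls = [
    "https://example.com/my-post/",
    "https://example.com/my-post/comment-page-2/#comments",
    "https://example.com/?p=12&cpage=3",
]
paginated = [url for url in urls if comment_page_number(url) is not None]
print(paginated)
```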

About Sreejesh Suresh

I'm a blogger from Bangalore. I quit my job and started blogging in 2010 with my friend. It has been a great journey and learning experience since then. I'm currently working on two new blogs on mobile apps and gadgets.

Comments

  1. I am surprised, really could these be the reasons of duplicate content? A webmaster really need to be careful of such minor things. Thank you for sharing precious, unique tips.

  2. Thank you for sharing valuable things on avoiding duplicate contents, its very useful and helpful to improve ranking………
