Duplicate content in WordPress? I cast thee from my site!
So where does it come from?
Well, the first source of duplicate content is a robots.txt that doesn’t block the right paths.
Here’s what my robots.txt file looks like:
You’ll notice that I stopped the bots from crawling my plugin folders. I could have simply used Disallow: /wp-content/. Unfortunately, that would have kept the translated pages from getting indexed, since the translation icons at the bottom of this page come from a plugin that resides in /wp-content/plugins/global-translator.
Anyhow, make sure you don’t block anything your pages link to directly, like the plugin I just mentioned, and you’ll be fine. Then block the rest: the includes, category, feed, page, and comments folders.
It’s all designed to keep duplicate copies of your content out of the search index!
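For illustration only, the rules described above could be sketched roughly like this. This is not my exact file: the paths assume a default WordPress install at the site root with standard permalinks, and example-plugin is a made-up placeholder for whatever plugins you actually run.

```text
# Sketch only; adjust every path to match your own install.
User-agent: *

# Core includes folder
Disallow: /wp-includes/

# Block plugin folders one by one (example-plugin is a hypothetical name).
# Note there is no rule for /wp-content/plugins/global-translator/,
# so the translated pages stay crawlable.
Disallow: /wp-content/plugins/example-plugin/

# Listing-style URLs that repeat post content
Disallow: /category/
Disallow: /feed/
Disallow: /page/
Disallow: /comments/
```

Listing plugin folders individually is more typing than a single Disallow: /wp-content/, but it is what lets one plugin stay open to the bots.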
The second source of duplicate content in WordPress is archive pages. This one was a bit trickier to solve: I had to modify my WordPress template. This isn’t for the faint of heart!
The code that makes my archive pages list only the post titles looks like this:
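As a minimal sketch of the idea (not necessarily the exact template code), assuming the loop lives in your theme’s index.php or archive.php and uses the standard WordPress template tags, with illustrative h2 markup:

```php
<?php
// Sketch of a WordPress loop that prints only linked titles on archive views,
// so the full post text appears at a single URL: the post's own permalink.
// Assumption: this sits in a theme template such as index.php or archive.php.
if ( have_posts() ) :
    while ( have_posts() ) : the_post();
        if ( is_archive() ) : ?>
            <!-- Archive view: title only, linked to the full post -->
            <h2><a href="<?php the_permalink(); ?>"><?php the_title(); ?></a></h2>
        <?php else : ?>
            <!-- Any other view: title plus the full content -->
            <h2><a href="<?php the_permalink(); ?>"><?php the_title(); ?></a></h2>
            <?php the_content(); ?>
        <?php endif;
    endwhile;
endif;
?>
```

Back up the template before editing it. A typo inside the loop can blank out every page that uses that file.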