One of the main ingredients in building and maintaining a Web site that ranks well in search engines is avoiding duplicate content issues. Duplicate content from plagiarism can be a real problem for Web site owners, and it can ruin a Web site’s reputation. Google’s explanation of duplicate content is: “Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar.”
Unfortunately, there is a ranking war out there, and many people commit plagiarism to work their way up the rankings faster. They steal content from high-ranking Web sites, which amounts to stealing from the success and the ideas of those sites.
Here is what Google says on the subject of duplicate content: “In some cases, content is deliberately duplicated across domains in an attempt to manipulate search engine rankings or win more traffic. Deceptive practices like this can result in a poor user experience, when a visitor sees substantially the same content repeated within a set of search results.”
“Google tries hard to index and show pages with distinct information. In the rare cases in which Google perceives that duplicate content may be shown with intent to manipulate our rankings and deceive our users, we’ll also make appropriate adjustments in the indexing and ranking of the sites involved. As a result, the ranking of the site may suffer, or the site might be removed entirely from the Google index, in which case it will no longer appear in search results.”
My philosophy is this: whether or not sites are being penalized for duplicate content, you must remain vigilant, because someone may very well attempt to poach your content.
If you discover that someone has taken your content and placed it on their site without your permission, first contact that person and demand that it be removed at once. If this doesn’t work, contact the hosting service for their site; you can find this information in the Whois Source database. Contacting the hosting service may also bring the problem to the attention of the search engines, which may get the content thief removed from their indexes altogether.
Fortunately, there are tools that can help you search for replicated content. One of the most commonly used is Copyscape, a free service that searches for similar or identical content and reports it to you. Copyscape also offers a free plagiarism warning banner that you can display on your Web site to deter others from stealing your work. A premium Copyscape membership gives you unlimited searches for copies of your Web pages and also tracks acts of plagiarism.
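To give a feel for what “similar or identical content” can mean in practice, here is a minimal Python sketch that compares two passages by the overlapping three-word phrases (shingles) they share. This is only one common way to measure similarity; it is not how Copyscape or any search engine actually works, and the function names and the choice of three-word shingles are my own assumptions.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than one window: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard_similarity(text_a, text_b, k=3):
    """Share of shingles the two texts have in common, from 0.0 to 1.0."""
    set_a, set_b = shingles(text_a, k), shingles(text_b, k)
    if not set_a and not set_b:
        return 0.0  # nothing to compare (assumption: do not flag empty text)
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical sample text, used only for illustration.
original = "the quick brown fox jumps over the lazy dog"
suspect = "the quick brown fox leaps over the lazy dog"
print(jaccard_similarity(original, suspect))
```

A score near 1.0 suggests a near copy, while a score near 0.0 suggests unrelated text. Where to draw the line is a judgment call, and a single changed word lowers the score more than you might expect because it breaks every shingle that contains it.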
Another great plagiarism detector is CopyGator, a free service designed to monitor your RSS feed and find where your content has been republished in cyberspace. It notifies you when a new post of yours has been copied to another feed, and it provides a page where you can see where and when your content was duplicated. CopyGator also provides a badge that you can place on your blog; the badge finds the feeds for your site and watches your content for duplication. When the badge turns RED, your content has been duplicated.
Closing Comments about Duplicate Content and Plagiarism!
As a website owner, you need to understand that you have the right to protect your website content. Duplicating content without authorization is wrong and can violate copyright law, so you should stand up for yourself, because this is the only way to put an end to it.
There is another issue you need to be aware of, known as URL canonicalization. A URL canonical issue is the presence of multiple URLs that all return the same Web page, which can also lead to duplicate content issues. If you are not familiar with the canonical URL issue, please read the article to bring yourself up to speed… now is the time to make any changes, before you get in too deep.
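To illustrate the canonical URL issue, here is a small Python sketch that reduces several spellings of the same address to one form, so you can see which of your URLs would collide. The rules it applies (dropping “www.”, default ports, and a trailing index file) are assumptions chosen for the example; your own site may need different rules, and the real fix is usually made on the server, for example with a permanent (301) redirect to the preferred URL.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed rules for this sketch; adjust them to match your own site.
DEFAULT_PORTS = {"http": 80, "https": 443}
INDEX_FILES = ("index.html", "index.htm", "index.php")


def canonicalize(url):
    """Return one normalized form for URLs that point at the same page."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]  # assumption: the non-www host is the preferred one
    port = parts.port
    if port is None or DEFAULT_PORTS.get(scheme) == port:
        netloc = host  # drop a missing or default port
    else:
        netloc = f"{host}:{port}"
    path = parts.path or "/"
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]  # /index.html becomes /
            break
    # Keep the query string, drop the fragment (it never reaches the server).
    return urlunsplit((scheme, netloc, path, parts.query, ""))


# Hypothetical URLs on the reserved example.com domain.
variants = [
    "http://example.com",
    "http://www.example.com/",
    "http://www.Example.com/index.html",
    "HTTP://EXAMPLE.COM:80/",
]
print({canonicalize(u) for u in variants})
```

If several of your URLs collapse to the same canonical form here, a search engine may be seeing them as separate pages with identical content.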