There is still a lot of confusion about duplicate content and why it is best avoided, both to prevent penalties from search engines and to avoid feeding the circulation of misleading information.
Duplicate content consists of elements copied from web sources and reproduced identically, or nearly so, across other resources available on the web.
Excerpts of text or entire sections, images, sidebars, footers, and so on are in most cases reused unconsciously, but often deliberately, as part of an optimization strategy meant to distort search engine indexing and obtain maximum online visibility.
Typical tactics include an excess of keywords, often taken out of context, telephone numbers and lists of superfluous terms, and blocks of text whose size varies depending on the domain where they are made to appear. The result can be that a site using duplicate content appears in the list of search results before the site that represents the primary source.
Search engines such as Google, when queried by a user, return an index of the available content, ordered by relevance to the keywords used in the search.
Ideally, when pages are identical or contain elements that are too similar to each other, Google makes a selection, filtering out or down-ranking the less relevant results and tending to point to the site that constitutes the original source.
If instead it believes that the duplication was carried out with deceptive intent, and that the practice therefore qualifies as search engine spam, it may decide to penalize the site by assigning it an unfavorable position in the results index or even removing it completely.
Google itself explains the main motivation for this policy: deceptive practices worsen the service offered to users, who would otherwise see the same content repeated throughout a set of search results.
The user experience deteriorates as a result: users, who are constantly looking for original, exhaustive, and reliable information, waste time picking out the elements they want from a vast field of misleading results, and cannot verify the authenticity of shared information by tracing it back to the original site, which is often downgraded by the overabundance of duplicate content.
In addition to the risk of copyright infringement for misappropriation of content, duplication amplifies false information: just think of the mass of news and counter-information sites that share stories and hoaxes without verifying their authenticity and without citing their sources.
Before you panic about the amount of duplication that might exist on your sites, remember that 25-30% of all content on the web is duplicate content. Think of quotations of paragraphs or entire texts.
There is also necessary boilerplate content, but Matt Cutts, long the head of Google's web spam team, has explained that the search engine can detect such content and not count it against a site.
There are several methods to check for duplicate content. The simplest and fastest are pasting snippets of text into Google to see how many sites they appear on, and using Copyscape.com, which scans the web and identifies any duplication.
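Beyond those tools, a rough similarity check between two texts can be scripted locally. The sketch below is purely illustrative (it is not how Google or Copyscape work): it breaks each text into overlapping word shingles and computes the Jaccard similarity of the two shingle sets, a common textbook measure of near-duplication. All names and thresholds here are this example's own assumptions.

```python
# Rough near-duplicate check between two text snippets.
# Illustrative sketch only: real duplicate detection in search engines
# is far more sophisticated (hashing, canonicalization, boilerplate
# stripping, etc.).

def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b, k=3):
    """Jaccard similarity of the two texts' shingle sets (0.0 to 1.0)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

original = "Creating original and information-rich content remains the best strategy"
copied   = "Creating original and information-rich content remains the best strategy"
other    = "A completely unrelated sentence about some other topic entirely"

print(jaccard(original, copied))  # identical texts -> 1.0
print(jaccard(original, other))   # unrelated texts -> 0.0
```

A score near 1.0 flags likely duplication; scores near 0.0 indicate independent texts. Quoted paragraphs will naturally score somewhere in between.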
Creating original, information-rich content, using keywords appropriately and with a broad vocabulary, and also citing sources and linking back to the sites you reference remains without doubt the best SEO strategy.