Duplicate Content

Duplicate content, often abbreviated to "DC" in specialist circles, describes the presence of the same, predominantly textual, content on different URLs. Duplicate content can occur both within a website under the same domain and on different, independent websites.

Search engines recognize duplicate content and evaluate it negatively in their rankings. Search engine operators tend to assume that duplicate content was created intentionally to improve keyword rankings and the findability of a website. A typical source of duplicate content is missing or incorrect redirects from http to https URLs. Another source of duplicate content on your own website is missing canonical tags. Unique content is the opposite of duplicate content.

 

How is duplicate content created?

The term duplicate content applies whenever larger blocks of text are repeated, or are at least largely similar, either on the same domain or on different domains. DC can therefore occur both internally and externally. Internal DC occurs, for example, when a post in a content management system such as WordPress appears in several subcategories and is therefore reachable under different URLs (search by keywords, categories or publication date). This is a common case of internal DC. External DC occurs, among other things, when journalistic articles appear in the same wording on several websites as part of cooperations. Simple text theft also leads to DC. In some cases this is even done deliberately so that the original is penalized by the search engines.

 

Detect & fix duplicate content

There are a number of ways to locate DC. You can do this manually by searching Google for distinctive text snippets or sentence structures. If you put these text snippets in quotation marks in the search box, the search will look for exactly this wording. If several hits then appear in the search results, this is a classic case of DC. Google also recognizes DC on its own and in this case indicates in the search results that some entries have been omitted because several hits are very similar.
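The manual check can be prepared with a few lines of Python. This is only a minimal sketch: the function name exact_phrase_search_url is made up for illustration, and it does nothing more than build the search URL for a snippet wrapped in quotation marks, which you then open in a browser.

```python
from urllib.parse import quote_plus


def exact_phrase_search_url(snippet: str) -> str:
    """Build a Google search URL that looks for the snippet as an exact phrase."""
    # Collapse whitespace so line breaks copied from the page do not split the phrase.
    phrase = " ".join(snippet.split())
    # The surrounding quotation marks ask the search engine for exactly this wording.
    return "https://www.google.com/search?q=" + quote_plus(f'"{phrase}"')


if __name__ == "__main__":
    print(exact_phrase_search_url("a distinctive sentence from your page"))
```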

In the area of technical SEO, an onsite audit reveals duplicate content. The first signs of internal duplicate content are identical headings or identical meta tags.
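One way to look for these first signs is to group pages by their title and meta description. The following sketch assumes you already have the HTML of each page in a dictionary keyed by URL; the class HeadExtractor, the function find_shared_head_tags and the example URLs are hypothetical and only illustrate the idea. It is not how any particular audit tool works.

```python
from collections import defaultdict
from html.parser import HTMLParser


class HeadExtractor(HTMLParser):
    """Collects the <title> text and the meta description of one HTML page."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.description = (attributes.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def find_shared_head_tags(pages: dict[str, str]) -> dict[tuple[str, str], list[str]]:
    """Map each (title, description) pair used by more than one URL to those URLs."""
    groups = defaultdict(list)
    for url, html in pages.items():
        parser = HeadExtractor()
        parser.feed(html)
        groups[(parser.title.strip(), parser.description)].append(url)
    # Only pairs that occur on several URLs hint at internal duplicate content.
    return {key: urls for key, urls in groups.items() if len(urls) > 1}


if __name__ == "__main__":
    # Hypothetical pages: the same post reachable under two URLs, plus one unique page.
    pages = {
        "https://example.com/shirts": '<title>T-Shirts</title><meta name="description" content="Our shirts">',
        "https://example.com/category/shirts": '<title>T-Shirts</title><meta name="description" content="Our shirts">',
        "https://example.com/about": "<title>About us</title>",
    }
    for (title, description), urls in find_shared_head_tags(pages).items():
        print(title, "|", description, "->", urls)
```

Identical head tags are only a hint. Whether the page bodies are really duplicates still has to be checked separately.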

Free tools are also available for running a duplicate content check online. Be careful with their results, however: these tools sometimes also flag small text snippets such as teasers or sub-headings, which generally do not pose any problems.
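To see why short snippets should not count, it helps to look at how such a comparison could work in principle. The sketch below compares two texts by their overlapping five-word sequences ("shingles") and the Jaccard similarity of those sets. This is one common technique, not the method of any specific tool, and the shingle size of 5 and the threshold of 0.5 are arbitrary assumptions chosen for illustration.

```python
import re


def shingles(text: str, size: int = 5) -> set[tuple[str, ...]]:
    """Return the set of overlapping word sequences ("shingles") in the text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a: str, text_b: str, size: int = 5) -> float:
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        # Texts shorter than one shingle cannot be compared and are never flagged.
        return 0.0
    return len(a & b) / len(a | b)


def is_probable_duplicate(text_a: str, text_b: str, threshold: float = 0.5) -> bool:
    """Flag two texts only if a large share of their shingles is identical."""
    return similarity(text_a, text_b) >= threshold


if __name__ == "__main__":
    original = "one two three four five six seven eight nine ten"
    shared_teaser = "one two three four five alpha beta gamma delta epsilon"
    print(is_probable_duplicate(original, original))       # identical texts
    print(is_probable_duplicate(original, shared_teaser))  # only a short teaser in common
```

With a threshold like this, two pages that share only a teaser or a sub-heading stay well below the limit, while largely identical texts are flagged.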

Internal DC can be avoided by providing unique content on each page. Text blocks should not be copied onto a landing page. If duplicate content cannot be avoided, canonical tags can help. This practice is often used for catalog products that come in different versions. For example, if you sell T-shirts in different colors, there is always one original product (e.g. the white one). All product variants (blue, green, red) are then duplicates. A note in the source code, the canonical tag, tells Google that the duplication is deliberate.
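For the T-shirt example, the note in the source code is a link element with rel="canonical" in the head of each variant page, pointing to the original product. The following sketch generates that element; the function name canonical_link_tag and the shop URLs are hypothetical.

```python
from html import escape


def canonical_link_tag(original_url: str) -> str:
    """Return the <link> element that names original_url as the canonical version."""
    # Escape the URL so characters such as & or " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(original_url, quote=True)}">'


if __name__ == "__main__":
    # Hypothetical shop URLs: the white shirt is the original, the colors are variants.
    original = "https://www.example.com/t-shirt-white"
    variants = [
        "https://www.example.com/t-shirt-blue",
        "https://www.example.com/t-shirt-green",
        "https://www.example.com/t-shirt-red",
    ]
    for variant in variants:
        # Every variant page carries the same tag pointing to the original product.
        print(variant, "->", canonical_link_tag(original))
```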

DC often occurs unintentionally when a website is accessible under both "website.de" and "www.website.de". This can easily be prevented with a redirect that sends every request with "www" to the domain without "www". It is important that the HTTP status code 301 is returned when "www.website.de" is requested, so that search engine bots recognize it as a permanent redirect.
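In practice this redirect is usually configured in the web server or the content management system. The logic behind it can be sketched in a few lines, though. The function www_redirect below is a hypothetical helper that only computes the status code and the target URL. It does not send a response itself, and it drops any user name or password embedded in the URL.

```python
from urllib.parse import urlsplit, urlunsplit


def www_redirect(url: str):
    """Return (301, target URL) if the host starts with "www.", otherwise None."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host.startswith("www."):
        return None  # already the preferred variant, no redirect needed
    netloc = host[len("www."):]
    if parts.port is not None:
        netloc += f":{parts.port}"
    # Keep path, query and fragment so every page maps to its own counterpart.
    target = urlunsplit((parts.scheme, netloc, parts.path or "/", parts.query, parts.fragment))
    return 301, target


if __name__ == "__main__":
    print(www_redirect("http://www.website.de/shop?page=2"))
    print(www_redirect("http://website.de/shop?page=2"))
```

Redirecting to the matching path rather than always to the home page keeps each "www" URL paired with exactly one URL without "www".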