- Duplicate Content: Brief explanation
- Duplicate Content: Detailed explanation
- Why is duplicate content negative?
- What helps prevent duplicate content results?
- Limit duplicate content
- 301 redirects
- Canonical, noindex and hreflang tags
Duplicate Content: Brief explanation
Duplicate Content (also abbreviated to DC) refers to identical web content that appears in the same form at more than one URL.
Duplicate Content: Detailed explanation
Content that is very similar or completely identical and appears at multiple URLs – whether on the same website or on different ones – is treated as “duplicate content”. Search engines like Google try to prevent it, so websites that contain too much duplicate content risk consequences. In particular, when there is a suspicion of manipulation for SEO purposes, pages with copied content may suffer a loss of ranking or even de-indexation.
Why Is Duplicate Content Negative?
Search engines discourage duplicate content because it provides the user with no added value. Moreover, each duplicate page must still be crawled and indexed, which wastes crawling resources.
In the past, webmasters often filled sites with duplicate content for SEO purposes, prompting Google to respond. Algorithm changes such as the Panda update were designed to ensure that pages with identical content were ranked lower.
What Helps Prevent Duplicate Content Results?
Several important measures help prevent duplicate content:
Limit Duplicate Content
Website operators should avoid duplicate content wherever possible and instead produce unique content. Some content is necessarily used redundantly, and occasionally complete pages must be duplicated. Webmasters should limit this as much as possible and, where duplication is unavoidable, indicate to the search engine via a link in the HTML code that another page holds the same content.
Besides self-generated duplicate content, duplicates can also originate on other sites – for example, when another website operator copies content or a webmaster plagiarizes it. In such cases, the operator of the original site should ask the operator of the copying site to mark the copied content with a link back to the original, or to insert a noindex tag. The search engine can then identify which content is the original and which content to index.
301 Redirects
If content that used to live at one URL must now be available at a different URL, a 301 (permanent) redirect forwards visitors and search engines to the correct URL, avoiding duplicate content. To keep the process user-friendly, webmasters should redirect exclusively to pages that are an adequate replacement for the original.
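As a sketch, a permanent redirect on an Apache server could be configured in the site’s .htaccess file (the paths and domain here are placeholders, not from the original text):

```
# .htaccess — permanently redirect the old URL to its replacement
Redirect 301 /old-page.html https://www.example.com/new-page.html
```

Other servers use their own syntax; on nginx, for instance, a return 301 directive inside the server block serves the same purpose.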
Canonical, Noindex and Hreflang Tags
When a webmaster owns multiple versions of a URL, such as the www and non-www versions, choosing a canonical, or preferred, domain lets Google know which version should be crawled and indexed for the site’s pages.
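For illustration, the preferred version can be declared with a rel="canonical" link in the head of every duplicate URL (example.com stands in for the real domain):

```html
<!-- Placed in the <head> of the non-www (and any other duplicate) version -->
<link rel="canonical" href="https://www.example.com/page.html">
```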
A noindex meta tag is used to block a URL from appearing in search engine results.
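A minimal example, assuming the tag is placed in the page’s head:

```html
<!-- Tells search engines not to show this page in their results -->
<meta name="robots" content="noindex">
```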
To block a page from being crawled, a robots.txt file can be used. However, a blocked URL can still end up being found if it is linked from other sites or if a crawler interprets the directives differently.
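A robots.txt file sits at the site root and might, for instance, block a directory of printer-friendly duplicates (the path is a hypothetical example):

```
# https://www.example.com/robots.txt
User-agent: *
Disallow: /print/
```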
The hreflang tag can signal to search engines that a page is the equivalent of a page in a different language or for a different country. For example, when a domain is available under .co.uk for the UK as well as under .com for the American market, the hreflang tag signals that each version is a counterpart of the other and prevents the search engine from evaluating the pages as duplicate content.
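As a sketch with placeholder domains, each language or country version references the others in its head:

```html
<!-- UK and US versions point at each other as language/region equivalents -->
<link rel="alternate" hreflang="en-gb" href="https://www.example.co.uk/">
<link rel="alternate" hreflang="en-us" href="https://www.example.com/">
```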
Duplicate content can have negative effects in search engines, but website owners have several options available to resolve and prevent duplicate content issues – including clean redirects, source-code tags and unique content.