© 2003-2007 Googlerankings.com
Googlerankings.com is in no way affiliated with, nor is it the property of, Google Inc.
Due to the filters applied to fight spam and scraper web sites, duplicate content has become a major issue. The duplicate content filter is applied to URLs that serve the same web page content under different addresses, thereby filtering out potential cases of plagiarism or repetitive pages. See Duplicate content for more information.

For a web page not only to be, but also to be perceived as, the only copy of its content, the proper server settings, internal navigation and inbound links are necessary. The Canonical URL is the URL that is set as the only URL able to serve that particular web page. In other words, it is the preferred URL for a single web page. Choosing a single Canonical URL for each web page also helps concentrate all incoming references, and accumulates parameters such as PageRank more effectively.

Known issues

Sometimes a single web page with no additional copy on its server can still be perceived by the algorithm as a duplicate of another. This may be the result of:

not choosing or not setting up the www. subdomain preference on the server or in the Google Webmaster tools panel,
serving the same web page for more than one set of dynamic query parameters,
linking to directory index files both by their full, file level URLs and by the shortened, directory level URLs that default to the index files.

For example, in some cases the very same web page could be accessed through the following URLs:

www.example.com/index.html
example.com/
example.com/index.html
www.example.com/

or, in another example:

www.example.com/product.php?item=10&action=review
www.example.com/product.php?item=10
www.example.com/product.php?action=review&product=10
... etc.

Resolution: Make sure that a single web page can only be accessed through a single valid URL. Correct the navigation of the web site so that a single page is always linked to in the same manner, using the correct parameters in the URLs, so that the same content (for example, the result of a database request) cannot be accessed and served with more than one set of add-on strings. This includes ruling out variations such as a different order of the parameters in the URL.

Also check whether you are relying on the server setup for web pages that are shown as the default for directory level URLs. Make sure that such pages are referred to in the same way throughout the web site navigation, and that no inbound links point to the other version either.

For cases where you cannot avoid the above, see to it that your server is set up properly, and set up permanent redirects to correct the problem. Using 301 (Permanent) redirects in a .htaccess file should allow you to correct already existing duplicate URL entries and also prevent Google from indexing the same page at a different address. Also keep SSL in mind: an http and an https version of the same page are also seen as duplicates.
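To make the rules above concrete, here is a minimal Python sketch of how URL variants could be mapped to one preferred form, for instance when auditing internal links or deciding where a 301 redirect should point. It is an illustration only, not part of the original guide: the preferred scheme and host, the list of index file names, and the function names `canonical_url` and `redirect_target` are all assumptions chosen for the example. It only collapses host, index file, protocol and parameter order variants. It cannot know that differently named or missing parameters lead to the same page.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical preferences, chosen for illustration only.
PREFERRED_SCHEME = "http"
PREFERRED_HOST = "www.example.com"
HOST_ALIASES = {"example.com", "www.example.com"}
INDEX_FILES = ("index.html", "index.htm", "index.php")


def _with_scheme(url):
    """Add a scheme to bare URLs such as 'example.com/' so they parse."""
    return url if "://" in url else "http://" + url


def canonical_url(url):
    """Map a known variant of a URL to the single preferred form."""
    parts = urlsplit(_with_scheme(url))

    # www vs. no-www: every alias collapses to the preferred host.
    host = parts.netloc.lower()
    if host in HOST_ALIASES:
        host = PREFERRED_HOST

    # File level index URLs collapse to the directory level URL.
    path = parts.path or "/"
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break

    # Sort parameters so that a different order yields the same URL.
    pairs = sorted(parse_qsl(parts.query, keep_blank_values=True))
    query = urlencode(pairs)

    # http and https versions collapse to the preferred scheme.
    return urlunsplit((PREFERRED_SCHEME, host, path, query, ""))


def redirect_target(url):
    """Return the URL to redirect to with a 301, or None if none is needed."""
    target = canonical_url(url)
    return None if target == _with_scheme(url) else target
```

With these assumptions, `canonical_url("example.com/index.html")` and `canonical_url("https://www.example.com/")` both give `http://www.example.com/`, and `redirect_target("http://www.example.com/")` gives `None` because that address is already the preferred one.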
Resources

What's a preferred domain? (Google Webmaster Help Center) http://www.google.com/support/webmasters/bin/answer.py?answer=44231
SEO advice: url canonicalization (Matt Cutts: Gadgets, Google, and SEO)
Preferred Domain - www vs. no w's (Google Groups: Webmaster Help)
Cleaning Up Canonical URLs With Redirects (Social Patterns)
Why Does Google Treat "www" & "no-www" As Different? (Webmasterworld)