Quote:
Originally Posted by webcosmo
That's a very good question. I don't actually know the answer.
Can any expert answer this?
Matt Cutts (probably with the best answer we're likely to get):
"Q: When does Google detect duplicate content, and within which range will duplicate be duplicate?
A: Good question. That’s not a simple answer... the short answer is, we do a lot of duplicate content detection. It’s not like there’s one stage where we say, OK, right here is where we detect the duplicates. Rather, it’s all the way from the crawl, through the indexing, through the scoring, until finally just milliseconds before you answer things.
And there are different types of duplicate content. There’s certainly exact duplicate detection. So if one page looks exactly the same as another page, that can be quite helpful, but at the same time it’s not the case that pages are always exactly the same. And so we also detect near duplicates, and we use a lot of sophisticated logic to do that.
In general, if you think you might be having problems, your best guess is probably to make sure your pages are quite different from each other, because we do do a lot of different duplicate detection... to crawl less, and to provide better results and more diversity."
From:
http://blogoscoped.com/archive/2006-08-02-n60.html
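
To make the "exact duplicate" vs. "near duplicate" distinction a bit more concrete, here is a rough Python sketch. To be clear, this is not Google's method (Cutts gives no implementation details above); it is just one common textbook approach: a hash of the normalized text for exact matches, and Jaccard similarity over word shingles for near matches. The function names, the normalization step, and the shingle size k=3 are all my own assumptions for illustration.

```python
import hashlib
import re


def normalize(text):
    # Lowercase and collapse runs of whitespace so trivial
    # formatting differences do not count as a difference.
    return re.sub(r"\s+", " ", text.lower()).strip()


def exact_fingerprint(text):
    # Two pages with identical normalized text get the same hash.
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def shingles(text, k=3):
    # Set of overlapping k-word sequences ("shingles").
    # k=3 is an arbitrary choice for this sketch.
    words = normalize(text).split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b, k=3):
    # Shared shingles divided by all distinct shingles:
    # 1.0 means the shingle sets are identical, 0.0 means no overlap.
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    page_a = "Cheap blue widgets for sale. Free shipping on all orders today."
    page_b = "cheap blue widgets   for sale. Free shipping on all orders today."
    page_c = "Cheap blue widgets for sale. Free shipping on most orders today."

    # Exact check: only case and spacing differ, so the hashes match.
    print(exact_fingerprint(page_a) == exact_fingerprint(page_b))
    # Near check: one changed word, so the hashes differ but similarity stays high.
    print(exact_fingerprint(page_a) == exact_fingerprint(page_c))
    print(jaccard(page_a, page_c))
```

The point of the sketch is only that changing one word defeats the exact hash while the shingle overlap stays well above zero, which fits the advice in the quote to make your pages "quite different from each other" rather than lightly reworded.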