Apparently, the rules are different for the "don't-be-evil" empire. Google says not to scrape content from other sites. It's bad, and it doesn't serve your visitors. Seems like a peculiar stance for a company that scrapes with the audacity Google does.
Purely scraped content, even from high quality sources, does not provide any added value to your users. It’s worthwhile to take the time to create original content that sets your site apart. This will keep your visitors coming back and will provide useful search results.
This is from a recent post on Google’s Webmaster blog, “Site content and use of web catalogues.”
Overall, it makes perfect sense, but it’s hypocritical because Google itself is the biggest “scraper” in the world. Forget search; this is an embarrassing posture given Google News alone.
Granted, I’m probably stretching the context here. Google, after all, provides links to the sources it scrapes from. But the fact remains, scraping content is not only very popular, it’s the basic building block for every major search engine, portal or Web 2.0 social-whatever-site like a Digg. Sure, sometimes it’s done manually by an individual, but it’s effectively the same.
Certainly, it’s not cool to scrape without citation. But with citation and linkage, scraping will probably work the opposite way Google suggests. Your visitors will keep coming back, which is good because they’re the ones who may actually buy whatever it is you’re selling - not Google.
Posted by Todd

