To Scrape or Not to Scrape

Written By Reprise Media | January 12, 2005 | 3 Comments

While it’s highly unlikely that Daniel Brandt was on Google’s Christmas card list this holiday season, we’re betting he’s off it now.

The founder of Google Watch – a watchdog organization that keeps an eye out for abuses of Google’s growing influence over the Web – has posted the source code for an application that allows anyone to scrape and serve Google search results, sans ads. (Brandt pioneered the technique on his own site, Scroogle.org, which Google has targeted several times in the past.)

Brandt raises some interesting points in this post about the purpose and legal ramifications of his actions. His basic point is that Google (and all web search, as a result) should be provided as a free public service, with no economic gain to be had. It’s clear that he’s carefully thought out the repercussions, and is intentionally trying to force Google to debate this issue in the courts. In his own words:


“The larger issue here is that the commercialization of the web became possible only because tens of thousands of noncommercial sites made the web interesting in the first place. All search engines should make a stable, bare-bones, ad-free, easy-to-scrape version of their results available for those who want to set up nonprofit repeaters. Even if it cuts into their ad profits slightly, there’s no easier way to give back some of what they stole from us.”

To be fair, we’re not quite sure who’s right in this situation, especially since we’re not copyright experts. Google has taken publicly available information about the Web, organized it, and made it more navigable. They’ve patented the tools they use to do so. And they’ve adopted innovative ways of displaying unintrusive advertising that users have widely accepted. Do they have a right to make money off those searches? Sure. But do they also have an obligation to enable others to provide their service for free, given that the information they’re crawling and organizing is all publicly available? Not sure. We wonder if Lawrence Lessig is going to chime in on this soon…

Google already provides an API into their search results “for non-commercial use only.” And their terms of use explicitly state that the API cannot be used to build a product that “competes with products or services offered by Google.” Is it possible for them to hold firm on that position and still claim their “Don’t Be Evil” corporate philosophy?

What do you think? Let’s get some discussion started on this issue.

3 Responses to “To Scrape or Not to Scrape”

  1. Derek Scruggs says:

    Google "gives back" by making it possible to find those "tens of thousands" of sites in the first place. If they don’t like it, they can opt out using robots.txt. Why isn’t Brandt going after Yahoo, Altavista et al? They have big indexes that use advertising as their revenue model, or is he just basing his argument on the amount of traffic they generate? This guy is just a crank.

  2. Karen says:

    Basic search on the web should be equal to search in a library…just there; no fee.

  3. Multi Search says:

    http://z.multiz.com does a good job of scraping search engine results at once. It seems to be a new site, but it could have potential.
