The virtual library

How the publishing industry is stalling Google’s attempt to put all the books in the world on the web.

Sandy Starr

Topics Science & Tech

‘The goal of Google Print is ambitious: to make the full text of all the world’s books searchable by anyone.’ (1)

So says Adam Smith of Google Print, an initiative that aims to index the content of books, make this content searchable online, and – where the book is in the public domain – make it readable online. (It remains to be seen how many people will literally want to ‘read’ a book in its entirety via the web, so the term that Google actually uses to refer to this capability is ‘browsing’.) An important component of this initiative is the Google Library Project, which is currently indexing the book collections of major libraries including the New York Public Library and the libraries of the University of Michigan, Harvard, Stanford and Oxford (2).

Google has already made it very easy to search existing internet content, while the Internet Archive, a massive project involving the Library of Congress and the Smithsonian, is doing a good job of preserving past internet content for posterity (3). But integrating the content of major libraries into the online environment would be a significant advance.

Google Print is just one of a number of recent high-profile archive projects. There’s the Creative Archive, an initiative set up by the BBC, the British Film Institute, Channel 4 and the Open University, to make print and broadcast content from their vast archives freely available online. There’s the British Library’s project to make over a million pages from nineteenth-century British newspapers available. And there are tentative plans for the European Union to create its own digital library, containing 4.5 billion pages of key works from libraries across Europe (4).

Much of this archive work relates to out-of-copyright material, while the Creative Archive makes content available under a specially devised licence (5). But Google has hit a snag in seeking to make the content of in-copyright books searchable through its Library Project. While there is no suggestion that in-copyright material will be made freely available to the public, just the fact that Google is scanning this material and making it searchable has caused disquiet in the publishing industry.

Things are made more awkward for Google by the fact that, although its work has a beneficent impact, it is in the unfashionable position of being a successful, profit-making corporation that derives revenues from advertising. Crude attempts by the company to square the circle of profit making while projecting a philanthropic image have only invited criticism (see Giddy over Google, by Sandy Starr).

The Association of American Publishers, the principal US trade association of the publishing industry, objected to Google scanning books in libraries and then using the content in commercial operations. It berated Google for ‘digitally reproducing copyrighted works to support Google’s sale of advertising in connection with its online search business operations without corresponding participation or approval by the copyright holders’ (6).

Google temporarily suspended the scanning of in-copyright books, and introduced a feature whereby publishers could stipulate titles that they did not wish to be scanned. But this was not enough to satisfy the Authors Guild, the largest society of published writers in America, which has now filed a class action suit against Google. The organisation claims that Google’s ‘unauthorised scanning and copying of books through its Google Library programme’ constitutes ‘massive copyright infringement at the expense of the rights of individual writers’ (7).

Google’s vice president of product management, Susan Wojcicki, responded: ‘The use we make of all the books we scan through the Library Project is fully consistent with both the fair use doctrine under US copyright law and the principles underlying copyright law itself.’ (8) The dispute has had technology and law geeks across the blogosphere scrabbling en masse to examine the intricacies of the Google Library Project, in the light of relevant legal decisions.

Parallels have been drawn with MP3.com being sued and shut down by the music industry in 2000, for offering its users remote access to music, provided that the users had already purchased the relevant CDs. MP3.com argued that all it was doing was giving its users the ability to listen to legitimately acquired CDs on the move. This cut no ice with the courts, which ruled that MP3.com had infringed copyright through the unauthorised copying and distribution of music, regardless of who had purchased or verified what (9).

It is far from clear what the outcome of this latest lawsuit will be. But it’s worth stepping back from the ins and outs of US jurisprudence to examine the broader relevance of the case. Digital technology, which stores and distributes content as discrete, quantised data, has made it possible for anyone with a computer and an internet connection to procure any form of content – books, music or films – for free.

Books are the easiest form of content to store and distribute, because text takes up substantially less memory than sounds and images. The only hurdle with books, especially those that were published before the advent of word processing, is converting their content to digital format. But this is precisely what Google is now investing considerable resources into doing, and the fact that camera phones are set to become high-precision scanners means that, before long, we could all be doing it (10).

The battles being fought to safeguard copyright, on the twin fronts of technology and law, are rearguard actions by the beleaguered content industries. As the commentator Siva Vaidhyanathan has argued, in light of the fact that ideas and content do not become scarcer as more people share and use them, ‘the fundamental purpose of intellectual property law is to create artificial scarcity’ (11). Whenever new technology highlights an ambiguity in law – such as whether US fair use doctrine permits Google to make in-copyright material searchable by the public, without making this content fully available to the public – industry desperately rushes to resolve the ambiguity in its favour, fearful that artificial scarcity will be replaced by digital abundance.

New technology for publishing and accessing content is always going to disrupt existing intellectual property arrangements and business interests, as has often happened in the past (12). Far from destroying creativity and making it impossible for people to be remunerated for their work – the scenario invoked by those who defend the status quo – such developments have helped to revitalise creative industries that had become too rigid in their thinking and business models.

Better for Google, or anyone for that matter, to test the limitations and question the assumptions of the law now, than wait until its innovation complies with the letter of the law as it stands. Because if you do that, you’ll be waiting forever.

As books go online, the common and over-egged claims that the internet and attendant archives are akin to the ancient Library of Alexandria could become one step closer to reality (13). Of course, an extensive library alone is far from sufficient to make a civilisation great, no matter what great works it contains or how easy it is to access – we need the wherewithal and the spirit of intellectual inquiry to make best use of our library, and to separate the wheat in it from the (ever-mounting) chaff.

But Google’s bid ‘to make the full text of all the world’s books searchable by anyone’ is an important and useful step.

(1) Making books easier to find, Adam M Smith, Google, 11 August 2005

(2) See Library Project – Common Questions, About Google Print, on the Google website

(3) See About the Internet Archive, on the Internet Archive website

(4) See All about the Creative Archive, on the Creative Archive Licence Group website; Old news is good news as substantial newspaper archive is planned for the web, British Library, 10 June 2004; Support for EU ‘digital library’, BBC News, 4 May 2005

(5) See The full licence, on the Creative Archive Licence Group website

(6) Google library project raises serious questions for publishers and authors, Association of American Publishers, 12 August 2005

(7) See Making books easier to find, Adam M Smith, Google, 11 August 2005; Authors Guild sues Google, citing ‘massive copyright infringement’, Authors Guild, 20 September 2005

(8) Google Print and the Authors Guild, Susan Wojcicki, Google, 20 September 2005

(9) UMG Recordings, Inc, et al v MP3.com, Inc, United States District Court, Southern District of New York, 4 May 2000

(10) Camera phones will be high-precision scanners, Duncan Graham-Rowe, New Scientist, 14 September 2005

(11) The Anarchist in the Library: How the Clash Between Freedom and Control is Hacking the Real World and Crashing the System, Siva Vaidhyanathan, Basic Books, 2004, p87

(12) See Google sued, Lawrence Lessig, 22 September 2005

(13) See The Creative Commons, by Sandy Starr

To enquire about republishing spiked’s content, a right to reply or to request a correction, please contact the managing editor, Viv Regan.
