Don't Be Evil
Since it was started in 1998, Google’s mantra has been “Don’t be evil”, perhaps with a sideways glance at another world-straddling software company that started out with guileless and youthful exuberance, and ended up as the evil empire. And certainly Google’s image is one of benign access, not control. We love the way they change the famous logo to celebrate holidays and significant events, and we all got nostalgic for their tenth birthday and marvelled at the clunky old designs, despite the fact that this pre-history only went back to 1995!
But I’ve been nervous about the fact that Google is buying up online archives, and embarking on their own digitisation programmes. Yes, there is a lot of stuff out there that needs to be preserved, but isn’t that the role of national libraries? I’m worried about the prospect of a private company owning so much of a nation’s heritage: even a non-evil company like Google.
Then the other day I heard about the case of a small newspaper archive site that disappeared after being vacuumed up by the Google machine:
Until about one month ago, this database containing at least 100 newspapers was available for researchers via the paperofrecord.com website and a very good search engine. Then, without any warning or explanation it disappeared or, more accurately, was whisked away by Google to be placed on their newspaper archive website. Information from various sources suggests that Google bought out the owners of the PoR database.
Unfortunately many, perhaps all, of the newspapers previously available cannot now be accessed or even found on the Google website. Google has not been very cooperative in offering advice or explanation as to when the newspapers will be returned for searching and access.
As of today, if you try to access paperofrecord.com, you’ll find yourself at the Google News page.
You can follow the story as it develops here.
Google initially replied that they were working on some technical issues to do with file formats, and that the content of the former Paper of Record site would be available soon. That was on 22 February 2009, but comments on the discussion thread suggest that access to all the newspapers has not been restored as of today (10 March). Caught up in this takeover are community groups, academics, and researchers, including at least one PhD student whose thesis is now stalled because they have lost access to their primary material.
The story got more bizarre when a respondent posted a link to www.worldvitalrecords.com, a genealogy pay site which claimed to archive a number of the newspapers that had only been on Paper of Record. The story continues:
To look at a newspaper, a subscription needs to be paid. This I did. I did a search on the Yukon World, one of the two Yukon newspapers once on the PoR website. My search on a name I knew would be in the paper turned up four hits. Great! I then clicked on one of the finds [sic] to “find out more” and what did I find? I was transferred to the same Google webpage one is taken to when you go to the Paper of Record database. And, just to make sure, I did a search on the advanced search facility on Google and came up with no entries for the Yukon World. Exactly the same outcome since the closure of the PoR website.
OK, apparently the Paper of Record site was a small, privately maintained archive that was built up as someone’s hobby, the result of pioneering work by a small technology company, and there was no guarantee that it would have remained online indefinitely. If national governments are not going to take seriously the preservation of cultural products like this, then organisations like Google will. I am not trying to suggest that Google’s activities in this regard are underhanded or deceitful, and until there is evidence to the contrary I will accept their statement that the delay is caused by “technical issues”.
But Google is fundamentally a profit-making venture, and arguments that they are collecting this data for the benefit of humankind are ridiculous. Eventually these archives are going to be asked to make a profit, and if Google (or any other company) asks for payment for access, there’s not much else a researcher can do but pay.
Update June 2009: It seems most of the content is now available.