Don't Be Evil
Since the company was started in 1997, Google’s mantra has been “Don’t be evil”, perhaps with a sideways glance at another world-straddling software company that started out with guileless and youthful exuberance and ended up as the evil empire. And certainly Google’s image is one of benign access, not control. We love the way they change the famous logo to celebrate holidays and significant events, and we all got nostalgic for their tenth birthday and marvelled at the clunky old designs, despite the fact that this pre-history only went back to 1995!
But I’ve been nervous about the fact that Google is buying up online archives, and embarking on their own digitisation programmes. Yes, there is a lot of stuff out there that needs to be preserved, but isn’t that the role of national libraries? I’m worried about the prospect of a private company owning so much of a nation’s heritage: even a non-evil company like Google.
Then the other day I heard about the case of a small newspaper archive site that disappeared after being vacuumed up by the Google machine:
Until about one month ago, this database containing at least 100 newspapers was available for researchers via the paperofrecord.com website and a very good search engine. Then, without any warning or explanation it disappeared or, more accurately, was whisked away by Google to be placed on their newspaper archive website. Information from various sources suggests that Google bought out the owners of the PoR database.
Unfortunately many, perhaps all, of the newspapers previously available cannot now be accessed or even found on the Google website. Google has not been very cooperative in offering advice or explanation as to when the newspapers will be returned for searching and access.
As of today, if you try to access paperofrecord.com, you’ll find yourself at the Google News page.
You can follow the story as it develops here.
Google initially replied that they were working on some technical issues to do with file formats, and that the content of the former Paper of Record site would be available soon. That was on 22 February 2009, but comments on the discussion thread suggest that access to all the newspapers has not been restored as of today (10 March). Caught up in this takeover are community groups, academics, and researchers, including at least one PhD student whose thesis is now stalled because they have lost access to their primary material.
The story got more bizarre when a respondent posted a link to www.worldvitalrecords.com, a genealogy pay site which claimed to archive a number of the newspapers that had only been on Paper of Record. The story continues:
To look at a newspaper, a subscription needs to be paid. This I did. I did a search on the Yukon World, one of the two Yukon newspapers once on the PoR website. My search on a name I knew would be in the paper turned up four hits. Great! I then clicked on one of the finds [sic] to “find out more” and what did I find? I was transferred to the same Google webpage one is taken to when you go to the Paper of Record database. And, just to make sure, I did a search on the advanced search facility on Google and came up with no entries for the Yukon World. Exactly the same outcome since the closure of the PoR website.
OK, apparently the Paper of Record site was the result of pioneering work by a small technology company (not, as I first assumed, a small, privately maintained archive built up as someone’s hobby), and there was no guarantee that it would have remained online indefinitely. If national governments are not going to take seriously the preservation of cultural products like this, then organisations like Google will. I am not trying to suggest that Google’s activities in this regard are underhanded or deceitful, and until there is evidence to the contrary I will accept their statement that the delay is caused by “technical issues”.
But Google is fundamentally a profit-making venture, and arguments that they are collecting this data for the benefit of humankind are ridiculous. Eventually these archives are going to be asked to make a profit, and if Google (or any other company) asks for payment for access, there’s not much a researcher can do but pay.
On a brighter note, the National Library of Australia is taking the digitisation programme seriously, and the early results can be found here.
Update June 2009: It seems most of the content is now available.
Your Comments
Matthew Smith writes:
Google has a conflict of interest when it comes to “Don’t be evil”. They need to make money to survive as a corporation and if dumping a heap of archival content on a digital scrap-heap somewhere is necessary, then they will have to do it. I don’t know if there is such a thing as a truly safe archive but I would think organisations that are created for the sole purpose of preservation and dissemination would be ideal. Sometimes we call these organisations “libraries”.
The NLA was/is involved in APSR which is a project we worked on when I was at UQ Library. This was a part of a worldwide effort within the library community to formalise ways of storing digital content in what have become known as digital repositories. Google Scholar indexes many of these repositories already thanks to the efforts of Google and the Open Archives Initiative.
Posted: 10 03 2009 - 12:56
Bob Huggins writes:
John—
A couple of things on your post. First, I founded Cold North Wind & PaperofRecord.com as anything but a hobby (please note disgust). Our company created technology that permitted the searching of image files from newspapers. The Toronto Star was the first newspaper to do so in 2001. Our company went on to pioneer the process through many countries including your own.
I too am disappointed at the goings on at Google. I pitched these guys in 2004 the ability to create a database of newspapers representing the past 500 years of historical daily life regardless of country or language.
My hope is they can get their shit together and keep the legacy, aligned with our original vision, alive. Time will tell.
Best
Bob Huggins
Founder
Cold North Wind
PaperofRecord.com
Posted: 11 03 2009 - 00:12
John writes:
Thanks for the clarification Bob. My knowledge of your company was limited to what I read in the discussion on the Google site, because—for the reasons I outline in the post—I could not find any details online. Please forgive my assumptions about PoR’s history and purpose: I have edited the original post accordingly.
And yes, let’s hope that Google eventually manage to get their act together.
Posted: 11 03 2009 - 02:08