Metadata: “Google Books is at Heart a Catalogue of Errors”
December 8, 2011 | 9:34 am
The article looks at Professor Geoffrey Nunberg’s 2009 comments about the poor quality of Google Books metadata and includes new comments by Nunberg about what he sees today.
In response to Professor Nunberg’s critique, Google offered to correct any errors that were brought to its attention. But while this process has ironed out specific glitches in the intervening years, Professor Nunberg does not believe it has made a fundamental difference.
“The changes are a drop in a greatly enlarged ocean,” he said, adding that the flaws in Google’s metadata remain “a big systematic structural problem”.
In the course of his research alone, he has continued to come across glaring errors similar to those he flagged up two years ago.
Professor Nunberg said he could not understand why Google scans in copies of books from major research libraries, where the details tend to be recorded correctly, and then turns for its metadata to far less reliable sources.
To patch up the huge problems would now require substantial time and resources. These were unlikely to be forthcoming, Professor Nunberg said, because, “like most high-tech companies, Google puts a much higher premium on innovation than maintenance. They aren’t good at the punctilious, anal-retentive sort of work librarians are used to.”
Nunberg’s final comments are right on the money, and not only because many catalogers do great work. During 2011 we’ve seen Google shut down several tools and programs that info pros and educators found useful. They might NOT have been the most used services, but that doesn’t mean they weren’t useful to those who used them. So the question is: does Google see any need to do a massive overhaul of book metadata to appeal to a relatively small group of users who might take full advantage of it? In other words, do most users care? Of course, the metadata can play a role in retrieval, and that could lead to sales. On the other hand, 1) would enough users to motivate Google care if they miss a book or two, and 2) millions of titles are in the public domain, so there is little to no chance of the metadata playing a role in the sale of those books.