Publishing exec: Would Google share its secret sauce, its search algorithm?
June 20, 2007 | 9:32 am
By David Rothman
First Macmillan CEO Richard Charkin merrily “stole” a Google laptop at Book Expo America to make his point about what he regards as unauthorized use of copyrighted books in the Google Library Project.
Now Evan Schnittman, a rights man at Oxford University Press, has come up with other comparisons, including this one:
“Even better, imagine Charkin had gone to one of the thousands of server farms that Google employs and convinced a few to allow him to extract Google’s sacred and vaunted search algorithm from their servers. His promise to them would be that he would only use the algorithm to make his business better at achieving higher search results and he would give a copy of the algorithm back to them so they could archive it. (You need to pretend for a minute that Google’s search algorithm is actually on servers that aren’t in their direct control.)
“Google’s reaction to this would have been swift and severely punitive. Google puts their search algorithm on third-party servers with the understanding that a server farm will have use and access to the algorithm (Google Intellectual Property) only in the manner that was agreed (keeping the search running, etc). The server farm would not have any rights to the algorithm – even though it came in the server that Google set up at the server farm. Copying the algorithm, even for the seemingly innocent purpose of archiving the algorithm for posterity would not be permissible in any way or form.”
OK, gang. Don’t just read the excerpt. Follow the link and share your thoughts, pro or con, on the Schnittman essay.
The TeleRead take, with an IANAL disclaimer: I can appreciate both sides of the argument, but for the most part would side with Google and the libraries. That said, libraries should not be able to interpret this as meaning, “Oh, good, we need to buy only one copy of a still-copyrighted book—and can share the archived copy with as many users as we want.” A difference exists between archiving and circulating. I’d welcome Google, librarians and publishers enlightening me as to the fate of the archived copies. How much do or will they compete with commercial endeavors? I’d hope very little.
As for indexing with sufficient protection to prevent typical readers from abusing the snippet concept, such a capability will help the book industry by enabling the discovery of otherwise-ignored books. If there are ambiguities, then if anything the law should be changed to clarify this in Google’s favor. See Tim O’Reilly’s arguments–he sits on Google’s publisher advisory committee.
Update, 1:50 p.m.–an email from Evan Schnittman (not posted in the OUP blog): “Like Tim, I too sit on Google’s Publisher Advisory Board… and I agree that we absolutely want Google, MSFT, anyone – able to index and make discoverable all the worlds information – see my two prior ABC of GBS articles for proof. That said, copyright owners need to be asked first.” His earlier posts are here and here. I assume that Google’s reply would be that the permissions process would make the indexing less practical.
Two disclosures that maybe will cancel each other out: (1) For retirement purposes, I own a tiny, tiny slice of Google—although, as readers of this blog know, I’ve aggressively bashed Google at times for not presenting books in a PDA-friendly way. I have also complained about Google possibly preempting libraries and gaining too much power over publishers. In terms of my blogging, then, the stock ownership matters squat, but I want to be open about it. (2) While I support Google’s right to index with sufficient precautions in place, I also think that the publishers have a right to do their own indexes and archiving, and I actually would applaud the creation of an industry-level archive, ideally in partnership with the Internet Archive (in fact I vaguely recall some discussion of this). An endeavor in which the now-embryonic LibraryCity (in which I’m involved) might play a role?



Comments:
“That said, libraries should not be able to interpret this as meaning, “Oh, good, we need to buy only one copy of a still-copyrighted book—and can share the archived copy with as many users as we want.” A difference exists between archiving and circulating.”
When I read that statement the first thought that comes to mind is why?
Our society as a whole is undergoing an evolutionary/revolutionary change in one of the pivotal factors of our media: time, or, more measurably, iWhen (yes, I am going to burden us with a new i-word).
In 1984, if I paid to view my television shows (paid the cable company) but wanted to watch them later, I bought a VCR.
In 2007, if I want to do the above, I can buy a DVR and pause or record a show and watch it when I want. It’s more convenient, and it’s what people seem to want despite it allegedly crushing the advertising stream of TV.
In 1984, if I wanted to read a specific author’s book, I could either buy a copy or go get it at the library. While they may ‘own’ a copy at my library, much like ABC may own an episode of Lost, until that ‘owner’ wants to present it to me by either broadcast, or providing me the physical copy of the book, I can’t have what I want when iWant. Once I have the book, in most cases I must then read it within the time allotted by the ‘owner’ of that item. I have to do both on their times.
In 2007 there is no need for a physical copy of the book. A library could let you check out a digital edition and could allow you in theory to keep that copy as long as you like without penalty. Instead of being locked into reading it right now and within x days, I can check it out and read it whenever and not be negatively impacting other people at the library.
I can see the reverse of this too: if all someone had to do was buy a Sony Reader, go to the library and download the stacks, then I can understand why the copyright owners would be upset. In that case those book sales may be lost. However, as anyone can tell you, libraries have been around for centuries and yet books still sell.
Is it just me or do all media sources need to re-evaluate their place and their financial liabilities in the iWorld?
Many thanks for your interesting thoughts, Richard. I’d heartily approve of library patrons being able to keep books–through purchases or through a concept mentioned in this blog in the past: permanent checkouts, with fair compensation for publishers who were involved in this business model. Permanent checkouts would address your iWhen concerns. But just let a library buy one copy and spread it to the far corners of the planet? Absolutely not. Such an approach would hurt libraries, among others, by undermining their content sources. Thanks. David
What is the difference between permanent checkout and buying one copy and spreading it to any willing patron? I would take it to be that in a permanent checkout system the library pays an extra fee to the publisher per checkout. What about the patron, would it pay the fee or would it be included, or a mixture (up to x titles a month free…)?
But then why do it through the library, and not through some other “pool” of people, if the upfront cost is one copy? After all, the essential historical purpose of a library is to spread the upfront cost of a book or other type of content, like newspapers or nowadays movies, across a pool of people who finance said library (whether property owners, taxpayers in general…), so enabling said pool of direct and indirect payers to afford access to vastly more content than they could individually, subject to some restrictions of convenience. Until now physical restrictions of space and location determined the library model, but for e-books such constraints vanish…
Hi, Liviu. Permanent checkout would be limited to one patron (and perhaps family). The extra costs could come from the library budget. Perhaps there would be a quota to keep things under control–X number of books per year. Beyond the quota, there’d be straight purchases. Since there would be restrictions on the permanent-checkout books, I’d hope that publishers’ charges would be reasonable. By the way, if I recall correctly, OverDrive already lets library patrons burn borrowed audios into nonshared CDs. Thanks. David
Maybe I’m missing something, but what is wrong with simply conceding that the way Google asserts its right to use other people’s intellectual property clearly would make it difficult for them to oppose disclosure and archiving of their own “algorithm” along the lines proposed here? (I put that in quotes, because I assume that Google’s “algorithm” is not contained in some simple neat formula — to some extent I don’t think it’s possible to do what the thought experiment requires, but let’s set that aside for the moment).
One of the things that is happening here is that the way we interact with information is radically changing, led in no small part by companies like Google. Since Google’s “algorithm” is simply information, why should it be any different?
@David, I enjoy your blog; it provokes thought on rights issues, not flame wars.
I guess the point I am getting at is this. The only difference between a library having a physical book to check out to a patron and an electronic one is the limitation of the physical material: it can only be in one place at a time.
So if I have patience I can read that author’s work for free. Granted, I can’t own it, but I can consume the content. If you go from physical to virtual, you eliminate the wait, and while virtually there are more copies, in the end you’re simply getting the impatient people who can’t wait for the library to get the book electronically instead of purchasing a copy.
I would have no problem whatsoever compensating an author of any work a reasonable amount for their work. While I have never had anything published (I did write a few chapters for a Que book on IE years ago), it’s my understanding that for the average author the royalties on a book are similar to the royalties a musician sees from a CD. In other words, both get a pittance of the price: less than 10%.
Would I balk at paying $3-6 for a work of fiction, say from Koontz (with no DRM, or with transferable DRM so I could buy new devices in the future and not have to buy a new library), if I knew that 80% or better made it to the author? No!!! I would gladly do it!
Of course all that other money normally goes to advertising and printing the physical copy and to the publisher. Eliminate the publisher and let someone Google for books on the Yeti and who needs advertising? Let Lulu print a hard copy for a fee if someone really wants it. But if people could search for a book by its content then what you’re really talking about here is the death of the advertising and the physical media. Which means the end of publishers who are the strongest opponents of Google’s project.
Most of the debate about this tacitly assumes that book publishers ‘ought’ to stay in business. But there are already more books in existence than anyone can read in a lifetime. Publishers confer no advantage to anyone by adding to what is already a massive oversupply: and we can see this by the desperate ways in which they attempt to block access to older books – copyright suits and single-use restrictions and pulping and ‘dead’ backlists. I’m not talking about non-fiction, where there will always be a demand for new material. But do you really benefit by adding a new novel to the five million you haven’t read already? If I had to choose between a) getting access to all the novels already written, or b) financing publishers’ attempts to issue new ones, I would opt for a) without hesitation. Fix copyright law and let the (fiction) publishers go hang!
Richard: Thanks for your further thoughts. I appreciate your interest in seeing better rewards for writers; I, too, wish royalty rates were higher. Where I’d disagree, apparently, would be in relying so heavily on libraries as promotional vehicles for authors. They are, and I like that. But that model isn’t robust enough for writers or publishers.
Money counts, and I, of all people, should know. My compensation from the TeleBlog is virtually $0 unless you include an unsolicited $250 donation from a kindly, education-minded soul who believes in what I’m doing. I run the TeleBlog because I see the need for it and am compelled to. Miraculously, the blog does find an audience—reaching more people on the Web than a famous library journal does online. What’s more, in sheer numbers on the Web, we crush bookstandard.com even if we’re a long way from nytimes.com. Thanks to Branko Collin, Robert Nagle and others who have helped make this possible.
At the same time, for me, the TeleBlog is as much an exercise in learning, especially from thoughtful commenters like you, as in self-expression.
But I can’t go on forever like this, and you can bet I empathize with the novelists who work without money up front but who hope that at least some rewards may materialize. In fact, I have fiction I’m marketing now. While I’m driven mainly by love of writing, I won’t deny the financial incentive. Copyright exists here in the States to serve the public, not writers, but what better way to do so than to keep in mind the old Johnson quote: “No man but a blockhead ever wrote, except for money”? That means providing sufficient incentives within reason. So, again, thanks for your appreciation of the M word in regard to writers. Just understand that promo rewards are not enough, as I see it, especially when e-books become the norm.
Both Richard and Jon: Good publishers do add value through intelligent editing (directly or by selecting the right contractors) and promotion (same concept). What’s more, they have backlists they maintain–lovingly, if the publishers are good. Also, remember that not everyone has Jon’s priorities. Fiction, including the modern variety, should matter, and I’d like to think that fiction-minded librarians would have allies in like-minded publishers. They are both from a book-oriented culture, even if not everyone on the Net is.
Within this culture, simply as a recreational reader, not a literary critic, I can see the usefulness of fiction in telling truths that can’t be told through nonfiction. Have you ever heard of Look Homeward, Angel, by Thomas Wolfe—about his small town of Asheville, North Carolina? In this celebrity-minded urban society here in the States, such books are not as much in vogue as they once were, but to me that’s what novels can do, especially in this litigious age when full-truth nonfiction could lead to bankruptcy.
Other readers will have their own Look Homeward, Angels and modern equivalents–for instance, within Jewish-America, Philip Roth. I just can’t obtain the same satisfaction from nonfiction that I can from the best works of writers like Wolfe and Roth. Right now I’m also looking forward to reading Michael Chabon’s what-if on a Jewish settlement in Alaska. But mostly I prefer to read older writers than Chabon.
At the same time, some in the literati must be shaking their heads in disgust, asking me why I’m not mentioning not just Chabon but also other younger writers, especially Afro-Americans, Asians and Hispanics, including some in their 20s. And I’ll rejoice that they’re doing so. They are demonstrating that fiction is vibrant, that it changes to reflect changes in society at large.
To return to the P word, the best publishers and bookstores are like seismometers and will seek out good novels that reflect the zeitgeist. Granted, they’ll publish and sell dreck to please shareholders. But perhaps that’s as much a reflection on the market—on the school systems here in the States and elsewhere—as on the corporate owners. I think we’re much better off with fiction-loving publishers and bookstores, trying to ferret out the next Thomas Wolfe or Toni Morrison, than simply turning all of publishing into one big YouTube and having the masses vote on everything. Publishing, at its best, especially at small houses, is a form of creative expression just as writing is. I’d hate for posting to replace publishing. Let’s cultivate the best of all the business models, old and new.
Meanwhile, I think this is a most useful dialogue. If nothing else, it should send a warning to publishers, particularly of modern fiction, not to take their existence for granted. If they don’t add value and adjust to an increasingly Net-oriented society, they will perish.
Thanks,
David
Addendum, 9:06: I should have mentioned that Thomas Wolfe (not to be confused with the contemporary Tom Wolfe) is the classic example of a writer helped by a connection with the right editor, Maxwell Perkins—chosen by the right publisher, Scribners. Not so coincidentally, Perkins also edited Scott Fitzgerald and Ernest Hemingway. More than chance, eh? Imagine where 20th century U.S. literature would have been without Perkins, who, by the way, thrived on the routine that Scribners made possible.
Brian: Fascinating argument! But I’m not sure if the algorithm/book comparison holds up. Release the algorithms and the rest in full and you’re giving out secrets that could harm Google’s earnings. The book indexing, on the other hand, could actually help publishers’ revenue, at least if archiving isn’t the same as publishing. Usual “I’m not infallible” disclaimer. Thanks. David
The comparison between algorithm/book is nonsense for a simple reason. The book is available somewhere or there is the possibility of making it available again (otherwise the publisher gets nothing anyway), you just have to buy it, borrow it or browse it and you get the content which is its end, its reason of existence.
The publishers do not object to me reading the book, they object to me reading the book without paying them directly or indirectly.
The algorithm is secret, its end is to power Google and there is no way to pay a small sum for it and have it available… Google objects to people reading the algorithm period. Not reading/using it without paying them…
If the comparison would be to commercial software the point would be more valid (though even so there are significant differences), but the secret part is really annoying. Secret books, ha, ha..
If someone finds a way through Google to steal (or at least to reconstruct) an entire book or significant portions of one, would a publisher be entitled to sue for damages?
It seems dangerous for Google to accept this liability without receiving any commercial benefit. I’m glad Google is doing this; they are within their rights. I guarantee you that the first time they are sued for promoting piracy, they will shut the application/library down. If this feature gave Google commercial benefits, then I could understand. But now, the commercial benefits seem to be minimal while the liability risks seem enormous.