I’m baffled why Amazon readers are giving just three out of five stars to An Arsonist’s Guide to Writers’ Homes in New England. Not everyone will love the Guide’s quirks, but I do. A bumbler named Sam Pulsifer accidentally burns down the Emily Dickinson house, killing two and bringing out the inner arsonist in other losers.

National Public Radio called the Guide “captivating”—read an excerpt to get a quick feel for Brock Clarke’s blend of charm and mirth—and I agree, despite major issues with characterizations.

Many memorable novels abound with flaws, and yet you still might regard them as keepers. I’ve just finished a paper copy from the Alexandria, VA, library and need to get it back before I incur a fine, but oh how I’d love to be able to read the Guide whenever I wanted.

What if, however, I could legally keep an e-book edition of the Guide and other library items I liked, up to a certain number per month or year? And suppose that the quotas favored books over other media, one way to promote literacy? That’s the “permanent checkout” concept, a way to wean libraries off an overdependence on the Rube Goldbergish approach of Digital Rights Management that besets patrons today, especially those with limited technical skills.

I could not sell a permanent checkout of the Guide or any other book, video or audio. But perhaps I could share it with my family, and I could at least point friends to the same item for them to access from their libraries—under these terms or others.

A mix of access and financial models, please

Permanent checkouts are just one of a mix of business models and related technologies that I’ll discuss here as a way to help public libraries survive the digital age. Books are my main interest, although the models in one form or another could apply to other content.

  • Access-related models for digital content could encompass permanent checkouts, including books that arrive via e-mail in their entirety or in sections; timed browser-based access; a mix of traditional DRM and timed access for patrons willing to put up with the infuriating complexities of “protection”; and unlimited access, made possible through appropriate compensation paid to content creators. First I’ll lay out the problems of DRM, then examine the alternatives.
  • Financial models could include the current approach, of mostly local support, as well as a TeleRead-style national digital library system with national funding to mitigate the “savage inequalities” of geography and even allow certain books to be released without any limits on their use, if content providers took lump-sum payments or agreed to other terms. Another possibility is a library-adapted collective licensing approach similar to what the Electronic Frontier Foundation has suggested for music. I can also see room for the Creative Commons model, under which, for example, writers could voluntarily make material available for noncommercial use under certain conditions. And if the material is out of copyright and in the public domain, without original content added, or with the consent of providers of added content, then libraries should not have to pay anything. In addition, there is the possibility of books and other library content supported by private or corporate patrons. Finally, consider the possibilities of ad-supported books, with proper precautions in place to reduce the possibility of undue intervention by those underwriting them.

In both the access and the financial areas, libraries should not confine themselves to one approach but rather should make decisions based on the needs of their patrons—within the bounds of the law and common sense. Even if the copyright system let a library system give away complete books online against the wishes of writers and publishers, the system’s supply of commercial content would dry up, and patrons would lose out in the end.

Examples of the advantages of the mixed approach

But why such an emphasis on mixed models in the access and financial areas alike? That is because patrons’ preferences and needs will vary, as will the preferences of content providers and the kinds of material that one model will favor compared to another. For example:

  • Someone who loved an individual book like the Guide could opt for a permanent checkout, knowing that he or she could enjoy just a limited number of items this way. A simple solution exists for obtaining more books to keep: go to a bookstore or download an ad-supported edition if one exists. To assure a wide range of content and to protect freedom of expression and private-sector innovation, I do not want libraries to replace all bookstores and ad-supported alternatives.
  • A patron with access to municipal WiFi but with limited technical skills might value well-executed, Web browser-based viewing for reading books at home or in a park—or maybe books by e-mail. On the other hand, someone with more patience for the hassles of DRM might prefer a locally stored file that he or she could read on a PDA away from an Internet connection.
  • Very possibly a well-established publisher or writer would prefer payment by the number of accesses of a title, but a new novelist might be thrilled by the prospect of a $20,000 fee in return for the unfettered distribution of his or her work.
  • Because of the dangers of political interference, a nationally funded library system might not be able to carry the full range of current-events books—although I’d hope it would!—but individual public libraries could route around this through locally originated rights arrangements, as well as judicious use of advertising supported books with quality-control and integrity-related mechanisms in place.

More on access models: Why we badly need alternatives to the current DRM-centric approach, a disaster for readers, libraries and vendors alike

The access issue for digital content is far, far more complex than for the physically distributed variety. Only one person at a time can read a paper book, for example, because it is a tangible object, whereas digital items can be shared globally for downloading by millions. For paper books, libraries have come to rely on the model of limited access with expiration.

Conventional wisdom is that DRM is needed for libraries to make their offerings available online. Vendors and many librarians see a mix of DRM and timed expiration as a way to vastly expand the number of items available. Libraries either rent access to e-books or buy them with the understanding that there will be access limits—just so many patrons using them at once. Many librarians have trouble envisioning access models other than expiration-based ones, given the parallels with the paper-related models.

Few technologies, however, have drawn as much hatred among borrowers and buyers of books as DRM has; consider the unhappy experiences of a young teacher in Toronto, another Canadian e-book user, and a New York booklover. Many DRM haters would argue against technology being used to create scarcity; still, with unfettered access as the only library model, typical publishers in time would go out of business. I can understand what the publishers are saying. Do we really want a venerable house like Knopf or Oxford University Press to fail eventually, just because it is technically possible for books to go online for free? In areas from editing to promotion to marketing, good publishers can add value, especially for writers who would rather focus on their work than on business details best left to publishers and agents. One way or another, writers need to be paid, and DRM advocates love to point that out.

At the same time, however, I see DRM as both a revenue and a literary toxin—no small reason why companies such as OverDrive have not lived up to their original revenue expectations for library e-books, why digital book sales are only 1-2 percent of paper book sales, and why lovers of literature correctly worry about the seriousness of e-books as a medium, given the risks to the permanent accessibility of books if publishers or technology providers fail or if they change their electronic formats. Consider the complications that ensued after the failure of Gemstar’s operation providing DRMed books. Amazon’s brazen herding of customers from the Adobe format into its own Mobipocket format, moreover, followed by similar efforts to play up the Kindle format via bargain-priced bestsellers, is another example of the perils here.

Small wonder that many book buyers shy away from DRMed books, and the same would be true of library goers put off by the need to familiarize themselves with a certain company’s technology—only to find that they must install yet more software to read books with other DRM systems. Yes, if nothing else, DRM adds to the eBabel problem: format clashes that prevent X book from being read with Y software or Z dedicated e-book reader. Proprietary DRM can prevent even nonproprietary formats, such as the International Digital Publishing Forum’s ePub standard, from being read. The same concept could apply to media besides books.

In the area of books, at least, Amazon is hoping to get around the DRM compatibility problem by popularizing its Kindle format and touting it as a seamless, universal solution, so that, for example, readers can download DRMed books without even being aware of the technology’s being in use. The downside is that many readers want to consume books on a variety of devices, such as cellphones—not all of them from Amazon.

As for the IDPF, tech companies could adopt a standard form of DRM or at least allow for interoperability among proprietary systems. A Catch-22 exists, however. For a standard DRM approach to succeed, say some experts, technical secrets need to be kept—a challenge if many companies are to be involved with a standard. As for DRM interoperability, that still won’t entirely solve the problem, since many individuals will still be put off by the inherent technological complexities created by the use of various kinds of hardware and software. DRM advocates are welcome to challenge me on that. If nothing else, however, I believe that vendors of DRM need competition from other forms of limited access, so that “protection” experts won’t be smug about usability issues. Libraries owe it to their patrons to encourage this competition.

More on permanent checkouts and other access-related models

I’ll welcome others adding to the list of patron options to be considered:

  • Permanent checkouts, with quotas to avoid libraries’ being considered as bookstore replacements—unlikely anyway in this era of limited library budgets.
  • Books by e-mail that arrive in full or in sections.
  • Timed browser-based access.
  • Traditional DRM with timed access—the status quo at many libraries.
  • Unlimited access, globally or perhaps within a certain area based on IP numbers—not a perfect solution but something to consider just the same. This could happen as a result of lump-sum payments by libraries to writers, publishers and other content providers, or perhaps there would be some kind of metering. Some books might be available in more recent or more complete editions through other models. Into this category, by the way, I’ll lump ad-supported books.

Permanent checkouts: It sounds radical, the idea of being able to keep a book, an audio or other library item forever. But at least one major provider of library media, OverDrive, is laudably allowing patrons to burn CDs from downloadable audio files in at least some cases and keep these items forever without time limits. Hats off to OverDrive for blazing the way for permanent checkouts, even if this is happening by accident. Significantly, once the sound is on CD, patrons can enjoy the audios in a number of formats and, as far as I know, without worrying about DRM. Understandably, OverDrive makes it clear that patrons are not to share or sell the audios.

I’m simply suggesting that we extend the same idea to e-books. From a literacy perspective, the idea holds much promise. Children and others want to make books part of their lives and be able to keep them—without fear of fines: no small reason why some low-income families do not use public libraries. With public domain and Creative Commons books, this can happen, but right now there is no way for public libraries to make this alternative available for commercially published books—the overwhelming majority of public library books available, not to mention the preference of the typical patron. With the permanent checkout model as one option, vendors could benefit, too—in terms of the popularity of their offerings. Conveniently downloaded e-book files, using the IDPF’s ePub format and perhaps other nonproprietary ones such as ASCII and HTML, would make e-books easy to enjoy on a variety of devices, so long as DRM didn’t enter the picture.

Rather than being infested by DRM, permanent checkout books could use social DRM. Coined by an Adobe executive named Bill McCoy, based on the success of The Pragmatic Programmers with this approach for technical books, the term social DRM means embedding names or other information into a book to make it less tempting to post on a peer-to-peer network. No encryption happens, nothing inherently dependent on a particular operating system or proprietary software. So social DRM can be used on a wide range of books, reducing support costs.
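To make the idea concrete, here is a minimal sketch of what a social-DRM stamp could look like; the function name, file layout and wording are my own invention, not any vendor’s actual implementation. Note that nothing is encrypted—the stamp is plain, visible text:

```python
# Sketch of "social DRM": personalize each copy of an e-book with a
# visible, unencrypted provenance notice. (Hypothetical example only,
# not any vendor's real implementation.)
def stamp_ebook(html: str, patron_name: str, checkout_id: str) -> str:
    """Insert a visible provenance notice just after the <body> tag."""
    notice = (
        f'<p class="provenance">Checked out permanently by {patron_name} '
        f'(checkout {checkout_id}). Please do not redistribute.</p>'
    )
    # One replacement only, at the first <body> tag.
    return html.replace("<body>", "<body>\n" + notice, 1)

book = "<html><body><h1>Chapter 1</h1><p>It was a dark night.</p></body></html>"
stamped = stamp_ebook(book, "Jane Reader", "LIB-0042")
```

Because the stamped file involves no encryption, it stays readable in any browser or conversion tool; the stamp merely personalizes the copy, which is the whole point of the social approach.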

Books by e-mail: This would be a variant of permanent checkout, in that the whole book or installments would still remain on the recipients’ machines. DailyLit is selling e-mailed books that arrive in sections. That would be one way, beyond social DRM, to reduce the possibility of piracy.

Timed browser-based access: For the many patrons who dislike the complexities of DRM, libraries could provide browser-based access to books on either the institutions’ own servers or those of vendors such as OverDrive, a more likely possibility.

If libraries used straight HTML-based viewing, even modest little PDAs with limited browsers could work just fine. No DRM would be necessary in this situation, since the patron would be seeing just limited portions of the book at once. Yes, patrons could stitch together chunks if they wanted. But it would be more trouble than it was worth.
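As a rough sketch of the chunk-at-a-time idea (the function names and chunk size here are arbitrary choices of mine, not any library system’s actual API), the server would split the book once and hand out only one slice per request:

```python
# Sketch: chunked HTML viewing, so only a small slice of a book sits
# in the patron's browser at any one time. (Illustrative only.)
def split_into_chunks(text: str, words_per_chunk: int = 300) -> list[str]:
    """Split a book's text into fixed-size word chunks for paged viewing."""
    words = text.split()
    return [
        " ".join(words[i : i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]

def get_page(chunks: list[str], page: int) -> str:
    """Return one chunk wrapped as minimal HTML; out-of-range pages are empty."""
    if 0 <= page < len(chunks):
        return f"<div class='page'>{chunks[page]}</div>"
    return "<div class='page'></div>"

book_text = ("word " * 1000).strip()  # stand-in for a real book's text
chunks = split_into_chunks(book_text)
```

A determined patron could still save every chunk and reassemble the book, but, as noted above, that would be more trouble than it is worth for most people.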

One other possibility, moreover, is the use of a Web-based viewer such as the already-tested EB20 technology used by eBooks.com to help people look over possible purchases—or read already-bought books in full. Click on the image for a more detailed view of EB20 in action. Significantly, Stephen Cole, eBooks.com managing director and founder, tells me that he was able to obtain publishers’ consent for whole books to be displayed this way. What’s more, the technology appears to be popular with the store’s customers, judging from what he’s told me—this could be just the ticket to get people into e-books without their bothering to install reader programs.

Ideally libraries could give people a choice of straight HTML-based viewing or viewing with an EB20-style add-on.

The disadvantage of browser-based access, of course, is that you’d need a reliable Internet connection, whether at home, at work or via public WiFi. But most reading of library books happens either in the library, where WiFi is often already present, or at home.

Use of traditional DRM with timed access: No, this isn’t my favorite access approach, but it should be an option for those wanting it. The big downside would be the tech-support hassles that DRM can lead to. Also bear in mind DRM’s typical device limits, which will be more and more burdensome as consumers acquire a variety of devices. Ideally, DRM access could happen simultaneously with browser-based access or the e-mail variety, so that if a patron ran into technical problems, he or she could immediately switch over to these alternatives.

Business models

Financial models discussed below are:

  • Support mostly from local tax revenue, such as from property taxes—with some state and federal aid.
  • A TeleRead-style national digital library system with national funding to mitigate the “savage inequalities” of geography.
  • A library-adapted collective licensing approach similar to what the Electronic Frontier Foundation has suggested for music.
  • The Creative Commons model, under which, for example, writers could voluntarily make material available for noncommercial use under certain conditions.
  • Public domain. If the material is out of copyright and in the public domain, without original content added, or with the consent of providers of added content, then libraries should not have to pay anything.
  • Books and other content paid for by private or corporate patrons, whether sponsors or advertisers.

The current approach of mostly local support: We know the big negative—the vast differences in tax revenue available in different locations from property taxes and other sources. One California county some years ago was spending just 25 cents per capita annually on new books and other acquisitions. The big positive is that local and state support is one way, among others, to assure that Washington cannot control our reading habits to accommodate powerful pressure groups with money or disproportionate political power.

A TeleRead-style national digital library system: TeleRead could reduce the inequalities created when libraries rely almost exclusively on local and state funding. For reasons outlined above, I would not want federal tax money to be the sole source of library funding. TeleRead, as noted earlier, could provide for complete purchases of rights from authors and publishers so that their offerings could be distributed and otherwise treated as if the works were public domain. Either lump-sum payments could be made, in line with private-sector practices in the 19th century and earlier, or else payments related to metering of downloads. TeleRead could also pay writers salaries in some cases, such as for the creation of routine educational material or WPA-style guidebooks. None other than Tracy Chevalier, chair of the 8,500-member Society of Authors in the U.K., has broached the possibility of an academy of salaried writers, as reported by The Times (March 31, 2008). It says she “envisions four possible sources of income…the Government, business, rich patrons and the public,” a good mix as I see it, and a TeleRead-style national digital library system could be a government-related source.

Certainly a TeleRead-style approach would be one way to vastly increase the money now received by content creators, and even if costs reached or exceeded $100 billion a year in the U.S., that would be just a fraction of America’s Gross Domestic Product of more than $13 trillion. The literacy and technological benefits—such as promotion of broadband technology, itself a wealth-expanding phenomenon, given the wider range of online services made possible—would more than make up for the expenditures. One possibility to reduce the impact on the federal budget would be subscription fees that, like income tax rates, increased according to income; the poverty-stricken, in fact, might pay nothing. Perhaps people could even make subscription-fee payments through check-offs on tax forms, a way to reduce paperwork. TeleRead, which goes back to the early 1990s but has constantly evolved through discussions in the informal, grassroots-oriented TeleRead blog, is the topic of the final chapter of Scholarly Publishing: The Electronic Frontier (MIT Press, 1996).
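For illustration only, the income-scaled subscription idea might be sketched like this; the brackets, rates and poverty threshold are invented numbers, not a proposal:

```python
# Sketch of income-scaled library subscription fees. All figures are
# invented for illustration; the real brackets would be a policy matter.
POVERTY_LINE = 15_000  # hypothetical threshold, in dollars

def annual_fee(income: float) -> float:
    """Return a yearly subscription fee that rises with income."""
    if income <= POVERTY_LINE:
        return 0.0             # the poverty-stricken pay nothing
    if income <= 50_000:
        return income * 0.001  # 0.1 percent of income
    return income * 0.002      # 0.2 percent for higher incomes
```

As with income tax rates, the point is simply that the fee schedule can be progressive, so that the system expands access rather than walling it off.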

Collective licensing: The EFF’s proposal, if adapted for libraries, could allow libraries along with individuals to pay lump sums in return for usage rights. Perhaps a group representing creators and publishers/distributors would handle the collection and distribution of the payments. The challenge here is to make certain that publishers, studios and other companies do not rig the system to favor themselves at the expense of individual writers, artists and others.

The Creative Commons model: Writers and others would allow libraries to make noncommercial use of material. The problem here is that too much “free” could displace “paid.” While useful as a model to augment others, the Creative Commons one by itself will not suffice. As effective as this model can be at times for promoting traditional commercial books, professional writers still need to have enough of the latter to earn a living and do their best work.

The public domain: The beauty of the public domain is that libraries’ content costs would be next to nothing, and patrons would be able to keep PD works without any encumbered content added. Provisions could even be made, via TeleRead or otherwise, for forewords, annotations and other ancillary content to be created with payments up front.

Support by patrons, sponsors or advertisers: Libraries would be selective in their use of such items. Many questions arise. Do we want libraries turned into billboards, for example? Or could low-key credits and not-so-intrusive ads suffice? Among ad-supported items, might libraries focus most of all on those unrelated to the patrons’ and sponsors’ activities? Libraries could and should avoid letting an oil company sponsor a book on solar power; a guide to Dickens, however, might be an entirely different question. There might even be a centralized advertising system so that donors and advertisers did not even know ahead of time which books they would underwrite (although they would have an opportunity to withdraw if they felt uncomfortable with a particular title). With a mix of models in use, influence by advertisers or sponsors would be far, far less of a threat than under reliance on this one model alone. What’s more, bear in mind that libraries already carry ad-supported magazines. The key here is for librarians to favor library users’ needs over those of the benefactors, sponsors and advertisers. If this means less revenue, then so be it.

More on what this multiplicity of models would mean to patrons

Earlier I mentioned the differing needs of library users, in terms of access and financial models. Why, in fact, confine users to any one model in either area?

A patron might even be able to switch models for a single book—progressing in these stages, in cases where he or she had already taken out the allowable number of books under the permanent checkout model:

(1) A look at the first chapter of a book or other item via a Web browser.

(2) Checking out the book for a longer period of access via the Web or a DRMed file.

(3) Enjoying an ad-supported version of the book.

(4) Choosing from a list of bookstores to get a version of the book without ads.

To address the general issue of user choice of access and business models, let’s keep in mind S.R. Ranganathan’s five laws of library science: books are for use, every reader has his [or her] book, every book its reader, save the time of the reader, and the library is a growing organism (language picked up from Wikipedia). If libraries can show flexibility toward paper books, shouldn’t they do the same toward the digital variety?

A common cause of libraries and their traditional adversaries?

I hope that public libraries and content providers alike will consider this mixed approach as one way to make books and other items available for free to the growing number of people who, in the Internet era, are accustomed to not paying for content. Furthermore, many of the same concepts here would apply to academic libraries and others.

Imagine the many tens of billions of dollars that publishing and other industries could enjoy—while the libraries and society as a whole benefited. Perhaps the different sides should spend less time fighting with each other over legislation and more time working together on ways to grow revenue and resources.

Note: This is a living document, and I’d encourage people to make comments or e-mail me with suggestions.

10 COMMENTS

  1. It’s hard to do justice to such an involved post in comments, so I’ll start making specific points in individual posts. However, most of my comments are going to be along the lines of “while it sounds good in theory, in practice…”

    So first off – Collective Licensing.

    The EFF proposal calls for this to be voluntary for both the consumer and the publisher. Unfortunately, this would fail in practice unless all of the publishers were part of the “voluntary” collective. If there were any content that was not part of the collective licensing agreement, then it would place a burden on the users to determine whether the content that they just received or decided to share was covered or not. However, given the sophistication of most end users regarding copyright issues, this just isn’t going to happen. While this may push most holdout publishers to become part of the voluntary collective in order to be compensated (how voluntary is groupthink anyway? :) ), the remaining holdouts will end up being an RIAA redux, with lawsuits and DMCA takedown notices creating the same nightmare that’s currently happening with music, the very nightmare that collective licensing is supposed to fix.

    Also, the same preconditions for the music industry do not hold for the publishing industry. Music compensation is largely homogeneous, but unfortunately there is a larger degree of variance in pricing in the book world (just think of the large segments: current fiction, textbooks, industry texts/reports, for starters).

  2. Social DRM is unfortunately unworkable with current technology.

    Assumptions: the user’s reason for social DRM is to have the content non-encrypted (unlike regular DRM), so that the user can move content from format to format and machine to machine as the user desires/needs. The publisher’s reason for social DRM is that the watermark will contain sufficient personal information to discourage the user from sharing (account/credit card/Social Security number, etc.) or, barring that, will have sufficient information to track down the original infringer.

    Unfortunately, current watermarking techniques are fragile and don’t provide publishers sufficient safeguards, since the marks are too easily removed. First, the problem with the obvious technique of embedding the watermark in the content is that it is easily stripped with most editors. There are more advanced steganographic techniques, but these use presentation information of the more complex formats (PDF, Word, etc.), and any format transition, either within the format (changing page size) or to a different format, will garble/strip the watermark.

  3. Jim, I very much appreciate your raising issues, which is exactly what I hoped people would do on technical, commercial and legal matters, so the proposal can be as bulletproof as possible.

    On the matter of the EFF plan, keep in mind that (1) it’s just one of several possible models, (2) publishers could decide whether to participate, (3) there could be some kind of plain-text mark to show that content was covered, and (4) Amazon, although admittedly it’s subsidizing the practice, is doing a pretty good job of price standardization at $10 for bestsellers. If enough people in the publishing industry don’t want the EFF proposal because of the complexities of pricing, etc., there are other ways of avoiding use of DRM for library use—if nothing else, then through timed browsing via the Web.

    As for social DRM, keep in mind that it’s already working fine for The Pragmatic Programmers on technical books. Would it work with other kinds? Libraries and publishers need to find out. Like you, I heartily think we should not just trust “theory.” Publishers could experiment with social DRM in a library context even without digital watermarking. Remember, too, that OverDrive is already letting people burn CDs for their private use. That’s happening even without social DRM (I’m not sure what digital watermarking, if any, might be present).

    Meanwhile, who knows—perhaps solutions to the technical problems (beyond use of PDF!) can be found.
    Finally, there is the possibility of expirable Web access; I could be wrong, but you yourself may have brought that up.

    One way or another, library users need alternatives to traditional DRM, which comes with its own problems, as end users keep complaining. It’s such a burden on library and private users that the “Hate DRM” factor might be balancing out any sales preserved through the anti-piracy proofing. Remember, DRM’s problems and potential problems don’t and won’t just result from user inconvenience but also from the eBabel challenges that inclusion of a related “protection” standard for ePub would require. The result could be an expensive delay in the spread of ePub.

    Also keep in mind that, as noted, traditional DRM is bypassable, as shown by the existence of you-know-what software that makes a mockery of Microsoft Reader’s DRM. And yet we don’t see Microsoft Reader books pirated all over the place. Maybe users are more honest than the traditionalists would give them credit for. Instead of theorizing, both the library and commercial sides need some real-world tests.

    Keep speaking up, Jim. Although we disagree, I find your comments useful and constructive.

    Thanks,
    David

  4. A quick break from the nitpicking. I apologize that I didn’t state to begin with that I agree with your central thesis that libraries should experiment with models for e-content that are not the same as what they are using for p-content—the lack of scarcity begs for trying something different. The problem, as you have noted, is coming up with models that still respect the rights/needs of the copyright holders.

    I also agree that there should be a mix of models as well. Most of what I am pointing out is that there are substantial problems with picking any one model as THE model. It won’t work for all content, but most of the models you propose will work for some content/publishers. For instance, the publishers that are comfortable with social DRM (with no encryption backing) are also going to be comfortable with no DRM. At best, current social DRM only provides a gentle reminder not to share the book, so giving their content out like that requires publishers to be in a space where they can trust their users. For various reasons, not all publishers are there yet (or will be there for quite some time—advances in the music industry notwithstanding).

    BTW, the lack of progress of steganographic techniques for watermarking text is not for lack of trying (on my part and plenty of others’).

    I’ll start working on the response to Permanent Checkout now….

  5. Thanks, Jim—yes, I certainly share your interest in compensation for copyright-holders!

    I myself think social DRM would help in a library situation as a gentle reminder. At times publishers may feel that content will slip into the “free” category, not because users are dishonest but because they may confuse “permanent checkout” books with the public domain variety. I know of at least one nonlibrary case where this happened. So, yes, some publishers disliking traditional DRM might enjoy the social DRM option.

    Once again, thanks for pointing out the limits of steganographic techniques for text. I’m just curious whether perhaps something could happen through a mix of images and a container format. I even wonder about seemingly random sequences of text inserted in inconspicuous places in a book—perhaps varying from library patron to library patron. Is there a chance that such an approach would work? If nothing else, it would address the problem of reconciling the watermarking with the need for reflowability. OK, what’s the problem with that approach?

    Thanks again, Jim.

    David

  6. Actually, after reflection, I don’t see much of a problem with the permanent checkout model. The basic model of letting the user choose N books in a time period to “purchase” provides a maximum upper bound, and long-term testing should determine the actual rate of purchase, which should allow publishers to set reasonable rates (hopefully much less than N * the maximum price of a single book) for the content that they wish to offer that way.

    I can see where this would work extremely well for offering back-catalog books from publishers. The low rights price of back-catalog items shouldn’t drive the price of the service to an unreasonable level, and as Baen and Tor have proved, offering back-catalog books (that are the start of a series) for free can drive sales of current items.

    The problems I see with this are:
    – This only works well for content that is homogeneous in price range (i.e., offering fiction and textbooks under the same plan would be problematic).

    – Finding a critical mass, i.e., getting enough users and publishers to make the system worthwhile to both.
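
    The quota arithmetic behind this model can be sketched in a few lines. The numbers below are invented for illustration (they do not come from the discussion); the point is only that the quota N gives publishers a hard ceiling, while the observed keep rate from testing gives the expected cost.

```python
# Back-of-envelope bound for the permanent-checkout model.
# All figures are hypothetical, chosen only to illustrate the shape of
# the calculation Jim describes.

n_per_month = 4          # checkout quota N: books a patron may keep per month
max_book_price = 8.00    # highest rights price of any book in the plan
observed_rate = 1.5      # books patrons actually keep per month, per testing

# Worst case: every patron keeps N of the most expensive books.
upper_bound = n_per_month * max_book_price

# Likely case: the measured keep rate, priced at the ceiling.
expected_cost = observed_rate * max_book_price

print(f"per-patron ceiling: ${upper_bound:.2f}/month")
print(f"expected cost:      ${expected_cost:.2f}/month")
```

    Publishers would presumably price the service somewhere between the expected cost and the ceiling, depending on how much risk they want to carry.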

  7. Thanks to David Rothman and Jim Lester for their thoughts. It is great to read a substantive discussion about the future of digital libraries. Here is a quick comment about watermarks. Jim Lester is concerned that changing a book to “a different format will garble/strip the watermark.” However, there is a type of watermark that would remain even after a text format change.

    Suppose that an author who desires a watermark creates a collection of textual variants. For example, a text might say “red and fuzzy” or “fuzzy and red.” The sentence “Odysseus and his men become trapped in the cave of Cyclops, a one-eyed monster” might have the following variant: “Odysseus and his men become trapped in the cave of a one-eyed monster called Cyclops.” The author should create the variants so that they preserve the meaning of the text and are stylistically acceptable.

    When N binary variants are created it is possible to uniquely identify 2 to the power of N texts. Twenty textual variants could be used to generate and uniquely identify over one million texts (precisely 1,048,576). The watermark is “hidden” in the text because the existence of the variants is not obvious when examining a single text.

    Of course, all watermarks are vulnerable to attack. For this style of watermark, the adversary would attempt to acquire multiple copies of the text and look for all the differences using a string comparison, in order to identify the textual variants. (Technical comment: to try to block this attack, a larger pool of variants can be used, and an error-correcting code can be placed on top of the raw data derived from the variants.)
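
    To make the scheme concrete, here is a minimal sketch of embedding and extracting a copy number via binary variants. The variant pairs and template sentence are illustrative only (one pair borrows Garson’s Odysseus example); a real system would spread many more slots through a whole book.

```python
# Binary-variant text watermark: each slot has two equivalent phrasings,
# and the choice at each slot encodes one bit of the copy number.
# Variant pairs here are hypothetical examples, not from any real book.
VARIANTS = [
    ("red and fuzzy", "fuzzy and red"),
    ("the cave of Cyclops, a one-eyed monster",
     "the cave of a one-eyed monster called Cyclops"),
]

TEMPLATE = "The yarn was {0}. Odysseus and his men become trapped in {1}."

def embed(copy_id: int) -> str:
    """Choose variant 0 or 1 at each slot according to the bits of copy_id."""
    choices = [pair[(copy_id >> i) & 1] for i, pair in enumerate(VARIANTS)]
    return TEMPLATE.format(*choices)

def extract(text: str) -> int:
    """Recover the copy number by checking which variant each slot uses."""
    copy_id = 0
    for i, (_, variant_b) in enumerate(VARIANTS):
        if variant_b in text:
            copy_id |= 1 << i
    return copy_id

# Two slots distinguish 2**2 = 4 copies; N slots distinguish 2**N.
for n in range(4):
    print(n, "->", embed(n))
```

    The watermark survives any format conversion that preserves the words themselves, which is exactly the property Jim found missing in pixel- or layout-based marks.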

    I thought of this scheme many years ago. It is “obvious” to a “practitioner” computer scientist in my opinion and should be unpatentable. However, many obvious techniques are patented and so I would guess that it has been patented multiple times by now.

  8. Garson, I like the idea. However…

    You would want to encode information to identify the user and not the book, so the information to encode would be either 38 bits for a credit card number (assuming that’s available), 160 bits for a GUID (assuming a centralized user database), or ~110 bits for a combination of a retailer GUID (assuming this can be smaller, say 80 bits) and a user account identifier (also smaller, say 32 bits).
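
    Since each binary variant slot carries exactly one bit, those identifier widths translate directly into the number of variant pairs an author would have to write, plus whatever padding an error-correcting code adds. A small sketch (the overhead figure is an assumption for illustration, not a recommendation):

```python
import math

def slots_needed(id_bits: int, ecc_overhead: float = 0.0) -> int:
    """Variant pairs required to embed an id_bits-wide identifier,
    inflated by a fractional error-correction overhead."""
    return math.ceil(id_bits * (1 + ecc_overhead))

# Identifier widths from the discussion above; 50% ECC padding is a
# hypothetical figure to show how quickly the slot count grows.
print(slots_needed(38))        # credit-card-derived ID
print(slots_needed(160))       # centralized GUID
print(slots_needed(110, 0.5))  # retailer GUID + account ID, with ECC
```

    Even the smallest case demands dozens of deliberate phrasing choices per book, which underlines the authoring burden.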

    Also, as you noted, this would only work for content where authors/publishers provided the variants, so there’s going to be the standard chicken-and-egg problem: building systems for content that doesn’t exist, or creating content for systems that don’t exist… requiring a major-player champion to help start creating the ecosystem.

  9. Glad that you found the watermark idea interesting, Jim Lester. I do agree that there are substantial obstacles to implementation. Some authors will reject the idea because “every word in their manuscript is already excruciatingly perfect.” There are also general objections that can be raised against watermarks and social DRM:

    1) Electronic items that are given watermarks or social DRM signatures can be electronically stolen/duplicated. This compromises the value of the watermarks and signatures because the extracted data becomes misleading. The data points to the wrong “culprit”. Consider the computer security situation today. A substantial fraction of PCs are compromised with Trojans, viruses and other malware.

    It would be possible for “pirates” to steal e-book, music, and video collections. These purloined items could then be redistributed using P2P applications and other sharing systems. If a copyright holder finds one of his or her works being distributed without permission then he or she can consult the watermark or signature. Unfortunately it might only reveal the identity of the person from whom the object was stolen/duplicated. This does not help the copyright holder very much.

    2) Watermarks and social DRM signatures will be opposed by some privacy advocates. If the mark is embedded in the electronic object and if it identifies the owner or provides credit card data then there is a danger if the object is stolen/duplicated. The credit card data might leak. Data about personal reading habits might leak. If the signature or watermark is resistant to forgery then the object could be uploaded to the net to provide evidence about the scandalous or outré reading preferences of an individual.

    If the watermark or signature is easy to forge then “pirate” sharing networks will probably fill up with duplicated and altered works with designations such as “For the Exclusive Use of Bozo T. Clown”, “Generated for the Private Library of John Q. Public”, “For John Galt – Whoever He Is”, “Kilroy Was Here – Electronically” or a blank identifier.

    3) If watermarks or signatures are used to trace purchasers or library patrons via a database of transactions or accounts then some privacy advocates will object. Many libraries today erase the data linking a patron to the check out status of an item as soon as the item is returned.

    If libraries kept a persistent database of transactions to help trace patrons, then some privacy advocates would object. Booksellers like Amazon no doubt keep persistent databases of transactions, but I think these databases, too, will eventually be attacked on privacy grounds by some.
