Yesterday, Nate broke quite a story over on The Digital Reader about Adobe Digital Editions 4.0 sending information in the clear about the e-books you read. It got picked up by The Passive Voice, Ars Technica, GigaOm, TechDirt, Slashdot, BoingBoing, the list goes on and on. Congrats on the scoop, Nate! (Frankly, I’m amazed his blog is still up, given all the traffic this has to be sending his way.) He posted another story today indicating that the bug doesn’t seem to affect prior versions.

Effectively, ADE 4.0 gathers up a bunch of information on the books you open with it and sends it right off to Adobe. I installed ADE 4.0 and the program Nate used to test it, and with a little fiddling I was able to see for myself. The output is ugly to the naked eye, as in this line concerning my Calibre reformat of Project Gutenberg’s Barchester Towers:

"id":"com.adobe.rmsdk.nocert.dewin","h":"bc49804211df305958d6baa99622d2ead02b31e0c94a6ae22e47038db46d8d1a","d":[{"msg_DocumentCreated":{"Document Instance Created":{"atTime":1412715576535,"title":"Barchester Towers"}}},{"msg_LicenseDataCaptured":{"License Data Captured":{"atTime":1412715576537,"userID":"","operatorURL":"","licenseURL":"","distributorID":"","resourceID":"","fulfillmentID":""}}},{"msg_DocumentScanned":{"Document Scanned":{"atTime":1412715576539,"Title":"Barchester Towers","Creator":"Anthony Trollope","Subject":"Fiction","Description":"","Publisher":"B. Tauchnitz","Contributor":"calibre (0.7.48) []","Date":"1859-03-15T07:27:41

But you can see the information is there, and it’s sent in the clear. ADE reportedly also sends information on the contents of your library, including the e-books you aren’t reading right now. I didn’t find anything of that nature in my brief glance and neither did GigaOm, but Ars Technica apparently did. (I’ll go over how to see for yourself at the end of this post.) The collection seems to have something to do with Adobe’s process for verifying DRMed books, such as purchased or library books, though that doesn’t explain why it gathers information about even the DRM-free ones.
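If you want the captured line quoted earlier in a friendlier form, a few lines of Python will do it. This is only a rough sketch: the sample below is a shortened version of my capture that I closed off by hand (the outer braces and closing brackets are my assumption, since the excerpt is cut off), the function name `summarize_events` is made up for illustration, and I’m guessing that `atTime` is milliseconds since the Unix epoch.

```python
import json
from datetime import datetime, timedelta, timezone

# A shortened, hand-completed version of the captured payload. The outer braces
# and closing brackets are assumptions: the excerpt in the post is cut off.
SAMPLE = (
    '{"id":"com.adobe.rmsdk.nocert.dewin",'
    '"d":['
    '{"msg_DocumentCreated":{"Document Instance Created":'
    '{"atTime":1412715576535,"title":"Barchester Towers"}}},'
    '{"msg_DocumentScanned":{"Document Scanned":'
    '{"atTime":1412715576539,"Title":"Barchester Towers",'
    '"Creator":"Anthony Trollope","Publisher":"B. Tauchnitz"}}}'
    ']}'
)

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def summarize_events(payload_text):
    """Return one (event name, UTC time, other fields) tuple per logged event."""
    payload = json.loads(payload_text)
    events = []
    for message in payload.get("d", []):
        # Each entry wraps a single event: {"msg_X": {"Readable Name": {...}}}
        for wrapper in message.values():
            for name, fields in wrapper.items():
                fields = dict(fields)
                # Assumption: atTime is milliseconds since the Unix epoch.
                millis = fields.pop("atTime", None)
                when = None
                if millis is not None:
                    when = EPOCH + timedelta(milliseconds=millis)
                events.append((name, when, fields))
    return events


if __name__ == "__main__":
    for name, when, fields in summarize_events(SAMPLE):
        print(name, when.isoformat() if when else "?", fields)
```

Read that way, each event is a name, a timestamp, and a handful of metadata fields lifted straight from the book.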

Ars also has a brief statement from an Adobe spokesperson:

Adobe Digital Editions allows users to view and manage eBooks and other digital publications across their preferred reading devices—whether they purchase or borrow them. All information collected from the user is collected solely for purposes such as license validation and to facilitate the implementation of different licensing models by publishers. Additionally, this information is solely collected for the eBook currently being read by the user and not for any other eBook in the user’s library or read/available in any other reader. User privacy is very important to Adobe, and all data collection in Adobe Digital Editions is in line with the end user license agreement and the Adobe Privacy Policy.

The statement effectively reads like the kind of boilerplate provided to low-level peons who don’t have any actual knowledge or authority but have to say something so people know they’re aware of the issue. (Remember how Amazon’s first response when someone self-published a pedophilia manual was to issue a statement about how “Amazon believes it is censorship not to sell certain books simply because we or others believe their message is objectionable” before finally pulling the book a few hours later when someone with authority actually looked at it? Same kind of thing.) The idea that sending information about what you’re reading in the clear is in line with any information-age privacy policy is ludicrous.

Needless to say, this is causing quite the uproar, and with good reason. Since the information is sent in the clear to Adobe, not only does Adobe get it but anyone sniffing packets from your WiFi could as well. Quite aside from the expected issues of people not wanting anyone else to know what they’re reading, it could lead to breaches in confidentiality agreements and all sorts of problems like that. It could provide a handy record of what library e-books you check out. And people who crack the DRM on their e-books to back them up might have to worry about potentially being exposed as law-breakers.

Even if the information isn’t intercepted in the clear, the fact that Adobe has it means they could potentially be subpoenaed for it down the road. Even if they discard the information as soon as it’s received, they could be required to start collecting it with the digital equivalent of a “pen trap.” The whole parade of horribles that librarians have been fighting for their book checkout records could come into play here as well.

As Nate points out, this may even be a violation of privacy laws. One thing’s for sure: this is clearly not another case of a security software false positive. I’m glad I’ve been using ADE 2.0 up to now for my ADEPT DRM e-reading needs.

How to See for Yourself

The process of independently verifying the leak is fairly simple, though a bit arcane. Anyone can do it and verify for themselves what’s being sent (until and unless Adobe patches over it, anyway), which is part of what makes this revelation so powerful. Here’s how.

  1. Install Adobe Digital Editions 4.0 if you don’t already have it. Don’t worry if you’re still using an older version; it installs into a new directory as a completely new program rather than overwriting the old version. You can uninstall it afterward without affecting your older ADE installation.
  2. Install Wireshark. It’s a free download.
  3. Launch Wireshark and start recording. On the launch screen there’s a green shark fin icon and the word “Start” next to it; there’s also a green shark fin icon in the graphical toolbar at the top.
  4. Launch Adobe Digital Editions 4.0 and open an e-book or several. Use one from Project Gutenberg or the Baen Free Library if you like. Mess around for a while to give it the chance to capture lots of data.
  5. Hit the “Stop” button on Wireshark. It’s the reddish-brown square in the toolbar. Once the capture is stopped, you’ll be able to sort the captured packets and view them.
  6. Find the packets with destination The simplest way to do this is to click on the “Destination” header to sort ascending or descending, or right-click on the header and choose your sort. Then scroll up or down the list until you get there.
  7. Find the HTTP POST packet. The packets will be arranged in a series of several TCP packets that say “[TCP segment of a reassembled PDU]” in the Info field, followed by a single HTTP packet that begins with “POST /datacollector”.
  8. Right-click on that HTTP packet and choose “Follow TCP stream,” about 2/3 of the way down the context menu. That will pop up another window (the one pictured at top right in this article) that will show you exactly what data was sent.
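
If you would rather not eyeball the stream window, the last two steps can be roughed out in code. The sketch below assumes you already have the raw bytes of the followed stream in hand (for instance, saved to a file); it looks for any request starting with “POST /datacollector” and uses the Content-Length header to pull out the body. The function name `find_posts`, the sample request, and the host `example.invalid` are all made up for illustration and are not what ADE actually sends.

```python
def find_posts(stream, path_prefix=b"/datacollector"):
    """Return (request line, body) pairs for each matching POST in the stream."""
    marker = b"POST " + path_prefix
    results = []
    pos = 0
    while True:
        start = stream.find(marker, pos)
        if start == -1:
            break
        # Headers end at the first blank line after the request line.
        head_end = stream.find(b"\r\n\r\n", start)
        if head_end == -1:
            break
        lines = stream[start:head_end].decode("latin-1").split("\r\n")
        length = 0
        for line in lines[1:]:
            name, _, value = line.partition(":")
            if name.strip().lower() == "content-length":
                length = int(value.strip())
        body_start = head_end + 4
        results.append((lines[0], stream[body_start:body_start + length]))
        pos = body_start + length
    return results


# Hypothetical stand-in for a captured stream; not real ADE traffic.
SAMPLE_BODY = b'{"title":"Barchester Towers"}'
SAMPLE_STREAM = (
    b"POST /datacollector HTTP/1.1\r\n"
    b"Host: example.invalid\r\n"
    b"Content-Length: " + str(len(SAMPLE_BODY)).encode("ascii") + b"\r\n"
    b"\r\n" + SAMPLE_BODY
)

if __name__ == "__main__":
    for request_line, body in find_posts(SAMPLE_STREAM):
        print(request_line)
        print(body.decode("utf-8", errors="replace"))
```

Anything that prints out readable here is exactly what someone else on the same network could read too.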

Pretty scary, huh?

