Editor’s Note: I’m pleased to welcome Eric Hellman to TeleRead. Eric is a technologist, entrepreneur, scientist and writer. After 10 years doing research at Bell Labs, he founded Openly Informatics, a linking technology business that was acquired by OCLC in 2006. Over the last year, he has been blogging about ebooks, libraries, and technology at Go To Hellman. PB

Online piracy of ebooks has been a persistent worry for book publishers who look at the successes and failures of other media that have moved to digital forms. A surprising number and variety of ebooks are easily available on file sharing websites and peer-to-peer networks that use BitTorrent and similar protocols. The possibility that this availability will cut into sales of licensed ebooks and even print books is a scary one for an industry that has had many decades of relative stability. At Digital Book World in January, Brian Napack, President of Macmillan, “delivered a passionate call to arms for publishers to fight piracy in the ebook space or risk permanent damage to the underpinnings of publishing as a commercial enterprise”.

Adding to the ebook piracy hysteria have been studies of the prevalence of ebook piracy produced by Attributor, a company that sells anti-piracy services. I’ve previously written critically about Attributor’s report that purported to find evidence that “Online Book Piracy Costs U.S. Publishers Nearly $3 Billion”.

In their most recent report, Attributor has taken a rather clever approach to the measurement of ebook piracy. Instead of trying to track downloads, Attributor has begun to use Google Trends to gain an understanding of consumer demand for ebooks. Although there are many potential difficulties in using Google for this purpose, Google Trends is a powerful and useful tool for gaining insight into the things that web users around the world are looking for.

Attributor presents their data along with an alarming narrative of growing and pervasive ebook piracy, and points to the iPad as a contributing factor to an increase in demand for pirated ebooks. After playing around with Google Trends for a while, I’ve come to the conclusion that Attributor has narrowly selected data to fit their narrative; taken as a whole, Google Trends data broadly supports a rather different narrative: that the growth of consumer interest in pirated ebooks slowed significantly in 2009 and stopped in early 2010.

To understand how Google Trends informs the debate about the prevalence of ebook piracy, it helps to understand what activity is being measured. Google Trends measures how frequently search terms are used. A consumer looking for a free copy of a particular work will typically search on the book title, adding terms like “free” or “download” or “pdf” to locate downloadable files. A more sophisticated strategy, one that is quickly learned, is to add the name of a preferred download site. If the user prefers peer-to-peer networks, the word “torrent” can be added to locate “seed” files for the item. The file sharing sites most commonly used for this purpose are currently RapidShare, Megaupload, 4shared, and Hotfile. To use Google Trends to measure the demand for a pirated ebook, you give it keywords that reproduce these searches. For example, demand for Stephenie Meyer’s book Breaking Dawn can be assessed with a query such as this one.
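As a rough illustration of the search patterns described above, here is a minimal Python sketch that combines a book title with the download terms and site names mentioned in this post. The function name and the two term lists are my own illustrative choices, not the actual query behind the link, and the sketch only builds the search phrases; it does not talk to Google Trends.

```python
# Hypothetical helper: builds the kinds of search phrases a reader might type.
# The term lists come from the article text; everything else is an assumption.

DOWNLOAD_TERMS = ["free", "download", "pdf"]
SHARING_TERMS = ["torrent", "rapidshare", "megaupload", "4shared", "hotfile"]


def build_search_phrases(title, terms):
    """Pair a normalized book title with each extra search term."""
    base = " ".join(title.lower().split())  # lowercase, collapse whitespace
    return [f"{base} {term}" for term in terms]


phrases = build_search_phrases("Breaking Dawn", SHARING_TERMS)
for phrase in phrases:
    print(phrase)
```

Replacing the title with the generic word “ebook” gives the kind of title-independent phrases that an overall measurement would need.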

To assess the overall state of ebook piracy, I used data from this query. Note that since the search is for ebooks generically, there’s no telling for sure that the ebooks being searched for are really pirated; for the purposes of this study, I assumed that none of the ebooks being searched for are legally available on these sites. Calling them “pirated books” may be inaccurate, but I’ll use that term anyway.

Some features of the data are immediately apparent. First of all, searches for pirated ebooks have increased a great deal over the past 5 years. It’s worth noting, however, that the most intense interest measured by Google occurs in India, the Philippines, Indonesia, Vietnam, Malaysia, Singapore, and eastern Europe. Less than half the search volume comes from the US. It’s also easy to see seasonal peaks that obscure the shorter-term trends. The peak periods for pirate ebook seeking are the December holidays and the beginning of September, presumably because of the start of school.

[Figure: piratedemand.png]

To eliminate seasonal variations, I computed the year-over-year growth of pirate ebook search activity, comparing each period with the same period one year earlier. The resulting plot is quite smooth. After a few years of 100% per year growth, 2008 showed a clear slowing of growth. This slowing of growth continued up to the beginning of 2010, and then flat-lined. Since February of 2010, the growth of interest in pirated ebooks has stopped completely.
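For readers who want to try the same kind of calculation, here is a minimal Python sketch of year-over-year growth. It assumes a list of monthly search-volume index values; the post does not say whether the underlying data was weekly or monthly, and the numbers below are invented purely to show the arithmetic. Because each month is compared with the same month a year earlier, a repeating seasonal pattern cancels out.

```python
def year_over_year_growth(values, period=12):
    """Percent growth of each value over the value one period earlier.

    Returns one entry per value from index `period` onward. An entry is
    None when the prior-year value is zero (growth is undefined there).
    """
    growth = []
    for i in range(period, len(values)):
        prior = values[i - period]
        if prior == 0:
            growth.append(None)
        else:
            growth.append((values[i] - prior) / prior * 100.0)
    return growth


# Invented data: a seasonal pattern with peaks in September and December,
# then the same pattern at twice the volume in the following year.
year_one = [10, 9, 8, 8, 7, 7, 6, 7, 12, 9, 9, 14]
year_two = [2 * v for v in year_one]

growth = year_over_year_growth(year_one + year_two)
print(growth)  # every month shows 100.0 despite the seasonal swings
```

The raw series swings between 6 and 28, but the growth series is flat, which is why a plot of this quantity looks so much smoother than the raw search volume.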

It should be noted that this stabilization has occurred during a period of strong sales of ebook reader devices, including Kindle, Nook, and the iPad. Indeed, the unveiling of the iPad was coincident with the stabilization of demand for pirate ebooks.

It’s hard to know for sure what’s happening, but one interpretation of these patterns is that a broad increase in consumer-friendly availability of properly licensed ebooks over the last 2 years has squelched the growth of demand for ebooks from illicit sources. In that light, the remaining demand can be interpreted as a sign of poor availability for appropriately priced ebooks on college campuses and in developing countries.

While this data has to be seen as an encouraging sign for the book publishing industry, it’s too soon to know if it will last. It’s entirely possible that too-high prices, cumbersome DRM, or new technologies could reinvigorate the demand for illicitly shared ebook files. For the moment at least, the book publishing industry can exhale.


  1. Nice analysis. I’d argue that because ebooks are becoming more mainstream, and there are more quality and big-name places to acquire correctly licensed ebooks, the demand for pirated ebooks has decreased (or stopped). As Hellman’s article mentions, the decrease in demand started around the same time most of the big-name readers became available and the market heated up. Before that, there wasn’t much legitimate selection. Because ebooks were largely in the realm of the tech-savvy enthusiast at that time, pirate editions could be found.

  2. I believe that the earlier surge was because there was nowhere much to buy legit copies, and that the availability now through Amazon/B&N etc. has diminished the piracy. Makes sense to me.
    We also have to remember that readers are not the same kinds of people that music listeners were in the days of the music piracy boom: kids driven as much by the challenge, rebelliousness, ego etc. I still believe, however, that as the eReader boom reaches the mainstream there will be another surge in piracy driven by regional restrictions, DRM and pricing.

  3. Well, I definitely agree with the fact that since Attributor sells anti-piracy services, they have compromised their own study ON ebook piracy. That’s a no-brainer. If I am to be convinced by THIS study, however, I would like a larger version of that graph labeled “piratedemand.png”; I’m not sure why it’s not a thumbnail to a larger view. Attributor also mentioned that the use of popular sites like Rapidshare and Megaupload is declining in the face of smaller, independent cyberlockers. Commenting on that in relation to your study would help. Lastly, the Stephenie Meyer query assessment example you gave DOES show a significant rise in demand in 2010…regardless of it being – for the most part – outside the US. I would love to hear these issues addressed, as it is all pretty fascinating to me in general. 🙂 Thanks!

  4. Joe- I have a larger view of the graph on the Go To Hellman blog; perhaps Paul can link the thumbnail.

    If there are smaller file locker sites that are rapidly growing, it would not be hard to add them once they are identified and attract sufficient search volume. I’m not sure how consumers would find these sites. My data includes searches for torrents; Attributor omits these, even though the volume is larger than any of the file locker sites.

    When I look at the Breaking Dawn data, all I see is spikes related to movie and book releases followed by decay of interest. This is just a function of the general interest in Breaking Dawn and says nothing about general trends for ebooks.

  5. All of these studies seem to leave out IRC, which is a huge channel for “pirated” books, although I’m not sure how you’d measure it. Many folks start out on the boards that have links to places like Rapidshare (or newer ones like iFile, 2Shared, Zippyshare, etc. which many file sharers are moving to) and then become proficient in torrents and then IRC as ways to find books. Also, a lot of the files on sites like Rapid & Mediafire etc. are purposely being named as nonsense words to hide them from being found unless you’ve got a link, and the boards that host the links are getting better at hiding their sites from search engines.

    The trends link on Breaking Dawn should be an indicator to US/UK/Canadian publishers that Geo Restrictions are a big factor in piracy and that they need to find a way to eliminate them. There are lots of folks in countries where English isn’t the primary language that want access to books in English.
