
This is from the Amazon Daily Post.  I found it so interesting that I’m posting the whole thing here:

We’ve seen a lot of interest from customers about our new real page numbers feature for latest generation Kindles and what makes them “real”, so we wanted to tell you a little more about this feature and how we did it.

An e-book, like a print book, is at its core a stream of text. In a print book, this stream is broken up by the size of the pages on which it is printed. Number these pages and you have a way of referencing any point in the book.  The text on page 53, for example, is always the same for every book of the same print edition.   But in an e-book, what looks like a “page” is a display, and the amount of text displayed depends on the font size that you as a customer choose, as well as other options you set yourself such as portrait or landscape mode, or which Kindle or free Kindle app you read with. 

We wanted to be able to display real page numbers that have value and are useful for those who need to cite a specific passage in a book for class, follow along with their friend in a book club, or simply point a friend to a favorite part of the book.  Adding “real” page numbers means we had to find a way to match specific text in a Kindle book to the corresponding text in a print book and identify the correct page number to display. 

With our massive selection and knowledge of print books, we were excited to be in a position to help solve this problem.  We had to invent an entirely new way to match the streams of text in a print book to the streams of text in a Kindle book, and assign page numbers in Kindle books. There are hundreds of thousands of Kindle books (and growing every day), so to handle a job of this size, we turned to our Amazon Web Services computing fabric. We created algorithms to match the text of print books to Kindle books and organized all of this in the cloud, using our own AWS platform.  The results of this work are stored in Amazon’s Simple Storage Service, where we track the complete history of every page matching file we’ve produced.   We even found a way to deliver page numbers to books that customers had already purchased – without altering those books in any way, so customers’ highlights, notes, and reading location are preserved exactly as they were.

Some other e-bookstores have added virtual “page numbers” to e-books, but we’ve found that these approaches can be confusing and often inconsistent –  they don’t map to the page numbers in physical books, and in some cases they don’t account for title pages, blank pages, and other nuances that we see in print books. We’ve already received a lot of great feedback from customers who like our approach.  Real page numbers are already available in tens of thousands of our most popular Kindle books, including the top 100 bestselling books in the Kindle Store that have matching print editions, and we’re adding page numbers in more Kindle books every day.  We want you to lose yourself in the reading, so page numbers are only displayed when you push the menu button.

We’re excited to hear what you think of our real page numbers.  Please let us know.
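
The post does not say how the matching actually works, so the following is only a rough sketch of the general idea, not Amazon's algorithm. It assumes the print edition is available as a list of page texts and the e-book as one string, and it swaps in Python's standard `difflib.SequenceMatcher` as the alignment step. The function names (`page_start_offsets`, `page_for_offset`) are made up for illustration, and pages are simply numbered from 1 in list order.

```python
import bisect
import re
from difflib import SequenceMatcher

WORD = re.compile(r"\w+")


def page_start_offsets(print_pages, ebook_text):
    """Estimate, for each print page, the e-book character offset where it begins."""
    # Flatten the print pages into one word stream, remembering where each page starts.
    print_words = []
    page_first_word = []
    for page in print_pages:
        page_first_word.append(len(print_words))
        print_words.extend(w.lower() for w in WORD.findall(page))

    # Word stream of the e-book, plus the character offset of every word.
    ebook_matches = list(WORD.finditer(ebook_text))
    ebook_words = [m.group().lower() for m in ebook_matches]
    ebook_offsets = [m.start() for m in ebook_matches]

    # Align the two word streams (quadratic in the worst case; fine for a demo only).
    matcher = SequenceMatcher(None, print_words, ebook_words, autojunk=False)
    blocks = matcher.get_matching_blocks()

    starts = []
    for first in page_first_word:
        # Fallback when nothing at or after this page's first word was matched.
        offset = len(ebook_text)
        for a, b, size in blocks:
            if size and a + size > first:
                # Either the page's first word lies inside this block, or this is
                # the first matched text after an unmatched page opening.
                offset = ebook_offsets[b + max(first - a, 0)]
                break
        starts.append(offset)
    return starts


def page_for_offset(starts, offset):
    """Return the 1-based print page that contains a given e-book offset."""
    return max(1, bisect.bisect_right(starts, offset))
```

Once the offsets are computed, looking up the page for any reading position is just a binary search, which is why the result can be stored as a separate file alongside the book rather than inside it. A real system would also have to handle front matter, blank pages, roman-numeral pages, and text differences between editions, none of which this sketch attempts.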

5 COMMENTS

  1. This account is interesting, and I understand the appeal of a stable page number that doesn’t vary based on font. However, I had to laugh at the criticism of “other bookstores” for creating virtual page numbers that don’t match “the” print version. Many books, especially books assigned for school, are published in numerous versions, the pages of which don’t match each other.

    If Amazon is matching Kindle pages to print pages, will it tell Kindle readers which print edition is being matched? If not, then I’m not sure what long-term value the fancy page-matching algorithm has. In any event, it’s much easier to find a specific passage in an ebook than in a pbook: just use the search function.

  2. Beth: The ISBN of the print book used is in the ebook’s metadata.

    I don’t mind the virtual pages that Adobe uses by default for its ePubs, but Amazon’s approach is undoubtedly better.

    Not all Kindle ebooks have real page numbers yet, but Calibre can add virtual page numbers, similar to those used by Adobe, to any MOBI that it transfers to a Kindle.

    It is possible to include actual page numbers in an Adobe ePub, but no publisher does this so far as I know (most have trouble producing a table of contents). This is where Amazon’s existing scans of physical books become so valuable. No one else, except Google, has the capability to do what Amazon is doing.

  3. Two words: Paragraph numbering. Easier to do, provides more information, and it doesn’t change from one edition to another. Scholars of classical works and legal publishers have been doing it for decades. Give it a Show/Hide option and all your problems are solved.

  4. Speaking as a grad student doing as much reading as possible using ebooks, it drives me nuts that actual page numbers are not present in Amazon’s books (or at least my books so far). I think the statement about wanting people to get lost in the book is a thinly veiled attempt to cover a real issue for many, many people – certainly anyone in academia. When assigned readings are given in page number blocks, what am I supposed to do without a real book present, short of hoping there’s a preview available at Amazon? And what if none are available?
