The ABCs of e-book format conversion: Easy Calibre tips for the Kindle, Sony and Nook

Welcome to TeleRead’s newest contributor, John Schember, a member of the team behind the wonderful Calibre program for managing e-book collections. His bio appears at the end. Calibre is much improved, and I myself am in the middle of Calibre-izing my own collection. Screenshot to the right is from me; don’t blame John for any of the reading choices. – D.R.

E-book readers are becoming more and more common. Two of the most popular today are the Amazon Kindle and the Sony Reader line.

Unfortunately, the two brands don’t read the same kinds of e-books. This mess is like the one in the music world, where you might find such formats as WMA, MP3, and AAC. In e-books, the same confusion exists: the Tower of eBabel, as some call it.

If you are only buying from the store designed for your reader (for example, Amazon's Kindle Store or Sony’s Reader Store), you don't need to worry about any of this.

But there are very good reasons why you should know about the major formats, what your reader supports and how to convert between formats.

Many Web sites offer legal and often free books, everything from public domain works to titles by well-known and lesser-known authors. Also, you can shop for the best prices at a number of small independent e-book stores.

Often you can download these e-books in a variety of formats, but you won’t always find them in the format your e-book reader supports. Here is where conversion comes in. There is a very good chance that you will be able to take an e-book and convert it to a format your reader supports, at least if the book doesn’t use Digital Rights Management (DRM), an anti-copying technology.

In the rest of this article I will focus on (1) the Kindle, which supports the Mobipocket format, aka MOBI, and (2) the Sony Reader line (the PRS machines like the PRS-600), which supports the EPUB format. Conversely, the Kindle does not support the EPUB format, and the Sony line does not support the Mobipocket format. The Nook, the new reader from Barnes and Noble, can read EPUB, the same format as the Sony, although there are now some catches with DRMed books.

Why are there different e-book formats?

Just why do so many different formats exist? Advances in technology? In fact, that’s a major reason. Just like the transition from VHS to DVD and now to Blu-Ray, older formats, which were created to solve the problems faced at the time, are replaced with newer formats that better meet the needs of today. A great example of this is the old books people read back in the ’90s on their PDAs. Those devices were very limited in what they could display. E-readers today are much more advanced. They can display large images and handle advanced formatting. These newer devices needed formats that could provide these features.

Another major reason is exclusivity. Many vendors like to have and control their own formats so they are not dependent on outside companies. They also have the benefit of being able to license their format for use by others. This also allows them to lock users into their platform. E-books, being relatively new, are undergoing the same growing pains that Betamax and VHS or HD-DVD and Blu-Ray went through. The EPUB format, from the International Digital Publishing Forum, is an industry standard intended to reduce these problems.

Tools for conversion

Many easy-to-use tools exist for converting e-books. For Kindle users, the Mobipocket Desktop is a good choice. Amazon also provides a conversion service that allows you to email them e-books, which they will convert and send directly to your Kindle.

Not to leave Sony users out in the cold, there is a more general tool that can convert between a large number of formats. Calibre supports the Kindle, the Sony PRS line, the Nook and a large number of other devices. It is a full e-book management application that can organize your e-book library, handle automated news downloads from a number of sources, and convert between a large number of e-book formats. It is a one-stop, all-in-one tool.

In addition to the above tools there are a number of harder to use, specialized programs. I’m not going to touch on those but they do exist.

Conversion basics

The first thing you need to do is find out what formats your e-book reader supports. The Kindle supports AZW, MOBI, PRC, AZW1, TPZ and TXT. The PRS line from Sony supports EPUB, LRF, LRX, RTF, PDF and TXT. Don’t let this scare or confuse you; all of the major e-book readers support multiple formats. Even with this jumble of letters, you only need to worry about the preferred format for the reader, the one that gives the best formatting. As I mentioned earlier, for the Kindle you really only need to worry about Mobipocket (MOBI), and for the Sony Reader line (PRS) you only need to worry about EPUB (same for the Nook). However, it is a good idea to be aware of all of the supported formats: it wouldn’t make sense to convert an AZW to MOBI for reading on your Kindle, because the Kindle can already read AZW books. Conversion is only necessary to fill in the gaps. For example, if you want to read an EPUB file on your Kindle, you convert it to MOBI.
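
The "fill in the gaps" rule can be sketched in a few lines of Python. This is only an illustration, not part of Calibre: the format lists are the ones given above, while the device keys (`kindle`, `sony_prs`) and the function name `conversion_target` are made up for this example.

```python
# Formats each reader can open directly, as listed in the article.
SUPPORTED_FORMATS = {
    "kindle": ["AZW", "MOBI", "PRC", "AZW1", "TPZ", "TXT"],
    "sony_prs": ["EPUB", "LRF", "LRX", "RTF", "PDF", "TXT"],
}

# Preferred (best-formatting) conversion target for each reader.
PREFERRED_FORMAT = {"kindle": "MOBI", "sony_prs": "EPUB"}


def conversion_target(device, book_format):
    """Return None if the reader already opens the format,
    otherwise the format the book should be converted to."""
    fmt = book_format.upper().lstrip(".")
    if fmt in SUPPORTED_FORMATS[device]:
        return None  # no gap to fill, so no conversion needed
    return PREFERRED_FORMAT[device]
```

For instance, `conversion_target("kindle", "azw")` gives `None` (the Kindle already reads AZW), while `conversion_target("kindle", "epub")` gives `"MOBI"`.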

Downloading Calibre

You can download Calibre from the Calibre Web site with Firefox, Internet Explorer or another browser. Versions exist for Windows, OS X and Linux. Calibre has an easy-to-use Welcome Wizard to help newcomers get up to speed. Just answer the Wizard’s questions.

Using Calibre to convert

Using Calibre to convert is very easy. Plug in your e-book reader. Open Calibre and click the “Add books” button on the top left. Select your book. Click open. Select your book in the library list. By now Calibre should have detected your e-book reader. Click “Send to device” in the middle of the top toolbar. Calibre is smart enough to know if the book is in a format supported by your reader. If it’s not, it will ask you if you want to auto convert it. Say yes, and it will take care of the conversion and put the book on your reader. That’s all there is to it. Doing it is easier than it sounds because all you need to do is select the book you want on your device and click “Send to device.” Calibre worries about the formats and converting for you.

More robust conversion

Auto conversion is the easiest way to go and in most cases will be all you need to do. However, there are a few options that allow control over the conversion process. Before starting the conversion process, it is a good idea to verify and correct the metadata for the book. The metadata is information about the book such as title, author, series and whatnot. After adding a book, click the “Edit meta information” button. Fill in the title and author or the ISBN (it is better to use the ISBN for the paperback or hardback version than the e-book’s ISBN). Then click “Fetch metadata from server”. This will pull in all kinds of information about the book. If there is no cover image next to the metadata entry, or if it is a generic image, it is a good idea to click “Download cover”.

Now that the metadata is all correct, click the “Convert E-books” button. This screen looks very complicated, but realize that the majority of options here are best left alone. Most of the options only need to be changed on a per-book basis, in special cases. There is one option that is very important and may need to be changed. At the top right there is a drop-down for “Output format.” This controls what format the conversion will result in. Kindle owners will want to select MOBI, and Sony and Nook owners will want to use EPUB.

In the conversion dialog there are a few things to check before clicking “OK”, which begins the conversion. The first thing you need to do is double check the metadata and make changes if necessary. Click on “Look & Feel” on the left-hand side. The “Remove spacing between paragraphs” option is a popular one. It will cause paragraphs to be formatted with an indent at the beginning instead of separating them with a blank line. Basically it makes the result look more like a printed book than the default, which looks more like a web page.

Next click “Page Setup”, which is the first item under “Look & Feel”. If you didn’t select your device in the welcome wizard when you installed Calibre, you should select it here. The input and output profiles provide specialized tuning for the specified device. Be aware that not all formats are affected by the profile.

That’s it for the basic conversion options. Every option in the conversion dialog has a description of what it does when you put the mouse over it. Look through the options and play with them to produce output that suits your taste.

Click “OK”; the dialog will close and the conversion will begin. Look at the bottom right of the screen at the “Jobs” indicator. When it spins, that means Calibre is working. Clicking it will show what is being worked on.

When the conversion is finished the jobs count will drop to zero and the indicator will stop spinning (unless there are more jobs pending). Once the conversion is finished, click the downward-facing arrow to the right of the “Send to device” button. Select one of the “Send specific format” options (main memory is usually the best choice). A dialog will appear asking you which format you want to send. Select the format you just converted the book to.
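
For readers who prefer scripting, the same conversion can be sketched with `ebook-convert`, the command-line tool that ships with Calibre. This is a hedged sketch, not the way the dialog works internally: it assumes `ebook-convert` is on your PATH, and the option names (`--output-profile`, `--remove-paragraph-spacing`) and the profile name `kindle` should be checked against `ebook-convert --help` for your Calibre version. The helper names `build_convert_command` and `convert` are made up for this example.

```python
import shutil
import subprocess


def build_convert_command(source, target, output_profile=None,
                          remove_paragraph_spacing=False):
    """Build an argument list for Calibre's ebook-convert tool.

    ebook-convert infers the output format from the extension of
    `target` (for example .mobi or .epub).
    """
    command = ["ebook-convert", source, target]
    if output_profile:
        # Counterpart of the output profile chosen under "Page Setup".
        command += ["--output-profile", output_profile]
    if remove_paragraph_spacing:
        # Counterpart of "Remove spacing between paragraphs".
        command.append("--remove-paragraph-spacing")
    return command


def convert(source, target, **options):
    """Run the conversion; raises if ebook-convert is missing or fails."""
    if shutil.which("ebook-convert") is None:
        raise FileNotFoundError("ebook-convert not found on PATH")
    subprocess.run(build_convert_command(source, target, **options),
                   check=True)
```

A Kindle owner might call `convert("book.epub", "book.mobi", output_profile="kindle", remove_paragraph_spacing=True)`. As with the dialog, this only works on books without DRM.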

Limitations of conversion

Converting between e-book formats does have some limitations. One limitation of using a tool like Calibre is the inability to edit the book before conversion. Calibre simply moves the content and formatting from one format to another. It is not an editing tool. If there are typos in the text, you will need to use a dedicated editing tool such as Sigil or Book Designer.

Another issue that often arises is missing formatting. Not all e-book formats support the same formatting. It can be lost when converting to a format that supports limited or no formatting. Basics like bold and italics will be preserved in most cases, but complex page layout may not be. MOBI and EPUB both support complex formatting, so you won’t have to worry about this when using these formats.

Finally, converting will only shift what the input provides into another format. It will not add anything to the output that was not already in the input. So if the input is poorly formatted, the output will be too.

PDBs: they are not all created equal

This is of particular importance to Barnes and Noble Nook owners. Barnes and Noble sells books in the PDB format (along with EPUB) and as you might expect it is supported by the Nook.

PDB is not really an e-book format. It is a container for e-book formats. Think of it like a zip file. You put other files into a zip file so you only have to worry about having one file instead of many. That is what PDB essentially does for e-books. I know of 28 e-book formats that can be put into the PDB container.

An e-book reader like the Nook which supports PDB does not support all the possible formats that can be within a PDB file. The two most common formats found in PDB files are PalmDoc (also known as textread and Aportis) and eReader. PalmDoc does not support any formatting or images. eReader supports basic formatting and 8-bit images. The PDB files sold by Barnes and Noble are in the eReader format.
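
Because the extension alone does not tell you what is inside a PDB file, a small script can look at the file header instead. The sketch below rests on details that are not in this article and are my assumptions: that a PDB header stores a four-byte type and four-byte creator code at byte offset 60, and that PalmDoc files use `TEXtREAd` while eReader files use `PNRdPPrs`. The function names are made up for this example.

```python
# Type + creator codes (8 bytes) for the two PDB flavours named above.
# These values are assumptions from outside the article.
KNOWN_PDB_IDENTITIES = {
    b"TEXtREAd": "PalmDoc",
    b"PNRdPPrs": "eReader",
}


def pdb_identity(header):
    """Return the 8-byte type/creator code at offset 60 of a PDB header."""
    if len(header) < 68:
        raise ValueError("too short to be a PDB file")
    return header[60:68]


def describe_pdb(path):
    """Name the format inside a PDB file, or show the raw code if unknown."""
    with open(path, "rb") as handle:
        identity = pdb_identity(handle.read(68))
    return KNOWN_PDB_IDENTITIES.get(identity, "unknown (%r)" % identity)
```

Calling `describe_pdb("book.pdb")` on a file bought from Barnes and Noble would be expected to report eReader, if the assumptions above hold.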

DRM: the bane of conversion

DRM, as noted, stands for Digital Rights Management.

Let’s think about physical books for a moment. With a physical book, you can lend or sell that book. But when you do either, you have to go without the book. With e-books, that is not the case. E-books are just files on the computer, and they can be copied any number of times and given away any number of times. DRM was created to prevent unlimited copying of an electronic file (although some e-book users would also note that it is a handy way for companies to try to lock them into specific brands).

DRM affords different books various rights as determined by the publisher and seller. Some can be read on more than one device. Some will allow for partial copying and printing. Simply put, DRM restricts what you can do with an e-book.

Any e-book with DRM cannot be converted to a different format. This is because conversion itself would require the removal of the DRM. Not all e-book formats support DRM and different e-book formats support different sets of privileges granted by the DRM. There is no way to move the DRM with the content when converting; thus DRM prevents conversion.

You might be tempted to look for some way to remove DRM from e-books in order to facilitate conversion. A word of warning about doing this: In the USA there is a law known as the Digital Millennium Copyright Act (DMCA). This law makes it illegal to circumvent a copy protection system (DRM). It also makes it illegal to produce tools, distribute tools, and aid in circumvention. Not everyone lives in the USA, but many countries have similar laws. Check your local laws and realize that even though you may only want to read an EPUB book you’ve legally purchased on your Kindle, it may not be legal to do so. If you don’t like this silliness (and I don’t), then speak up to whoever in your country makes the relevant laws.

John’s bio: “I have a bachelor’s in Sociology with a minor in Anthropology from Arizona State University. I contribute to the Calibre e-book application and maintain a number of input and output formats. I also actively work on the device infrastructure and anything else that catches my interest. I have a few e-book readers but prefer the Cybook Gen 3. I mostly read fantasy and some science fiction. But I do enjoy the classics.”

126 Comments on The ABCs of e-book format conversion: Easy Calibre tips for the Kindle, Sony and Nook

  1. I think it’s worth pointing out that Mobi, PRC, and AZW are the same format. If a reader reads one of those formats, it can read the others. It’s the same case with LRF/LRX and AZW1/TPZ. Or maybe I’m mistaken…?

  2. Frode Aleksandersen // January 3, 2010 at 3:36 pm //

    Another thing to note is that the Sony readers don’t handle page numbers well in EPUB. LRF is still a better format for reading on the Sony devices. EPUB is probably better for archival purposes though, in case you switch to a different device that supports it and don’t want to go through another round of format conversion.

    The article is also not quite correct about the legalities of DRM removal, as the following exemptions apply:

    – if the DRM prevents the text to speech function
    – if the ebook in question is only available from suppliers in a format that is not compatible with your reader.

    http://en.wikipedia.org/wiki/Digital_Millenium_Copyright_Act#Anti-circumvention_exemptions

  3. Spider, you are partly right. In the case of the mainstream e-book readers (Kindle, Sony, Nook) it is okay to assume that if it supports one it will support the other related formats. However, there are technical differences between those formats, making them not 100% interchangeable.

    MOBI and AZW are indeed the same format. AZW1 and TPZ are also the same format. PRC is a generic extension that could be MOBI, TPZ or a few other formats. LRF and LRX are similar. The LRX extension signifies that it is an LRF file with DRM attached.

    On the MOBI and AZW front, AZW is only sold by Amazon. The DRM system used by both MOBI and AZW locks the file so that it’s readable only on a particular device. Amazon doesn’t allow AZW files to be read on e-book readers other than the Kindle. Meaning if you have a reader that can read MOBI that doesn’t mean it can read AZW files from Amazon. Technically it’s possible but due to restrictions placed by Amazon there is no easy or legal way to do so.

    Also, it is possible for a reader to support LRF but not LRX because supporting the format does not necessitate supporting the additional DRM for that format.

  4. “- if the DRM prevents the text to speech function”

    I believe this only applies if you are blind.

    “- if the ebook in question is only available from suppliers in a format that is not compatible with your reader.”

    Nowhere on the linked wikipedia page does it say this. As far as I’m aware there is no exception for this case.

  5. Hi, Spider and Frode. Thanks for your comments.

    Spider: I’ll let John address your remarks.

    Frode: I’ll welcome your elaborating on the Sony reader page number issue. I do think that ease of switching to other devices is a major reason to favor ePub.

    Here in the United States at least, people are stuck with DRM in ways that John described. The TTS argument won’t work with people without disabilities. Hard to believe. When it comes to DRM, the U.S. is among the most backwards of countries, and our ways must be perfectly inscrutable to many sensible people elsewhere.

    Happy holidays,
    David

  6. I meant that the formats are the same sans DRM. I’ve never had a PRC file that wasn’t mobi so I don’t know about the variations of this extension.

    Mobi and ePub DRM is easy to remove and there has never been a case in the United States of someone getting fined for removing DRM for personal use. Such a case would be very hard to prove anyway, and many doubt that a court would rule against an individual for removing DRM. It may ultimately be seen as unconstitutional. A lot of laws are challenged and overturned or redefined in the US. I think it’s a large exaggeration to say there is much danger to the individual to strip DRM. I recommend everyone learn how to do it.

  7. Frode Aleksandersen // January 3, 2010 at 4:28 pm //

    The format conversion exemption is listed in the source for the text at wikipedia:

    (section 4) http://www.copyright.gov/1201/2006/index.html

    Note also that it does not say anything about being blind or disabilities. While I agree that is certainly the intent, it should open it up for anyone needing the functionality for whatever reason. An exemption wouldn’t be much good for anything if you start trying to add unwritten limitations to it.


    The page number problem with Sony readers (or at least with the 505 – haven’t tried the newer models) is that EPUB has a fixed number of pages. When the text is reflowed on the reader, it still maintains those page numbers, regardless of formatting or font size. The result is that when you look at the page numbers at the bottom you may see something like page “4-5 of 576″ and then when you turn the page you get “5 of 576″, and so on. I like being able to tell how much I have left of a book I’m reading, but without proper page numbers that can become rather difficult.

  8. Frode Aleksandersen // January 3, 2010 at 4:33 pm //

    One thing I forgot to mention – those exemptions are only valid for 3 years at a time, and since they were put down on paper in 2006, they’re up for review. The new list has to my knowledge not been published yet, but there may be changes as a result.

  9. Frode Aleksandersen, you actually need to read the exception as it is stated in the documents linked with the summary. It specifically states that the purpose for the exception is for use by the blind.

    Page 37 in regard to section 4: “This read-aloud functionality of the ebook reader and the text-to-speech (TTS) or text-to-braille (TTB) functions of the screen reader software create tremendous potential for accessibility to works that might otherwise be unavailable to the blind and visually impaired.”

    Spider, the Kindle for PC application, for instance, downloads all e-books with the PRC extension regardless of them being MOBI or TPZ.

    Even with criticism of the DMCA and lack of cases I cannot morally justify telling someone that they should break the law or how. That information is available for those who want to find it. Even if I don’t agree with aspects of that law.

  10. I think that fair use actually covers conversion of these for personal use/formatting. I personally refuse to purchase any ebooks that cannot be stripped of their DRM, but thankfully Sony has switched to epub making this very easy to do.

    I’ve found that the best formatting comes from HTML ebook files, which have the best margins and use the largest amount of screen space once Calibre converts them.

  11. Frode Aleksandersen // January 3, 2010 at 8:56 pm //

    Ahh, I’d only read the summary. Thanks for the update on that, and sorry for any misunderstanding I may have caused.

  12. Justin, while it might be legal under fair use, the same is not the case under the DMCA. Fair use only applies to copyright. The DMCA is a separate law and, while it has copyright in the name, is not an extension of copyright. The DMCA adds additional restrictions on top of what is granted by copyright. You can be completely within the guidelines of fair use and not in violation of copyright while still violating the DMCA.

  13. Converting eBooks between formats can cause some layout information to be lost. In general this isn’t a problem when moving from an older (simpler) format to newer more flexible formats (ePub for instance). If you are considering a mass conversion, do run some test books through first and compare the results for items like bold, italics, and other simple features! Calibre does a fine job of conversion on the whole, but some of the other tools (desktop Stanza I am looking at you) are very basic and don’t do a good job.

  14. My friend has a Sony reader and she wants a ebook sold at Barnes and Noble… (NOOK) can she just buy the book dl it to her pc and send it to her reader??

  15. I just purchased an iPad to load the many computer manuals I have in PDF. I converted to epub using the iPhone setting for device. I was successful in loading the converted file to my iPad but the paging is all wrong. It was wrapping because the page width was formatted for iPhone. How can I adjust the page width so that it will output in the right width?

  16. For some strange reason known only to Barnes and Noble, it looks like ebooks purchased at their website and downloaded directly to a Nook are usually in the ePub format, and they use the B&N social engineering DRM (key is name and credit card number), but if they are downloaded from their website to a computer, they are usually downloaded in the eReader format. Currently, only the Nook is able to read the B&N style DRM ePub format. Supposedly Adobe will be releasing updates soon so their Digital Editions software and the Mobile reader for other readers will be able to read the B&N style DRM. When this occurs, it will be up to Sony to decide whether they will push out an update for existing readers or only support it in new readers.

    Of course, if your friend is computer savvy, and willing to install python and the scripts to strip DRM and convert it from Ereader format to ePub, even though it may be illegal in your friend’s country to do so, there is that option as well.

  17. Dauphine,

    Converting PDF manuals to an eBook reader format is possible but painful, particularly if there are a lot of figures and illustrations. A much easier solution is to download “GoodReader” ($0.99 and currently the #4 top paid app for the iPad). This is a wonderful PDF reader and provides many ways to get PDF’s onto your iPad.

  18. I have a general question for the Calibre experts. I’ve used Calibre to convert a bunch of .lit files to .epub format for the iPad, but the resulting files don’t have any table of contents info. Is there an easy way to fix this?

  19. I recently bought my first ebook reader, a COOL ER reader, and downloaded my first ebook, which came as a DRM protected Adobe epub. However, when I loaded it on to my reader and put the text size up to max (because my sight is very poor and I am registered as blind), the page had such wide margins that only a small square in the middle of the screen was shown with about 8 words in it.

    I made a .doc file of a story I’d written myself into a simple .txt file and that filled the whole screen and was very easy to read, so I know it’s the file formatting and not my reader that is the problem.

    Does this mean that every time I buy a new book I’ll have to go through the complicated looking procedure of stripping the DRM off just to read it in a large font? That being the only reason I bought an ereader in the first place.

    I can’t afford to buy a reader with TTS yet as there aren’t any here in the UK so I’d have to pay to import.

  20. @Kate: each EPUB file specifies what the default font size is and how much margin to leave. So both will vary from one e-book to another.

    As for Text-To-Speech, as far as I know the Kindle is available in the UK.

  21. It is, but only if imported from the States, and when you add the extra taxes it’s ridiculously expensive for what you get, and Amazon turns off TTS on a lot of its books through the DRM. Plus if anything goes wrong with it, it’d have to go back to the States. To be honest Amazon have really taken the piss with the Kindles as far as anybody living outside the US is concerned.

    I am hopeful that maybe other e-books might have better formatting if it does differ from book to book, so that’s good to know, thanks.

    The Asus large screen eReader is what I’m really looking forward to but no money and no release date yet so I’ll have to be patient.

  22. I recently got a Nook and I have some older Baen CD rom libraries that only have these formats on them: .doc, .lit, .prc, .rb, .rtf and of course many html pages.
    DRM is not an issue with these disks.

    What I want to know is which format would be the best to convert to epub using Calibre? Also there is a large variation in file size with those formats for the same book; would that also have an impact on the conversion?

    example: .doc = 4.1 mb, .lit = 1.5 mb .prc = 1.1 mb and .rtf = 4.9 mb

  23. The large variation in size is because some formats are compressed and others are not.

    From the Calibre FAQ:
    What are the best source formats to convert?

    In order of decreasing preference: LIT, MOBI, EPUB, HTML, PRC, RTF, PDB, TXT, PDF

  24. Lynn Caprarelli // June 11, 2010 at 6:48 pm //

    please help – i downloaded calibre but it doesn’t recognize my books. they are for my reb 1200 and are in a folder with a .res extension on the folder. it appears the data files themselves have a .frk extension. i could zip it up and send it to you. i am getting a kindle today and i am hoping not to have to re-buy all my books. thank you so much. i have about 25 books to convert.

  25. @Lynn Caprarelli:

    It sounds like your e-books have been “exploded” out of the original .IMP files into folders containing their component parts.

    See if you can find the .IMP files. That’s what Calibre needs to work with.

    I don’t know if you can easily put the pieces back together into .IMP files. You might check here: http://wiki.mobileread.com/wiki/IMP_software
