24

babel

ePub is the magic that will rescue us from the crumbling Tower of eBabel and give us e-books that Just Work.

Or not.

Here is the experience of a simple-minded publisher who believed what he was told about ePub. Perhaps there are some morals to be drawn. But if I’m not simple-minded, just simple, please correct me gently!

The book I’m formatting as an e-book is a collection of short stories. Here’s a link to one of them, so you can follow my reasoning if you want.

The formatting of these stories has two major complexities:

  1. One or two paragraphs have extra space before them, to indicate the start of a new section.
  2. There is one limerick, which has to be set as verse.

I’ve been checking my ePub files in Adobe Digital Editions (as a proxy for the Sony Reader), on O’Reilly’s Bookworm web site, and in Stanza on my iPod Touch. I also viewed and checked the actual XML in a standard web browser. I’m sure there are many other possibilities, but you have to stop somewhere. I built the files with Dreamweaver, so as to have full control of all the XML and CSS.

Space before paragraphs

Everyone knows enough CSS to know that the ‘margin-top’ property will control the space before a paragraph. Except that in Bookworm it doesn’t, because Bookworm ignores all the CSS in the ePub file (do View > Source if you don’t believe me). And Stanza takes over all the margin settings for its own purposes. Result: a short story that drones on, paragraph after paragraph, with no refreshing pauses to give it rhythm.

For my next attempt, I used the ‘padding-top’ property. This still fails in Bookworm, of course, but Stanza doesn’t seem to mess it up. So, victory!
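
For anyone who wants to see it spelled out, here is a minimal sketch of the section-break styling. The class name ‘section-break’ and the 1.5em value are assumptions made for illustration, not taken from the book’s actual stylesheet, and the style rule would normally live in the document head or a linked CSS file rather than next to the text.

```html
<style type="text/css">
  /* Hypothetical class for the first paragraph of a new section.
     The first attempt used margin-top here, which Stanza overrides.
     padding-top asks for the same gap, and Stanza leaves it alone.
     Bookworm ignores the stylesheet either way. */
  p.section-break { padding-top: 1.5em; }
</style>

<p>The last paragraph of the previous section.</p>
<p class="section-break">The first paragraph of the new section.</p>
```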

Verse

There was an old man of Zermatt
Who was really exceedingly fat.
···Because we were thinner
···We had him for dinner.
Now what could be nicer than that?

Now what could be easier than that, in typesetting terms? A handful of paragraph styles controlling spacing and indentation, and there you are.

Except: Bookworm ignores the CSS, so we’ll get huge spacing, and Stanza has its own ideas about margins and won’t listen to our CSS, so we’ll get huge spacing. In both cases, a mess.

The cure is to set the whole verse as a single paragraph, with ‘br’ tags between the lines. But even that is not enough, because some readers (such as Adobe Digital Editions) will indent the first line of a paragraph, which ruins the verse spacing. So we have to put a ‘br’ tag in front of the first line as well, which means that there’s an ugly blank space before the whole poem.
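
Roughly, the limerick ends up looking like the sketch below. The non-breaking spaces (‘&#160;’) used to indent the third and fourth lines are an assumption about how the indent might be done without CSS; the leading ‘br’ is there so that any first-line indent lands on an empty line.

```html
<p>
  <br />There was an old man of Zermatt
  <br />Who was really exceedingly fat.
  <br />&#160;&#160;&#160;Because we were thinner
  <br />&#160;&#160;&#160;We had him for dinner.
  <br />Now what could be nicer than that?
</p>
```

Nothing here depends on a stylesheet, which is the point: it survives a reader that throws the CSS away, at the cost of the blank line at the top.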

So you can’t set verse in ePub without knowing what software is going to be reading it.

Typesetting quality

People who set type for a living spend much of their time slaughtering widows and orphans. Adobe Digital Editions manages to eliminate them, though not perhaps in the most elegant way. Stanza, on the other hand, is a one-man benevolent institution. The last line of a paragraph is frequently left dangling at the top of a page; the first line of a paragraph is frequently stuck at the bottom of a page; and it has a love of hyphenating the last word on a page, even if that leaves only 4½ words of the paragraph floating absurdly at the top of the next page.

Yes, hyphenation… I conduct a running war against malicious publishers who hyphenate “wouldn’t” before the ‘n’, but Stanza goes one better than them. In the sentence ‘And how is the butcher’s meat, Valerie dear?’ Stanza manages to put a hyphen after the question mark and before the closing quote. It looks novel.

The moral of the story

The biggest complaint about the content of e-books today is that the quality is so low. This is not just an aesthete’s whinge. Beautiful text is readable text. It is our duty to our readers to do the best we can in e-books, just as we do in print. But despite all the promise shown by ePub, it fails, in practice, to provide the consistency that we need.

You will say that this is a software question; but I say that there is no software involved. No-one really uses Adobe Digital Editions: they use the Sony Reader or another machine. No-one really uses Stanza: they use the iPhone or the iPod Touch. These things are appliances, not computers. Users have no concept that there is any software there at all — do you know what software powers your DVD player? — and so they have no incentive to put pressure on the software suppliers to make things better.

Even if pressure could be brought to bear, it may already be too late. Suppose you are responsible for maintaining Stanza, and you read this post. What will you do? Correct Stanza to be consistent with Adobe? Surely not. Because that will mean that every existing e-book whose author has tweaked it to work with Stanza will instantly look broken. The losers in any change will notice at once, and make a lot of noise; the winners will not even notice they’ve won.

Evolutionary biology has the concept of a speciation event: when two populations find that it is to their benefit for interbreeding to be impossible. Has this already happened within ePub?

Is it already too late?

 