I remember very precisely my first day with an e-ink device. I realized instantly what a difference it makes to read on a screen that looks like paper. But for some reason, the whole experience still felt like reading on a screen rather than reading a book. It took me a few days to pin down the cause: typesetting. In a book, I'm used to hyphenation, kerning, control of widows and orphans, and so on. On a screen, the typesetting is usually very limited. While the screen looked like paper, the text looked like something that a screen displayed.

I quickly worked around this problem by using PDF files created for the device, but it's still something that no reflowable format and no reading system has solved yet.

Why should EPUB add support for hyphenation then? Customers expect the same quality of experience from an e-book as from a printed book. Publishers are very picky about typesetting too. So let's see how EPUB could add support for hyphenation.

Soft Hyphen

According to the W3C:

“In HTML, there are two types of hyphens: the plain hyphen and the soft hyphen. The plain hyphen should be interpreted by a user agent as just another character. The soft hyphen tells the user agent where a line break can occur.

Those browsers that interpret soft hyphens must observe the following semantics: If a line is broken at a soft hyphen, a hyphen character must be displayed at the end of the first line. If a line is not broken at a soft hyphen, the user agent must not display a hyphen character. For operations such as searching and sorting, the soft hyphen should always be ignored.

In HTML, the plain hyphen is represented by the “-” character (&#45; or &#x2D;). The soft hyphen is represented by the character entity reference &shy; (&#173; or &#xAD;)”

Soft hyphens are already supported by the current OPS specification for EPUB. I can see one situation where this is useful: you're using a technical word that might not be available in a hyphenation dictionary, so you manually specify where the word can be hyphenated.
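To make the soft hyphen semantics concrete, here is a minimal sketch in Python. It is an illustration only, not part of OPS or any reading system: the function names and the break positions chosen for "hyphenation" are assumptions made up for this example. It inserts U+00AD (the character behind &shy;) at hand-picked positions, strips it again for searching, and shows what a line broken at a soft hyphen would display.

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD, the character that &shy; stands for


def add_soft_hyphens(word, break_positions):
    """Insert a soft hyphen before each character index in break_positions.

    The positions are chosen by hand here (hypothetical helper); nothing
    is looked up in a hyphenation dictionary.
    """
    parts = []
    previous = 0
    for position in sorted(break_positions):
        parts.append(word[previous:position])
        previous = position
    parts.append(word[previous:])
    return SOFT_HYPHEN.join(parts)


def strip_soft_hyphens(text):
    """Remove soft hyphens, as searching and sorting should ignore them."""
    return text.replace(SOFT_HYPHEN, "")


def render_break(text, break_index):
    """Return the two displayed fragments if the line is broken at the
    soft hyphen number break_index (0-based).

    A visible hyphen is shown only at the break; the other soft hyphens
    are not displayed.
    """
    pieces = text.split(SOFT_HYPHEN)
    first_line = "".join(pieces[: break_index + 1]) + "-"
    second_line = "".join(pieces[break_index + 1 :])
    return first_line, second_line


hyphenated = add_soft_hyphens("hyphenation", [2, 6, 7])  # hy-phen-a-tion
```

With these positions, breaking at the second soft hyphen displays "hyphen-" on the first line and "ation" on the next, while a search for "hyphenation" still matches once the soft hyphens are stripped.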

But using soft hyphens throughout the whole flow of a book? Bad idea. First of all, you can't expect all books to be hyphenated this way. It also makes the process of generating a book much more difficult.

The CSS3 way

In the current version of the working draft for the CSS3 Generated Content for Paged Media Module, there's a section dedicated to hyphenation.

The two most important properties are:

  1. hyphens: sets how a text is hyphenated. You can disable hyphenation, use the manual setting (certain characters and soft hyphens) or use the auto setting, which fully hyphenates the text based on the resources specified.
  2. hyphenate-resource: lists the hyphenation resources to use when hyphens is set to auto.

You can also get extra control over the way your text is hyphenated using: hyphenate-before, hyphenate-after, hyphenate-lines & hyphenate-character.
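As a rough illustration of what the extra controls do, the Python sketch below filters candidate break points. It assumes that hyphenate-before and hyphenate-after set the minimum number of characters that must remain before and after the break, which is how I read the draft; the function name, its parameters and the candidate positions are hypothetical and not taken from the specification.

```python
def allowed_breaks(word, candidates, min_before=2, min_after=2):
    """Keep only the break positions that leave enough characters on each side.

    min_before stands in for hyphenate-before and min_after for
    hyphenate-after (assumed meaning: minimum character counts).
    candidates are character indexes where a dictionary or a soft hyphen
    would allow a break.
    """
    kept = []
    for position in sorted(candidates):
        chars_before = position
        chars_after = len(word) - position
        if chars_before >= min_before and chars_after >= min_after:
            kept.append(position)
    return kept


# hy-phen-a-tion: candidate breaks before indexes 2, 6 and 7
breaks = allowed_breaks("hyphenation", [2, 6, 7], min_before=3, min_after=4)
```

Raising either minimum removes the breaks closest to the start or the end of the word, which is how a publisher could avoid leaving two-letter fragments at the end of a line.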

I love the way you can automatically hyphenate the text using a list of resources: it's very simple to set up. The same way that we can embed fonts, we could also embed hyphenation dictionaries. The only problem in this case is that we may end up with large files. To limit that, the reading system could ship with a few hyphenation dictionaries for the most common languages. And if you're using something more exotic (technical words for example), you could also embed the right resource in your file.

In XSL, you can specify the language too. Specifying the language would be compulsory if you rely on the hyphenation dictionaries that ship with the reading system, but it is also a very powerful tool if it's used the right way with hyphenate-resource. If your text uses more than a single language, you could specify a different hyphenate-resource for each language.
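The lookup described above could work roughly as in this Python sketch. Everything in it is an assumption made for illustration: the function name, the dictionary file names, and the priority order (a resource embedded in the book wins over one that ships with the reading system) are not defined by EPUB or by the CSS3 draft.

```python
def pick_hyphenation_resource(lang, embedded, built_in):
    """Choose a hyphenation resource for a language tag such as "fr-CA".

    embedded: resources packaged in the book, keyed by lowercase language tag
    built_in: resources that ship with the reading system, same keying
    Returns a (source, resource) tuple, or None if nothing matches, in
    which case the text would simply not be hyphenated automatically.
    """
    tag = lang.lower()
    primary = tag.split("-")[0]  # "fr-ca" falls back to "fr"
    candidates = [tag] if tag == primary else [tag, primary]

    # Assumed priority: the publisher's embedded resource comes first.
    for candidate in candidates:
        if candidate in embedded:
            return ("embedded", embedded[candidate])
    for candidate in candidates:
        if candidate in built_in:
            return ("built-in", built_in[candidate])
    return None


# Hypothetical resources for a book that mixes French and English
embedded = {"fr": "hyph_fr.dic"}
built_in = {"en": "system-en", "fr": "system-fr"}
french_choice = pick_hyphenation_resource("fr-CA", embedded, built_in)
english_choice = pick_hyphenation_resource("en-GB", embedded, built_in)
```

Here the French passages use the dictionary embedded in the book, while the English passages fall back to the one provided by the reading system, so the file only carries the resource it really needs.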

Conclusion

Hyphenation is just a single example of how typesetting support can be improved in EPUB. It is fairly easy to add this sort of support, although it would add some extra processing on the reading system.

Moderator’s note: I heartily agree with Hadrien’s concerns but would be interested in hearing from those who don’t. Why shouldn’t .epub get more typographical capabilities? Genuine bibliophiles love not just well-done hyphenation but also goodies such as kerning. I can enjoy hyphenation when reading .epub files in FBReader, but if there are ways to make hyphenation part of the official spec in a way that improves life for readers and developers alike, then why not? FBReader’s hyphenation is nice but far from perfect. I’m not even sure if Digital Editions, Adobe’s .epub app, offers hyphenation.

On another issue, no, that isn’t an official IDPF logo at the top of Hadrien’s fine article—rather, a suggested image from BookGlutton. What do you think of the possible prototype, though? Got any images of your own? Think the TeleBlog should hold a prototype contest to encourage the IDPF to take official action on a logo? Here’s hoping that IDPF will act on the “Intel Inside”-type idea that I proposed earlier. – D.R.
