
Part one in a series exploring the state of e-book publishing today. Today’s installment is one of several by New York editor Roger Sperberg about publishing’s failure to use XML markup as the base for creating an electronic future for the book industry.

eReadster: eFail!

What’s XML for? Perhaps if publishers understood that question, we would be farther along the road to e-books, and to whatever the thing is that subsumes e-books into a richer medium without forgoing book-ness.

I was speaking with Jess Lawson of Oxford University Press earlier this week about using XML in book production. The desirability of an XML workflow comes across more clearly, I observe, when it’s called “XML first,” as OUP does. Adding XML markup for web or e-book delivery after a standard birth — inception, editing, production — enables electronic delivery but seems to be worth only about as much trouble as it takes. After-the-fact XML brings little additional benefit.

I remember a slide that Tommie Usdin of Mulberry Technologies showed at an XML conference ten years ago. It stated simply, “Markup is expensive.” At about the same time, Jon Bosak of Sun did some back-of-the-envelope calculations that put the break-even point for the extra cost of adding markup at about 1.8 uses of the content.

What’s that mean? By Bosak’s rule of thumb, if I were to publish ten books with ten chapters each, the additional cost incurred by structured markup like XML or SGML would be met by the simple re-use of 80 of the 100 chapters (100 first uses plus 80 re-uses makes 180 uses, or 1.8 per chapter), whether on the web, in advertising, in custom publications or in re-purposed derivative works. If I remember correctly, Jon’s data came from Sun’s own experiences, in which material describing computer subsystems would be used in documentation for many different final products, with some descriptions even making their way into marketing handouts.

Trade publishing doesn’t have so many opportunities for repackaging, but re-use is as simple as utilizing the same source for different editions. So the added cost of XML markup is met if, say, I publish four of my ten texts in hardcover, mass-market and large-print editions: two extra editions of four titles is eight re-uses, which again averages 1.8 uses per title. Fifteen years after the internet’s appearance and well into the second coming of e-books, this seems a rather crude justification. But fifteen years ago, those three editions likely would have been produced (keyboarded, formatted, proofed) in three entirely separate editorial workflows. In 1994 I was working with Ballantine Books, and even then setting the paperback from the hardcover text files was the exception rather than the rule.
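The arithmetic behind both examples is simple enough to check in a few lines. What follows is a minimal sketch in Python, not anything Bosak published: the function names, and the choice to treat the 1.8 figure as an exact threshold, are assumptions made here for illustration.

```python
from fractions import Fraction

# Break-even figure quoted in the text: about 1.8 uses per unit of content.
# Treating it as an exact threshold is an assumption of this sketch.
BREAK_EVEN_USES = Fraction(18, 10)


def average_uses(units: int, extra_uses: int) -> Fraction:
    """Average uses per unit: one first use each, plus all re-uses."""
    return Fraction(units + extra_uses, units)


def pays_off(units: int, extra_uses: int) -> bool:
    """True when average uses reach the assumed break-even threshold."""
    return average_uses(units, extra_uses) >= BREAK_EVEN_USES


# Ten books of ten chapters, with 80 of the 100 chapters re-used once.
chapter_case = average_uses(100, 80)

# Ten titles, four of them in three editions (two extra uses per title).
edition_case = average_uses(10, 4 * (3 - 1))

print(float(chapter_case), float(edition_case))
```

Both cases land on the same average because they are the same ratio at different scales: 80 re-uses over 100 chapters, or 8 over 10 titles.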

Single-source “P- and E-” publishing appears to drive the publishing industry’s slow turn to XML workflows. Markup is expensive, and the uncertain economics of our electronic future means the sight line for pay-back on that extra expenditure must be short, direct and obvious.
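To make “single source” concrete, here is a rough sketch of one marked-up chapter feeding two outputs. The element names (chapter, title, para) and both rendering functions are hypothetical, invented for this illustration rather than taken from any publisher’s schema or workflow; only Python’s standard library is used.

```python
import xml.etree.ElementTree as ET
from html import escape

# A hypothetical chapter in a made-up, structure-only vocabulary.
SOURCE = """<chapter id="ch1">
  <title>What Is XML For?</title>
  <para>Markup is expensive.</para>
  <para>Re-use pays for it.</para>
</chapter>"""


def to_html(chapter: ET.Element) -> str:
    """Render the chapter as a simple HTML fragment (the E- edition)."""
    parts = [f"<h1>{escape(chapter.findtext('title', default=''))}</h1>"]
    parts += [f"<p>{escape(p.text or '')}</p>" for p in chapter.findall("para")]
    return "\n".join(parts)


def to_plain_text(chapter: ET.Element) -> str:
    """Render the same chapter as plain text, standing in for a print feed."""
    title = chapter.findtext("title", default="")
    lines = [title.upper(), ""]
    lines += [p.text or "" for p in chapter.findall("para")]
    return "\n".join(lines)


chapter = ET.fromstring(SOURCE)
html_edition = to_html(chapter)
text_edition = to_plain_text(chapter)
print(html_edition)
print(text_edition)
```

The point of the sketch is only that the text is keyed and marked up once, and each edition is a different rendering of that one file.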

Perhaps this is one reason “electronic” books are scarcely electronic at all, but only scantily draped in the most superficial of markups, our ever-present HTML. With its ready use in even the most rudimentary web pages, HTML markup must seem like a no-brainer to those publishers venturing into e-books. Who wants to invest millions in markup with no way of ensuring a return?

To return to my opening question — Why XML? — we won’t understand the answer until we first realize that the responses publishers most often rely upon are really answers to Why HTML? or Why single-source? Years ago, Bob Stein argued that we couldn’t exploit the electronic side of publishing unless authors understood what that meant (and then he set about building new authoring tools).

Today, I would argue that we can’t exploit E- (yes, “E hyphen” is my abbreviation of e-publishing and fiddle for anyone else’s use of “electronic” as a situational attribute) until editors understand XML as well as they understand English grammar, and regard metadata as being as valuable as a plug on Oprah. Only then will e-books contain the structural elements that will make them more valuable than p-books.

This is only the first of many broadsides that I will launch here from Teleread’s ramparts. I also splutter as @eReadster on Twitter.

 