
One minute I was reading Roger Sperberg’s “Why Do Publisher’s Need XML?” post, and the next time I looked, I was typing like a mad woman.

“Wait!” I thought to myself. “He’s only talking about one facet of publishing, but making it read like this is true for all of publishing! What about all of the other facets of publishing that do need XML?”

Huh? Publishing has facets? Well, yeah. I’ve worked in multiple facets of the publishing community: University Press, Journals, Technical Publications, Textbooks for K-12, Higher Education, and Continuing Education, as well as Enterprise Publishing at some very large companies.

I agree completely with Paul Topping’s comments on the Sperberg article. Accessibility and Process-ability are two huge reasons to not write off XML as a fad or lost cause in the publishing community.

The relevance of markup is contextual to the line of business doing the publishing, and to what the business intends to do with the content in the future.

One thing is certain: if you cannot programmatically get to a piece of content, whether by metadata or by walking a markup tree, the potential to reuse that piece of content or provide any sort of value-added processing is greatly diminished.
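To make “getting to a piece of content” concrete, here is a minimal sketch in Python using the standard library’s `xml.etree.ElementTree`. The markup is invented for illustration: the `book`, `chapter`, `title`, and `para` elements and the `subject` attribute are assumptions, not part of any real publishing schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: element and attribute names are made up for this sketch.
SAMPLE = """
<book>
  <chapter id="c1" subject="training">
    <title>House Rules</title>
    <para>Set boundaries early.</para>
  </chapter>
  <chapter id="c2" subject="health">
    <title>Feeding</title>
    <para>Measure every meal.</para>
  </chapter>
</book>
"""

def titles_by_subject(xml_text, subject):
    """Select content by metadata: chapter titles whose 'subject' attribute matches."""
    root = ET.fromstring(xml_text)
    return [
        chapter.findtext("title")
        for chapter in root.iter("chapter")
        if chapter.get("subject") == subject
    ]

def all_paragraphs(xml_text):
    """Select content by walking the tree: the text of every <para>, in document order."""
    root = ET.fromstring(xml_text)
    return [para.text for para in root.iter("para")]
```

Neither lookup is possible against content that exists only as formatted pages; both depend on the structure being marked up in the first place.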

In addition to Paul’s comments regarding accessibility and calculations, I would like to point out that other publishing processes exist that regularly benefit from marked-up content. One example that comes to mind is a pivot table that is not based on numerical data, like the kind of data that is often collected for clinical data reports in large clinical trials.

Another example comes directly from learning content: Test Questions and Answers. Not only can you deliver this content more intelligently with markup, but the IMS QTI standard has also designed markup to handle automated processing of test question scoring, as well as logical delivery of further test questions based on real-time information submitted by users.

But not everyone needs or wants to consider going to this level of detail if the financial side of things does not make sense, or the need to reuse doesn’t exist (which I find hard to believe these days with the number of different ways to publish a single book), or the need for a process adding additional value to content is just not there.

The business process for publishers who work with narrative content (that is, fiction and other “soft” side subjects: basically, anything that is not related to science or math, or content such as certification test preparation guides) does not currently require identifying content in minute detail to get an acceptable end result: a hardbound book, paperback, and/or e-book. (Just to reiterate, these are products of the now, but not necessarily the sum total of products that will be available in the future.)

The near future ROI for soft-side publishing does not exceed the costs of converting narrative content from authored originals to XML, cleaning up the content so that it can be consistently processed, and building automated workflows required to maintain and process the content in the future.

Unless… Unless you want to do something innovative, like automatic composition that aggregates relevant content from a single source based on audience (think of the original version of Marley & Me vs. the children’s book Bad Dog, Marley!, or a student/instructor version of an English literature anthology), or the ability to single-source products with dramatically different presentations (think of Cesar’s Way Deck: 50 Tips for Training and Understanding Your Dog and Cesar’s Way: The Natural, Everyday Guide to Understanding and Correcting Common Dog Problems; the first is a deck of cards, the second a bound book, and much of the material could have come from the same source).
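As a rough illustration of audience-based composition, the sketch below builds a student or instructor edition from one source file by pruning anything tagged for a different audience. The `audience` attribute and the element names are assumptions chosen for the example, not taken from any particular DTD or from how the books above were actually produced.

```python
import xml.etree.ElementTree as ET

# Hypothetical single source: untagged content is shared by every edition.
SOURCE = """<anthology>
  <poem><title>Ozymandias</title></poem>
  <notes audience="instructor"><para>Discussion prompts.</para></notes>
  <questions audience="student"><para>Who is speaking?</para></questions>
</anthology>"""

def prune(parent, audience):
    """Remove children tagged for another audience; recurse into the ones kept."""
    for child in list(parent):  # copy the child list so removal is safe while looping
        target = child.get("audience")
        if target is not None and target != audience:
            parent.remove(child)
        else:
            prune(child, audience)

def build_edition(xml_text, audience):
    """Parse the single source and return the tree for one audience's edition."""
    root = ET.fromstring(xml_text)
    prune(root, audience)
    return root
```

Calling `build_edition(SOURCE, "student")` keeps the poem and the questions, while `build_edition(SOURCE, "instructor")` keeps the poem and the notes; the source itself is never edited.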

But narrative, soft-side content is only one facet of the publishing community.

Educational publishers are required, by law, to create accessible versions of their products. What has become the de facto standard for accessibility in the U.S.? The DAISY standard, which, you guessed it, has more than one part based on XML.

The FDA has its own standards for the format in which it wants to receive New Drug Application submissions from pharmaceutical companies. The officially approved standards? Yep. XML.

As a good friend and former co-worker once said, “When the federal government stands between your product and the market, you tend to do what they say…”

I could provide additional examples from other industries, but I think you guys get the point. XML is here, and it is useful in all sorts of publishing scenarios – if the business has the requirements to go there. So instead of asking “Why XML?”, lots of people are out there asking “Why not XML?”

It’s my experience that people don’t really think about what they can do with XML content until they have some to play with and experiment upon. Experimentation can lead to innovation. Innovation can lead to new ways to realize ROI.

A technology should not be dismissed just because it’s not useful in one facet of publishing, or because the content is authored in Word (or on a typewriter – some professors are really old school), or because it can be expensive. There is a community out there that can greatly benefit from the technology right now, as is the case with books converted to the DAISY standard to provide accessibility to information. There’s also an innovative community out there that is thinking about new and different approaches to looking at, analyzing, and learning from content, and about how this innovation can be turned into published products.

By the way: The normative narrative content model in ePub can be either XHTML or the DAISY DTBook standard. The metadata and processing parts of ePub are either XML-based or XML-related.

Bio: Jean Kaplansky is an avid P- and E-book reader who just happens to also have a history of working in publishing, with publishing tools, or for someone who wants to get something published. Jean is currently a Sr. Consultant for the PTC Arbortext Business Unit. The opinions expressed in this post should in no way be attributed to Jean’s employer.

 