March 5, 2014

Raising the Standards for Simplified XML Editing

What would happen to Microsoft Word if its users were required not only to make sense of the complicated XML [1] markup used to store files in the docx format, but also to put up with error messages proclaiming things like “myprojectplan.docx:67:20:E: document type does not allow element ‘p’ here”? The program would probably be a lot less popular, to say the least, and most people would seek alternatives. Yet technical communicators using XML-based editing tools are expected to put up with that sort of thing all the time, all in the interest of a higher purpose, such as quality improvement and lower translation costs. But it doesn’t have to be like that.

MS Word, just like DITA [2] or Excosoft’s file format, uses XML, yet as a user you never see a single tag [3]. Even so, the program has tons of clever functions and does everything it sets out to do. Of course, technical communicators have other needs. Instead of WYSIWYG [4], they need separation of content and style. Instead of editing pages, they need to focus on structure and content. Instead of fiddling around with fonts and colors, they need consistent style sheets. They also need robust functions for reusing code and filtering content for various purposes. They need integration with a CMS [5], and they need assistance finding out whether they have written the same thing twice. But they do not need XML tags.

The solution is simple. Make a list of all the things technical communicators need to do with their content, and then design a great function for each of those things. Most XML editing tools do this to some extent: you do not have to manually type the XML tags for marking up a word as bold, for example. But it is the next step that will make all the difference in the future of XML editing. That step is the realization that XML is really nothing more than a file format, and that it is the editing tool itself that should take center stage, allowing technical communicators to do everything they need to do without fussing with XML at all.

An editing tool should allow technical communicators to work with fundamental building blocks like divisions, tables, lists, and paragraphs, each building block having a clear beginning and end, and each able to contain other building blocks. You should be able to assign semantic meanings and formatting hints to building blocks. A list could be ordered, or not. A division could be a chapter, or a warning, or something else. Of course, these categories would be configurable for the particular content you need to handle. There should be functions for that too. And there should be quality assurance functions that check not only your spelling and grammar, but also the structure of your content. Perhaps part of the quality assurance could be accomplished with a DTD [6] or an XML Schema [7]. Perhaps by some other means.
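To make the building-block idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Block` class, the `ALLOWED_CHILDREN` table, and the `check_structure` function are hypothetical names, not part of any real editing tool. It shows blocks that contain other blocks, carry a category such as "chapter" or "ordered", and can be checked for structure without a single tag in sight.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical containment rules: which kinds of block may sit inside which.
# A real tool would make these configurable for the content at hand.
ALLOWED_CHILDREN = {
    "division": {"division", "paragraph", "list", "table"},
    "list": {"item"},
    "item": {"paragraph", "list"},
    "table": {"row"},
    "row": {"cell"},
    "cell": {"paragraph", "list"},
    "paragraph": set(),
}


@dataclass
class Block:
    kind: str                 # "division", "list", "paragraph", ...
    category: str = ""        # semantic meaning, e.g. "chapter", "warning", "ordered"
    text: str = ""            # text content, mainly for paragraphs
    children: List["Block"] = field(default_factory=list)


def check_structure(block, path=""):
    """Return a list of readable structure problems found in block and below."""
    here = f"{path}/{block.kind}"
    if block.kind not in ALLOWED_CHILDREN:
        return [f"{here}: unknown block kind"]
    problems = []
    for child in block.children:
        if child.kind not in ALLOWED_CHILDREN[block.kind]:
            problems.append(
                f"{here}: a {child.kind} cannot be placed inside a {block.kind}"
            )
        problems.extend(check_structure(child, here))
    return problems


# A chapter with one paragraph and an ordered list that is built incorrectly:
# the paragraph sits directly in the list instead of inside an item.
doc = Block("division", category="chapter", children=[
    Block("paragraph", text="Intro"),
    Block("list", category="ordered", children=[
        Block("paragraph", text="Step one"),
    ]),
])

problems = check_structure(doc)
for problem in problems:
    print(problem)
# prints: /division/list: a paragraph cannot be placed inside a list
```

Note that the message talks about lists and paragraphs, the things the writer actually works with, rather than about line and column numbers in a file.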

In all practicality, but somewhere far beyond what the average user would have to care about, all of the building blocks and categories and content of this editing tool would probably be automatically translated into XML. But hey, who needs to care about file formats?
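For the curious, that behind-the-scenes translation could look something like the following sketch. It uses Python's standard `xml.etree.ElementTree` module; the `Block` class, the `to_xml` function, and the choice to store the category in a `category` attribute are assumptions made for illustration, not a description of any particular product or file format.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    kind: str                 # hypothetical block kinds: "division", "paragraph", ...
    category: str = ""        # semantic meaning, e.g. "warning"
    text: str = ""
    children: List["Block"] = field(default_factory=list)


def to_xml(block):
    """Translate a block and everything inside it into an XML element."""
    element = ET.Element(block.kind)
    if block.category:
        element.set("category", block.category)  # assumed attribute name
    if block.text:
        element.text = block.text
    for child in block.children:
        element.append(to_xml(child))
    return element


warning = Block("division", category="warning", children=[
    Block("paragraph", text="Unplug the device first."),
])

xml_text = ET.tostring(to_xml(warning), encoding="unicode")
print(xml_text)
# prints: <division category="warning"><paragraph>Unplug the device first.</paragraph></division>
```

The writer only ever handles the warning and its paragraph; the angle brackets are produced by the tool when the content is saved.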

[1] XML (Extensible Markup Language): used for encoding documents in a format that is readable by both humans and machines.
[2] DITA (Darwin Information Typing Architecture): a common XML standard for authoring a document.
[3] Tag/XML Tag: a piece of markup, such as <p> or </p>, indicating the beginning or end of an XML element.
[4] WYSIWYG (What you see is what you get): the document looks the same, on screen during editing, as it will on paper.
[5] CMS (Content Management System): a computer program through which content can be published or edited.
[6] DTD (Document Type Definition): a standard technology which defines the rules that XML documents must adhere to.
[7] XML Schema: Like a DTD, an XML Schema is a standard technology which defines the rules that XML documents must adhere to.

About the author

Joakim Ström

With over 15 years dedicated to software development, Joakim expertly drives internal improvements, and often hands-on innovation, here at Excosoft.
